DominikBuchner / apscale

Advanced Pipeline for Simple yet Comprehensive AnaLysEs of DNA metabarcoding data
https://pypi.org/project/apscale/
MIT License

"pandas.error" when "primer_trimming" #16

Closed · francielleholtz closed this issue 1 month ago

francielleholtz commented 2 months ago

Hello, me again. I am trying to run primer_trimming, but I keep getting the error below. I followed your recommendations from a different issue, changing the "I"s to "N"s in my primers and downgrading cutadapt, but nothing has worked. Do you have any insight into what could be causing the problem?

Thank you again for your help :)

[Screenshots of the terminal output attached: Screenshot 2024-09-02 at 15 49 46, Screenshot 2024-09-02 at 15 49 53]
DominikBuchner commented 2 months ago

Can you please upload your settings.xlsx file?

francielleholtz commented 2 months ago

Here it is: Settings_TEN1AP.xlsx

DominikBuchner commented 2 months ago

Are you sure the primer sequences are correct? Is anything written there at all?

francielleholtz commented 2 months ago

These are the primers: reverse TAIACYTCIGGRTGICCRAARAAYCA, forward GGWACWRGWTGRACWITITAYCCYCC. I have changed the Is to Ns, as written in the table.
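
For reference, the I-to-N substitution described here is a plain character replacement: cutadapt's IUPAC alphabet has no code for inosine (I), hence the recommendation to swap it for the wildcard N. A minimal sketch using the primers quoted above:

```python
# Minimal illustration of the I -> N substitution discussed in this thread.
forward = "GGWACWRGWTGRACWITITAYCCYCC"
reverse = "TAIACYTCIGGRTGICCRAARAAYCA"

forward_n = forward.replace("I", "N")  # GGWACWRGWTGRACWNTNTAYCCYCC
reverse_n = reverse.replace("I", "N")  # TANACYTCNGGRTGNCCRAARAAYCA
print(forward_n, reverse_n)
```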

DominikBuchner commented 2 months ago

Okay, please write me an e-mail to my work address, which you can find on our website. I'll send you a shareable folder where you can upload the data so I can take a look. I'm not sure whether this is a user error or an apscale-related problem.

Best, Dominik

francielleholtz commented 2 months ago

Ok, I just emailed you :) Thank you for your help!

DominikBuchner commented 1 month ago

So: I cannot reproduce the problem, but I saw that your output says 3 input files. You only uploaded 2 for the PE merging, so I was wondering: is there more data you did not upload, or did you move the data between processing steps?

francielleholtz commented 1 month ago

Thank you for the message. It says 3 files because it picks up all the files in the pe_merging folder: the 2 data files plus the PE merging output. Even if I change that, the error remains the same.


DominikBuchner commented 1 month ago

The initial files go into 2_demultiplexing/data. The result of PE merging will then be written to 3_pe_merging/data, which in turn is the input for primer trimming, and so on.
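
Given that layout, a quick way to see which files primer trimming will actually pick up is to list 3_pe_merging/data. A minimal sketch, assuming gzipped FASTQ output (the .fastq.gz glob and the project folder name are assumptions, adjust them to your project):

```python
# Sanity-check sketch: list what the primer-trimming step would see, assuming
# it reads every file in 3_pe_merging/data of the project folder.
from pathlib import Path

project = Path("TEN1AP_apscale")
inputs = sorted((project / "3_pe_merging" / "data").glob("*.fastq.gz"))

print(f"primer trimming will see {len(inputs)} input file(s):")
for path in inputs:
    print(" -", path.name)
```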


francielleholtz commented 1 month ago

Thank you for the clarification. I was already doing it like that and I am still trying, but unfortunately I keep getting the same error. I think I will try reinstalling it.

(thesis) @.*** TEN1AP_apscale % apscale --primer_trimming

13:36:55: Starting to trim primers of 1 input files.
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/joblib/externals/loky/process_executor.py", line 463, in _process_worker
    r = call_item()
        ^^^^^^^^^^^
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/joblib/externals/loky/process_executor.py", line 291, in __call__
    return self.fn(*self.args, **self.kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/joblib/parallel.py", line 598, in __call__
    return [func(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/apscale/c_primer_trimming.py", line 45, in primer_trimming
    log_df = pd.read_csv(StringIO(f.stdout.decode("ascii", errors="ignore")), sep="\t")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
    return _read(filepath_or_buffer, kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 620, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
    self._engine = self._make_engine(f, self.engine)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1898, in _make_engine
    return mapping[engine](f, **self.options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 93, in __init__
    self._reader = parsers.TextReader(src, **kwds)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "parsers.pyx", line 581, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Applications/anaconda3/envs/thesis/bin/apscale", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/apscale/__main__.py", line 137, in main
    c_primer_trimming.main()
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/apscale/c_primer_trimming.py", line 153, in main
    Parallel(n_jobs=cores)(
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/joblib/parallel.py", line 2007, in __call__
    return output if self.return_generator else list(output)
                                                ^^^^^^^^^^^^
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/joblib/parallel.py", line 1650, in _get_outputs
    yield from self._retrieve()
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/joblib/parallel.py", line 1754, in _retrieve
    self._raise_error_fast()
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/joblib/parallel.py", line 1789, in _raise_error_fast
    error_job.get_result(self.timeout)
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/joblib/parallel.py", line 745, in get_result
    return self._return_or_raise()
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Applications/anaconda3/envs/thesis/lib/python3.12/site-packages/joblib/parallel.py", line 763, in _return_or_raise
    raise self._result
pandas.errors.EmptyDataError: No columns to parse from file

(thesis) ***@***.*** TEN1AP_apscale %
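
The worker dies at c_primer_trimming.py line 45, where cutadapt's stdout is parsed with pandas; if stdout is empty, pd.read_csv raises EmptyDataError. A minimal sketch of one way such a parse could be guarded (this is not the maintainers' actual fix, and parse_cutadapt_log is a hypothetical helper name):

```python
# Sketch only, not the apscale implementation: guard the log parse so an empty
# cutadapt stdout yields an empty DataFrame instead of raising EmptyDataError.
from io import StringIO

import pandas as pd


def parse_cutadapt_log(raw_stdout: bytes) -> pd.DataFrame:
    """Hypothetical helper: parse cutadapt's tab-separated report from stdout."""
    text = raw_stdout.decode("ascii", errors="ignore").strip()
    if not text:
        return pd.DataFrame()  # nothing was reported; let the caller decide how to react
    return pd.read_csv(StringIO(text), sep="\t")
```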
DominikBuchner commented 1 month ago

For anyone experiencing this issue: setting the cores to use to 1 fixes this for now. It will be fixed in a future update!
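
One way to apply that workaround is to change the core count in the project's settings.xlsx. The sheet name "0_general_settings" and the column name "cores to use" below are assumptions; open your own settings file and use whatever names it actually contains.

```python
# Hedged sketch of the single-core workaround: read every sheet of the
# settings workbook, set the (assumed) core-count cell to 1, write it back.
import pandas as pd

path = "Settings_TEN1AP.xlsx"
sheets = pd.read_excel(path, sheet_name=None)            # dict of all sheets
sheets["0_general_settings"].loc[0, "cores to use"] = 1  # assumed sheet/column names

with pd.ExcelWriter(path) as writer:
    for name, frame in sheets.items():
        frame.to_excel(writer, sheet_name=name, index=False)
```

Editing the cell directly in Excel works just as well; the snippet only shows that a single value change is all the workaround needs.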