SorenKarst / longread_umi

GNU General Public License v3.0

longread_umi error when splitting fastq file into 10 tmp files using GNU parallel, making subsequent analysis impossible #48

Open lisiang opened 2 years ago

lisiang commented 2 years ago

longread_umi_nanopore_pipeline_log_2021-10-29-172655.txt

The reads_tf.fq file is empty and the software does not work.

SorenKarst commented 2 years ago

Hi Lisiang,

Thank you for trying out our pipeline.

The pipeline breaks when cutadapt attempts to filter the chunked files. There is a hint in line 48 of the log, where it says "No reads processed!". This results in an empty output file, and nothing downstream is run.

I think I encountered this error during development, and I simply can't remember the reason or the fix.

Could you run head -n 12 on the input file and on the 1_filt.tmp file and paste the results in this thread?
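For reference, head -n 12 shows the first three FASTQ records (four lines per record). A minimal sketch, using a stand-in file since the actual input path isn't named in this thread:

```shell
# Create a tiny stand-in FASTQ; substitute your real input file and 1_filt.tmp.
printf '@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nTTGGCCAA\n+\nIIIIIIII\n@read3\nCCAATTGG\n+\nIIIIIIII\n' > example.fq
# First three FASTQ records (4 lines each = 12 lines):
head -n 12 example.fq
```

If the equivalent output for 1_filt.tmp is empty, that matches the "No reads processed!" symptom in the log.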

You might also try to install and run the data with the development branch of the pipeline: https://github.com/SorenKarst/longread_umi/tree/develop

best regards Søren

SorenKarst commented 2 years ago

Hi again Lisiang,

I tried installing the longread_umi pipeline on a new system and got a similar error to yours.

In my case, porechop execution stops because it can't find the custom 'adapters.py' at $CONDA_PREFIX/longread_umi/scripts/adapters.py. The cause is that this path is no longer on the PYTHONPATH (perhaps due to a newer version of conda?).

A quick check is to activate the longread_umi environment and then try to run porechop independently.

A quick fix is to have porechop use the default adapters.py bundled with the porechop installation. Go to $CONDA_PREFIX/lib/python3.6/site-packages/porechop and edit porechop.py. Line 27 should be replaced by 'from .adapters import ADAPTERS, make_full_native_barcode_adapter,\'
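The edit above can be scripted with sed. A sketch, demonstrated here on a scratch copy (point the variable at the real $CONDA_PREFIX/lib/python3.6/site-packages/porechop/porechop.py to apply it for real; sed -i.bak keeps a backup):

```shell
# Scratch stand-in for porechop.py; replace with the real path to apply the fix.
PORECHOP_PY=porechop_copy.py
seq 30 | sed 's/^/# placeholder line /' > "$PORECHOP_PY"
# Replace line 27 with the relative import (the trailing backslash continues
# the import statement onto the next line, as in the original file):
sed -i.bak '27s/.*/from .adapters import ADAPTERS, make_full_native_barcode_adapter,\\/' "$PORECHOP_PY"
sed -n 27p "$PORECHOP_PY"
```

Note that the python3.6 in the site-packages path matches the environment described in this thread; adjust it to whatever Python version your conda environment uses.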

I hope this helps.

regards Soren

lisiang commented 2 years ago

Hi SorenKarst

Thanks for the reply.

At the time, I hit this issue in a specific situation: for some reason my FASTQ file's size was reported as 8 GB, but it was actually empty. After I replaced the FASTQ file, the pipeline worked fine.
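A quick sanity check would have caught a file whose reported size doesn't match its contents. A sketch, using a stand-in file name since the actual input isn't given:

```shell
# Stand-in for the pipeline input FASTQ; substitute your real file.
printf '@read1\nACGT\n+\nIIII\n' > reads_example.fq
ls -l reads_example.fq                                   # size reported on disk
awk 'END {print NR/4 " records"}' reads_example.fq       # actual FASTQ record count
```

A large on-disk size combined with a record count of 0 indicates the kind of empty-but-large file described above.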

Meanwhile, I found another problem while using the pipeline: it seems I can't set the number of allowed mismatches in the UMI sequence when using umi_bin. Will this mismatch feature be available in a later update?

Sorry for taking so long to respond, and thank you for your patience.

best regards Siang