iqbal-lab-org / clockwork

CRyPTIC data processing pipelines
MIT License

AssertionError: When I run clockwork remove_contam command #116

Closed (Huicen01 closed this 1 year ago)

Huicen01 commented 1 year ago

I am following the 'Walkthrough: scripts only' option, and my command is:

singularity exec clockwork_v0.11.3.img clockwork map_reads --unsorted_sam ./SAMEA104027390 ./Ref.remove_contam/ref.fa SAMEA104027390.sam ./SAMEA104027390/ERR1950064_1.fastq.gz ./SAMEA104027390/ERR1950064_2.fastq.gz ./SAMEA104027390/ERR1950065_1.fastq.gz ./SAMEA104027390/ERR1950065_2.fastq.gz

When I ran the clockwork map_reads command, it worked properly!

singularity exec clockwork_v0.11.3.img clockwork remove_contam ./Ref.remove_contam/remove_contam_metadata.tsv ./SAMEA104027390.sam ./SAMEA104027390.decontam.counts.tsv ./SAMEA104027390.decontam_1.fq.gz ./SAMEA104027390.decontam_2.fq.gz

When I ran the clockwork remove_contam command, I encountered the following error. How can I solve it?

Traceback (most recent call last):
  File "/usr/local/bin/clockwork", line 4, in <module>
    __import__('pkg_resources').run_script('clockwork==0.11.3', 'clockwork')
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 667, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1470, in run_script
    exec(script_code, namespace, namespace)
  File "/usr/local/lib/python3.8/dist-packages/clockwork-0.11.3-py3.8.egg/EGG-INFO/scripts/clockwork", line 1019, in <module>
  File "/usr/local/lib/python3.8/dist-packages/clockwork-0.11.3-py3.8.egg/clockwork/tasks/remove_contam.py", line 17, in run
  File "/usr/local/lib/python3.8/dist-packages/clockwork-0.11.3-py3.8.egg/clockwork/contam_remover.py", line 173, in run
AssertionError

PS: For some reason, I have to choose 'Walkthrough: scripts only', and 'Walkthrough: database and nextflow' cannot be selected! Thank you all for your help!

martinghunt commented 1 year ago

Thanks for reporting, I've reproduced the error. I know how to fix it and will do so next week. In the meantime, a workaround would be to run the mapping and decontam separately on each pair of fastq files. Then cat the two decontam forward fastqs into one file, and similarly for the reverse fastqs (catting the runs in the same order for both!)

Huicen01 commented 1 year ago

Thank you very much for your reply! I'm not sure I understand the workaround you mentioned. Take the sample SAMEA104027390 (which has two runs, ERR1950064 and ERR1950065) as an example. Do you mean that I run mapping and decontam for ERR1950064 and ERR1950065 separately, which generates four files in total: ERR1950064.decontam_1.fq.gz, ERR1950064.decontam_2.fq.gz, ERR1950065.decontam_1.fq.gz and ERR1950065.decontam_2.fq.gz? Then ERR1950064.decontam_1.fq.gz and ERR1950065.decontam_1.fq.gz are merged into one new file, and ERR1950064.decontam_2.fq.gz and ERR1950065.decontam_2.fq.gz are merged into another, giving two files to use as input to the next step, clockwork variant_call_one_sample?

martinghunt commented 1 year ago

yes, what you've described is exactly what I was thinking
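
For reference, the workaround confirmed above could be sketched as a small shell script along the following lines. This is only a sketch under assumptions: `decontam_one_run` and `merge_runs` are hypothetical helper names, not part of clockwork; the per-run output names follow the naming proposed in the question; and passing the sample name as the first map_reads argument for each run simply mirrors the original command. The clockwork command lines themselves are the ones from the first post, with only one fastq pair given at a time.

```shell
# Image and reference directory as used earlier in this thread.
IMG=clockwork_v0.11.3.img
REF_DIR=./Ref.remove_contam

# Map and decontaminate a single run (one pair of fastq files).
# $1 = sample name, $2 = run accession, $3 = directory holding the fastq files
# (hypothetical helper, not a clockwork command)
decontam_one_run() {
    local sample=$1 run=$2 dir=$3
    singularity exec "$IMG" clockwork map_reads --unsorted_sam "$sample" \
        "$REF_DIR/ref.fa" "$run.sam" \
        "$dir/${run}_1.fastq.gz" "$dir/${run}_2.fastq.gz" || return 1
    singularity exec "$IMG" clockwork remove_contam \
        "$REF_DIR/remove_contam_metadata.tsv" "$run.sam" \
        "$run.decontam.counts.tsv" \
        "$run.decontam_1.fq.gz" "$run.decontam_2.fq.gz" || return 1
}

# Concatenate the per-run decontaminated reads into one pair of files.
# $1 = output prefix (must differ from every run accession),
# remaining arguments = run accessions, in the order they should be merged.
# The same order is used for the forward (_1) and reverse (_2) files.
# (hypothetical helper, not a clockwork command)
merge_runs() {
    local prefix=$1
    shift
    local mate run
    for mate in 1 2; do
        : > "$prefix.decontam_$mate.fq.gz" || return 1
        for run in "$@"; do
            cat "$run.decontam_$mate.fq.gz" >> "$prefix.decontam_$mate.fq.gz" || return 1
        done
    done
}

# Example invocation for the sample discussed in this thread:
#   decontam_one_run SAMEA104027390 ERR1950064 ./SAMEA104027390
#   decontam_one_run SAMEA104027390 ERR1950065 ./SAMEA104027390
#   merge_runs SAMEA104027390 ERR1950064 ERR1950065
```

Plain cat is enough for the merge because concatenated gzip files form a valid gzip stream. Looping over the runs in one fixed order for both mates keeps the read pairs lined up between the two merged files.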

Huicen01 commented 1 year ago

Thank you so much!

martinghunt commented 1 year ago

You're welcome :)

The fix is now in the new clockwork release, version 0.12.2, so it should all work as intended now.