jdidion / atropos

An NGS read trimming tool that is specific, sensitive, and speedy. (production)

Error when using BAM file as input #86

Closed llvisser closed 4 years ago

llvisser commented 4 years ago

Hi, I want to perform adapter trimming on some BAM files and I selected Atropos for this, as it is described as being able to take BAM files as input. However, so far I have not been able to run it successfully with a BAM file as input (with fastq files it works perfectly). Below is the code I used, along with the resulting error. I have also attached the BAM file used in this example. In this example BAM file, every read includes the adapter sequence. Could you point out how I can overcome this error?

Many thanks for your help.

Kind regards, Lindy Visser


EXAMPLE 1: input BAM, output BAM

$ atropos -a RA5=GATCGTCGGACTGTAGAACTCTGAAC -o trimmed.bam -se TM249_trunc2.bam --report-file summary.txt

2019-11-19 08:48:44,557 INFO: This is Atropos 1.1.22 with Python 3.6.1
2019-11-19 08:48:46,267 INFO: Loading list of known contaminants from https://raw.githubusercontent.com/jdidion/atropos/master/atropos/adapters/sequencing_adapters.fa
2019-11-19 08:48:46,481 ERROR: Error executing command trim
Traceback (most recent call last):
  File "/hpc/local/CentOS7/gen/lib/python3.6/site-packages/atropos/commands/base.py", line 332, in run
    self.return_code = self()
  File "/hpc/local/CentOS7/gen/lib/python3.6/site-packages/atropos/commands/trim/__init__.py", line 521, in __call__
    formatters.add_seq_formatter(NoFilter, output1, output2)
  File "/hpc/local/CentOS7/gen/lib/python3.6/site-packages/atropos/commands/trim/writers.py", line 111, in add_seq_formatter
    file1, file2, self.seq_formatter_args)
  File "/hpc/local/CentOS7/gen/lib/python3.6/site-packages/atropos/io/seqio.py", line 1000, in create_seq_formatter
    seq_format = get_format(file1, kwargs)
  File "/hpc/local/CentOS7/gen/lib/python3.6/site-packages/atropos/io/seqio.py", line 1071, in get_format
    "'fastq').".format(file_format))
atropos.io.seqio.UnknownFileType: File format 'bam' is unknown (expected 'fasta' or 'fastq').

EXAMPLE 2: input BAM, output fastq

$ atropos -a RA5=GATCGTCGGACTGTAGAACTCTGAAC -o trimmed.fastq -se TM249_trunc2.bam --report-file summary.txt

======= Atropos

Atropos version: 1.1.22
Python version: 3.6.1
Command line parameters: trim -a RA5=GATCGTCGGACTGTAGAACTCTGAAC -o trimmed.fastq -se TM249_trunc2.bam --report-file summary.txt

Sample ID: TM249_trunc2
Input format: SAM, Read 1, w/ Qualities
Input files: /hpc/pmc_gen/lvisser2/atropos_test/short_bam/TM249_trunc2.bam

Start time: 2019-11-19T09:05:34.123961
Wallclock time: 0.01 s
CPU time (main process): 0.01 s

No reads processed! Either your ...

TM249_trunc2.bam.zip

jdidion commented 4 years ago

Thanks for reporting this. Right now Atropos v1.1 only supports reading from SAM/BAM, not writing. The code for writing to SAM/BAM is in the develop branch, but I need to do some work to get it ready to release.
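
For anyone scripting around this limitation, the output filename can be checked before Atropos is launched. The sketch below is only an illustration: the helper name `output_format_ok` is hypothetical, the accepted formats are taken from the error message in Example 1 ('fasta' or 'fastq'), and the short aliases and compression suffixes are assumptions rather than Atropos's actual detection rules.

```python
from pathlib import Path

# Assumption: 'fasta' and 'fastq' come from the UnknownFileType message above;
# the 'fa'/'fq' aliases and the compression suffixes are illustrative guesses.
WRITABLE_FORMATS = {"fasta", "fa", "fastq", "fq"}
COMPRESSION_SUFFIXES = {"gz", "bz2", "xz"}


def output_format_ok(path: str) -> bool:
    """Return True if the filename extension looks like FASTA or FASTQ."""
    suffixes = [s.lstrip(".").lower() for s in Path(path).suffixes]
    # Ignore trailing compression suffixes such as ".gz".
    while suffixes and suffixes[-1] in COMPRESSION_SUFFIXES:
        suffixes.pop()
    return bool(suffixes) and suffixes[-1] in WRITABLE_FORMATS


if __name__ == "__main__":
    for name in ("trimmed.bam", "trimmed.fastq"):
        status = "ok" if output_format_ok(name) else "not writable in v1.1"
        print(f"{name}: {status}")
```

With the two filenames from the examples in this issue, `trimmed.bam` is rejected and `trimmed.fastq` is accepted.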

llvisser commented 4 years ago

Thank you for your quick response. In the second example I provided, I used a BAM file for reading and specified a fastq file for writing. The run completed and produced a report file stating "No reads processed". However, since the BAM file contains reads that include the adapter sequence, I would expect a trimmed.fastq file as output. How can I obtain a trimmed output file in this example?
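
Since fastq input is reported to work, one possible interim workaround is to convert the BAM file to fastq first and trim that. This is not something suggested in the thread; it is a minimal sketch that assumes pysam is installed and that the BAM holds single-end reads. The function names `fastq_record` and `bam_to_fastq` are hypothetical.

```python
def fastq_record(name, sequence, qualities, offset=33):
    """Format one read as a four-line FASTQ record (Phred+33 by default)."""
    quality_string = "".join(chr(q + offset) for q in qualities)
    return f"@{name}\n{sequence}\n+\n{quality_string}\n"


def bam_to_fastq(bam_path, fastq_path):
    """Write every read with sequence and qualities to a FASTQ file.

    Returns the number of records written. Simplification: reads aligned to
    the reverse strand are written as stored (not reverse-complemented), and
    secondary/supplementary alignments are not filtered out.
    """
    import pysam  # optional dependency, imported only when conversion runs

    written = 0
    # check_sq=False allows unaligned BAM files without @SQ header lines.
    with pysam.AlignmentFile(bam_path, "rb", check_sq=False) as bam, \
            open(fastq_path, "w") as out:
        for read in bam.fetch(until_eof=True):
            if read.query_sequence is None or read.query_qualities is None:
                continue  # skip records without sequence or quality data
            out.write(fastq_record(read.query_name,
                                   read.query_sequence,
                                   read.query_qualities))
            written += 1
    return written
```

A call such as `bam_to_fastq("TM249_trunc2.bam", "TM249_trunc2.fastq")` would produce a file that can be passed to `-se` in place of the BAM. A return value of 0 would suggest that the BAM itself contains no usable reads.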

llvisser commented 4 years ago

Of course. See attachment.

On Thu, 21 Nov 2019 at 20:06, John Didion notifications@github.com wrote:

I see, thanks. Are you able to provide a minimal BAM file that reproduces the issue?


jdidion commented 4 years ago

Sorry, I missed that the first time. Thanks.

jdidion commented 4 years ago

Do you have pysam installed? It is an optional dependency and so requires you to either install it manually (pip install pysam) or include it as an extra when installing atropos (pip install atropos[pysam]).
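
A quick way to answer this question is to check whether the module can be found by the same Python interpreter that runs Atropos. The sketch below uses only the standard library; the helper name `is_installed` is hypothetical and the snippet is not part of Atropos.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("pysam"):
        print("pysam found")
    else:
        # Install commands as given in the comment above.
        print("pysam missing; try: pip install pysam")
```

Run it with the interpreter that Atropos uses (Python 3.6.1 in the log above), since pysam may be installed for a different Python on the same machine.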

jdidion commented 4 years ago

(ignore the previous comment about sorting - I see the BAM is single-end reads)

jdidion commented 4 years ago

Ok - sorry for all the confusion. I have fixed the issue and will put out a new version. To clarify:

jdidion commented 4 years ago

Fixed in v1.1.23.

plijnzaad commented 4 years ago

PS: can this issue be re-opened please? It's not resolved and is more difficult to find this way. Cheers, Philip

jdidion commented 4 years ago

Thanks! I've merged your PR.