jdidion / atropos

An NGS read trimming tool that is specific, sensitive, and speedy. (production)

Add option to configure pysam options when reading SAM/BAM file. #74

Closed okartal closed 5 years ago

okartal commented 5 years ago

This is the header of my SAM file (BS-seq reads):

@HD VN:1.0 SO:unknown GO:query
@PG ID:pheniqs PN:pheniqs CL:pheniqs mux --config analysis/src/conf_multiplex.json --report test.json --output test.sam VN:2.0.4
@RG ID:ACAGTG LB:cn1 PU:ACAGTG SM:At Aj 3
@RG ID:CGATGT LB:cn1 PU:CGATGT SM:At Aj 1
@RG ID:TGACCA LB:cn1 PU:TGACCA SM:At Aj 2
@RG ID:undetermined PU:undetermined
700523F:121:CB0L6ANXX:1:1103:1118:2223 77 0 0 0 0 GATGGNAAATTAAATTAAGAGAGGATTTAAATAATAATGATTAATTGTAGAGTTGAGGTAAGTTTTTATTTGATTTTTTAAAGGTTATATGTTTTTTTTAAAATAATTTATTTTTTTATTTTATAT BABBB#0@0FEGGGGGGG1;1//:EFDGGGGGEGGGG1?FFGGGG1FF1FGGFGG11?F1@FGGGGGGGGCFFGGGGGF1:@EGFE<F1FGGGGG<E0<0>;E0<@F0DEF0<C.0=D0C=F68 RG:Z:undetermined BC:Z:CGTCAA QT:Z:<>::<B
700523F:121:CB0L6ANXX:1:1103:1118:2223 141 0 0 0 0 ACTTTCCTTTTACTATAATTTTTTAAATTTTATAATCTTTATCTCTTTAATTAATTTTCTTTTACCACTTTTTTTTTCTCATCATTTTCATTCTCACATTTCTAAACATTTTACTATATATTTTCT BB@BBC;11=@FGGFGGG1;FDGGGGCGGGGFF11FFGGG1E1:1EFDF1=D1CFEGG>GG1=:==11C:FG1/9/E1<11==000C0000000=0<@000000<D0000=0:/</00//// RG:Z:undetermined BC:Z:CGTCAA QT:Z:<>::<B

(The small "hd" after the @ symbol is a GitHub Markdown rendering bug.)

When I run

atropos trim -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCA -A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT --bisulfite truseq --op-order GAWCQ -o test.trimmed1.fastq -p test.trimmed2.fastq --format sam -l test.sam

I get:

2018-11-16 08:53:04,476 INFO: This is Atropos 1.1.19 with Python 3.6.6
2018-11-16 08:53:04,488 ERROR: Error executing command: trim
Traceback (most recent call last):
  File "/home/oender/anaconda3/envs/population-epigenetics/lib/python3.6/site-packages/atropos/commands/__init__.py", line 217, in execute_cli
    retcode, summary = command.execute(args)
  File "/home/oender/anaconda3/envs/population-epigenetics/lib/python3.6/site-packages/atropos/commands/__init__.py", line 70, in execute
    retcode, summary = self.run_command(options)
  File "/home/oender/anaconda3/envs/population-epigenetics/lib/python3.6/site-packages/atropos/commands/__init__.py", line 134, in run_command
    runner = runner_class(options)
  File "/home/oender/anaconda3/envs/population-epigenetics/lib/python3.6/site-packages/atropos/commands/trim/__init__.py", line 282, in __init__
    super().__init__(options, TrimSummary)
  File "/home/oender/anaconda3/envs/population-epigenetics/lib/python3.6/site-packages/atropos/commands/base.py", line 226, in __init__
    self.iterable = enumerate(reader, 1)
  File "/home/oender/anaconda3/envs/population-epigenetics/lib/python3.6/site-packages/atropos/io/seqio.py", line 559, in __iter__
    return self._iter(pysam.AlignmentFile(self._file))
  File "pysam/libcalignmentfile.pyx", line 734, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 983, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='r') - is it SAM/BAM format? Consider opening with check_sq=False

This seems to be related to this pysam issue: https://github.com/pysam-developers/pysam/issues/51

jdidion commented 5 years ago

You are correct - pysam requires there to be valid @SQ records in the header. I will set check_sq=False by default and I will consider adding an option to configure pysam options in case someone wants to set it to True.
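
To see whether a given SAM file would trip this check, one can look for @SQ lines in its header before handing it to pysam. The following is only a rough sketch using plain Python; the function names `header_lines` and `has_sq_records` are made up for illustration and are not part of pysam or Atropos, and the example lines are a shortened, tab-separated stand-in for the header posted above.

```python
def header_lines(sam_lines):
    """Yield SAM header lines (those starting with '@') up to the first record."""
    for line in sam_lines:
        if not line.startswith("@"):
            break  # header lines always precede the alignment records
        yield line.rstrip("\n")


def has_sq_records(sam_lines):
    """Return True if the header contains at least one @SQ (reference sequence) line."""
    return any(line.startswith("@SQ") for line in header_lines(sam_lines))


# Illustrative input: a header with @HD and @RG lines but no @SQ lines,
# followed by one placeholder record (not a complete SAM record).
example = [
    "@HD\tVN:1.0\tSO:unknown\tGO:query\n",
    "@RG\tID:ACAGTG\tLB:cn1\tPU:ACAGTG\n",
    "read1\t77\n",
]

print(has_sq_records(example))
```

For an unaligned file like the one in this issue the check comes back false, which is the situation in which pysam raises the ValueError shown above unless check_sq=False is passed.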

okartal commented 5 years ago

Thanks @jdidion. Opening with check_sq=False by default makes more sense for unaligned SAM input. Any prediction when this will be available?
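
For anyone who needs to read such a file directly in the meantime, here is a minimal sketch of opening unaligned SAM input with check_sq=False while still letting a caller override that default. It assumes pysam is installed; `build_pysam_kwargs` and `iter_unaligned_reads` are hypothetical helper names for illustration, not the Atropos implementation.

```python
def build_pysam_kwargs(user_options=None):
    """Merge caller-supplied options over a default of check_sq=False.

    Hypothetical helper: the default skips pysam's @SQ header check,
    and a caller can pass {"check_sq": True} to turn it back on.
    """
    kwargs = {"check_sq": False}
    kwargs.update(user_options or {})
    return kwargs


def iter_unaligned_reads(path, user_options=None):
    """Yield (name, sequence) pairs from a SAM file that may lack @SQ lines."""
    import pysam  # imported here so build_pysam_kwargs works without pysam installed

    with pysam.AlignmentFile(path, "r", **build_pysam_kwargs(user_options)) as sam:
        # until_eof=True reads records in file order without needing an index.
        for read in sam.fetch(until_eof=True):
            yield read.query_name, read.query_sequence
```

Calling `iter_unaligned_reads("test.sam")` would then iterate over the records without the header check, and `iter_unaligned_reads("test.sam", {"check_sq": True})` would restore pysam's default behavior.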

jdidion commented 5 years ago

I'll do a release next week.

jdidion commented 5 years ago

1.1.20 is released now. Please try it out.

okartal commented 5 years ago

Is this release already in bioconda?

jdidion commented 5 years ago

The Conda recipe has been updated and a PR has been submitted.