lindenb / jvarkit

Java utilities for Bioinformatics
https://jvarkit.readthedocs.io/

Extraction fails for entire chromosome #130

Closed sum732 closed 5 years ago

sum732 commented 5 years ago

Subject of the issue

I want to extract reads for an entire chromosome, but it's not working.

Your environment

Steps to reproduce

java -jar ~/Research/Programs/SamViewWithMate/jvarkit/dist/samviewwithmate.jar -r "1:" Sample_TruSeq_Total_stranded_226a.Aligned.sortedByCoord.out.bam >IS_reads.fastq

Expected behaviour

Should get all the sequences mapped to chromosome 1

Actual behaviour

java -jar ~/Research/Programs/SamViewWithMate/jvarkit/dist/samviewwithmate.jar -r "1:" Sample_TruSeq_Total_stranded_226a.Aligned.sortedByCoord.out.bam >IS_reads.fastq
[SEVERE][SamViewWithMate]Cannot find colon in 1:
java.lang.IllegalArgumentException: Cannot find colon in 1:
	at com.github.lindenb.jvarkit.util.bio.IntervalParser.returnErrorOrNullInterval(IntervalParser.java:116)
	at com.github.lindenb.jvarkit.util.bio.IntervalParser.parse(IntervalParser.java:240)
	at com.github.lindenb.jvarkit.tools.viewmate.SamViewWithMate.doWork(SamViewWithMate.java:152)
	at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMain(Launcher.java:756)
	at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMainWithExit(Launcher.java:919)
	at com.github.lindenb.jvarkit.tools.viewmate.SamViewWithMate.main(SamViewWithMate.java:332)
[INFO][Launcher]samviewwithmate Exited with failure (-1)

lindenb commented 5 years ago

just use '1'

sum732 commented 5 years ago

Thanks for a prompt reply.

Tried that as well, same result.

java -jar ~/Research/Programs/SamViewWithMate/jvarkit/dist/samviewwithmate.jar -r "1" Sample_TruSeq_Total_stranded_226a.Aligned.sortedByCoord.out.bam >IS_reads.fas
[SEVERE][SamViewWithMate]Cannot find colon in 1
java.lang.IllegalArgumentException: Cannot find colon in 1
	at com.github.lindenb.jvarkit.util.bio.IntervalParser.returnErrorOrNullInterval(IntervalParser.java:116)
	at com.github.lindenb.jvarkit.util.bio.IntervalParser.parse(IntervalParser.java:240)
	at com.github.lindenb.jvarkit.tools.viewmate.SamViewWithMate.doWork(SamViewWithMate.java:152)
	at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMain(Launcher.java:756)
	at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMainWithExit(Launcher.java:919)
	at com.github.lindenb.jvarkit.tools.viewmate.SamViewWithMate.main(SamViewWithMate.java:332)
[INFO][Launcher]samviewwithmate Exited with failure (-1)

lindenb commented 5 years ago

Sorry, I'm not sure I've implemented this feature. Get the length of chromosome "1" and use it as the end of the interval, e.g. -r "1:1-1234567"
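
One way to look up that length is to read the @SQ line for the contig in the BAM header. The following is only a sketch: `region_for_contig` is a hypothetical helper name, not part of jvarkit or samtools, and it assumes the header carries the usual `SN:` and `LN:` tags.

```shell
# Hypothetical helper (not part of jvarkit or samtools):
# reads SAM header lines on stdin and prints "CONTIG:1-LENGTH"
# for the contig named in $1. Exits non-zero if the contig is absent.
region_for_contig() {
  awk -F '\t' -v contig="$1" '
    $1 == "@SQ" {
      name = ""; len = ""
      # Tags on an @SQ line may appear in any order, so scan them all.
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^SN:/) name = substr($i, 4)
        if ($i ~ /^LN:/) len = substr($i, 4)
      }
      if (name == contig) { print name ":1-" len; found = 1; exit }
    }
    END { if (!found) exit 1 }
  '
}
```

Assuming samtools is installed, something like `region=$(samtools view -H input.bam | region_for_contig 1)` should then give a value that can be passed as `-r "$region"`.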

lindenb commented 5 years ago

By the way, I'm not sure the program is designed to fetch all the reads of such a large region. You might have to increase the Java heap size with -Xmx https://stackoverflow.com/questions/5374455

lindenb commented 5 years ago

for a whole chromosome, you'd better use:

samtools view -h input.bam | awk -F '\t' '$0 ~ /^@/ || $3=="chr3" || $7=="chr3"'
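
The command above hard-codes "chr3", while the BAM in this issue names its contigs "1", "2", and so on. A minimal sketch of the same filter with the contig name as a parameter follows; `keep_contig_with_mates` is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: same awk filter as above, contig passed as $1.
# Keeps header lines (starting with "@"), reads aligned to the contig
# (column 3, RNAME), and reads whose mate is on the contig (column 7, RNEXT).
keep_contig_with_mates() {
  awk -F '\t' -v c="$1" '$0 ~ /^@/ || $3 == c || $7 == c'
}
```

It would be used as `samtools view -h input.bam | keep_contig_with_mates 1`. When a mate sits on the same contig as its read, SAM writes RNEXT as "=", so those pairs are picked up through column 3. The column 7 test catches reads mapped elsewhere whose mate is on the requested contig.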

sum732 commented 5 years ago

okay thanks. Much appreciated. Will close