Closed jasvinderahuja closed 4 years ago
Hi, I recently changed the code. It looked fine to me, but maybe there is a bug.
Your Dropbox link returns a 404.
Thank you for the script; it is a lifesaver, so simple and effective! I see the headers have changed and become even more informative. My error was due to an unexpected FASTA file, with some reads spanning the interval more than once. It gave me the opportunity to learn pysam, which alerted me to the error. You may find it useful to add compatibility for such files, or to give a more informative error in such cases.
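For anyone who hits the same thing, here is a minimal sketch of the kind of check that can reveal such reads. It is not the pysam code mentioned above: it parses plain SAM text (for example the output of `samtools view -h`) with only the Python standard library. The function name is hypothetical, and it assumes that "spanning the interval more than once" shows up as several mapped records sharing one read name.

```python
from collections import Counter


def reads_with_multiple_alignments(sam_lines):
    """Return {read_name: record_count} for reads with more than one mapped record.

    `sam_lines` is any iterable of SAM text lines (header lines are skipped).
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record (11 mandatory fields)
        flag = int(fields[1])
        if flag & 0x4:
            continue  # 0x4 = segment unmapped
        counts[fields[0]] += 1  # field 1 is QNAME, the read name
    return {name: n for name, n in counts.items() if n > 1}


# Tiny made-up example: r1 has a primary and a supplementary record.
example = [
    "@HD\tVN:1.6\n",
    "r1\t0\tref\t1\t60\t5M\t*\t0\t0\tACGTA\tIIIII\n",
    "r1\t2048\tref\t10\t60\t2H3M\t*\t0\t0\tGTA\tIII\n",
    "r2\t0\tref\t3\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
]
print(reads_with_multiple_alignments(example))
```

Reads reported here are worth inspecting by hand before running a per-base tool on the whole BAM.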
Verify
Java version: 1.8.0_262
Subject of the issue
It works on some files and not on others. For most files, sam2tsv produces extraordinarily large output (> 100 GB). I am using PacBio reads and have removed unmapped reads. Peculiarly, the output contains rows with READ-POS0 = ., READ-BASE = ., READ-QUAL = ., REF-POS1 = . and CIGAR-OP = H, and yet it still goes on...
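Rows with CIGAR-OP = H and `.` in the read and reference columns are what a hard clip looks like: in the SAM specification, H consumes neither query nor reference bases. A rough way to see whether hard clipping dominates a record is to total the CIGAR operations. This is only a sketch; the function names are hypothetical and it says nothing about how sam2tsv itself walks the CIGAR.

```python
import re
from collections import Counter

# One CIGAR element: a length followed by one of the SAM operation codes.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_op_lengths(cigar):
    """Return total length per CIGAR operation, e.g. '3H5M' -> {'H': 3, 'M': 5}."""
    if cigar == "*":  # CIGAR unavailable
        return Counter()
    parsed = CIGAR_RE.findall(cigar)
    # Reject strings that contain anything besides valid elements.
    if "".join(n + op for n, op in parsed) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    totals = Counter()
    for n, op in parsed:
        totals[op] += int(n)
    return totals


def hard_clip_fraction(cigar):
    """Fraction of the summed CIGAR lengths that is hard clipping (H)."""
    totals = cigar_op_lengths(cigar)
    total = sum(totals.values())
    return totals["H"] / total if total else 0.0


# Made-up CIGAR with a long hard clip in front of a short alignment.
print(cigar_op_lengths("5000H20M2I3D"))
print(hard_clip_fraction("5000H20M2I3D"))
```

If most records in a BAM have a fraction close to 1, long hard-clipped segments are a plausible contributor to the output size.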
Your environment
Steps to reproduce
I have shared the files at this link: https://www.dropbox.com/sh/1ermle431f47cze/AADNlOZtsIfZN0JpCp5tG0Wca?dl=0
cmd: java -jar /home/ahujajs/modulefiles/jvarkit/dist/sam2tsv.jar -R STE50toFUS1_S4921.fasta PCR_571_gatk.bam > PCR_571_gatk.sam2tsv.out
Expected behaviour
A TSV file.
Actual behaviour
It makes a huge (> 100 GB) TSV file and keeps calling differences even after the alignments have run off.