nhansen / SVanalyzer

Tools for the analysis of structural variation in genomes
http://svanalyzer.readthedocs.io/

Problems with the output file of the delta2sam.pl script #15

Closed. LichengXX closed this issue 1 year ago.

LichengXX commented 1 year ago

Hi Nancy, I am Li cheng, an M.A. student at HZAU, Wuhan. I used your delta2sam.pl script, but the fifth column of my SAM file, i.e. the MAPQ value, is '0' on every line. After I converted the SAM file to BAM, the file I got by extracting the unmapped sequences from that BAM was empty. When I aligned the two genomes with minimap2 instead, the resulting SAM files were fine. Is this a problem with my delta file from Nucmer, or with the way I used the delta2sam.pl script? The script itself ran without any errors. Parts of my delta file, my SAM file, and the minimap2 SAM file are attached. I look forward to and appreciate your reply.

Best wishes,
Li cheng

Attachments: Part_of_delta_file.txt Part_of_delta2sam,pl_outfile.txt Part_of_minimap2_out_sam.txt

nhansen commented 1 year ago

Hi Li cheng,

The excerpts of the sam files you attached don't seem to have full lines (when I view them, they have lines that are truncated after 158 characters with a ">" character afterward) and the last line has no line feed character. Could this be the reason the sam file is not converting correctly to bam? Do you get an error when converting from sam to bam (with samtools, I'm guessing?)

The 0 map quality is due to my not having implemented either (a) some prescription for converting delta file scores to map quality scores or (b) an option to specify a different mapping quality to be applied to all alignments. In any case, 0 map quality values shouldn't affect your ability to convert to bam format or view alignments.
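
As an illustration of option (b), here is a minimal sketch of a post-processing step that overwrites the MAPQ column (the fifth tab-separated field) of each alignment line with one constant value. The function name `set_constant_mapq` and the default of 60 are assumptions made for this example; nothing like this exists in delta2sam.pl.

```python
def set_constant_mapq(sam_line: str, mapq: int = 60) -> str:
    """Return sam_line with its MAPQ field replaced by `mapq`.

    Hypothetical helper for illustration only, not part of SVanalyzer.
    """
    if not 0 <= mapq <= 255:
        raise ValueError("MAPQ must be between 0 and 255")
    # Header lines start with '@' and carry no MAPQ field.
    if sam_line.startswith("@"):
        return sam_line
    newline = "\n" if sam_line.endswith("\n") else ""
    fields = sam_line.rstrip("\n").split("\t")
    # A SAM alignment line has at least 11 mandatory fields.
    if len(fields) < 11:
        raise ValueError("not a complete SAM alignment line")
    fields[4] = str(mapq)  # MAPQ is the fifth column (index 4)
    return "\t".join(fields) + newline


# Made-up alignment record with MAPQ 0, for demonstration only.
record = "contig1\t0\tchr1\t100\t0\t10M\t*\t0\t0\tACGTACGTAC\t*\n"
print(set_constant_mapq(record), end="")
```

A constant value like this says nothing about how reliable an individual alignment is; it only helps when a downstream tool filters on MAPQ.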

Hope that helps. Please provide those example sam files with full length lines, and I'll take a look at the code to see if there's something buggy going on. Thanks!

LichengXX commented 1 year ago

Thank you for your explanation. I now understand that the problem has nothing to do with the MAPQ values. Attached are part of my SAM file, this time with full-length lines, and my delta file. My flag values look very strange to me: all of them are 0 or 16, so I think the problem might be in my SAM file. (My samtools command: samtools view -f 4 R18.sam > R18.unmapped) samtools does not find the value 4 in any of the flags, so my resulting file is empty. I don't know whether the unusual flag values are caused by my Nucmer delta file or by using the delta2sam.pl script with the wrong parameters. I am really looking forward to your answer.

Best wishes!

Attachments: R18.sam.txt Part_of_delta_file.txt

nhansen commented 1 year ago

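Your -f 4 option is the reason no alignments are being printed. With -f 4, samtools view prints only the unmapped reads, i.e. records that have the 4 bit set in their flag value. Since delta2sam.pl is creating a sam file from a set of mapped reads, none of them have the 4 bit set in their flag value. Flag values of 0 and 16 are normal for mapped records: 0 is a forward-strand alignment and 16 is a reverse-strand alignment.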
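
To make the flag logic concrete, here is a minimal sketch of the bitwise test behind `-f`. The bit values 0x4 (unmapped) and 0x10 (reverse strand) come from the SAM specification; the function name `passes_f` is made up for this example and is not a samtools or SVanalyzer API.

```python
FLAG_UNMAPPED = 0x4   # 4: segment unmapped
FLAG_REVERSE = 0x10   # 16: sequence is reverse complemented


def passes_f(flag: int, required: int) -> bool:
    """Mimic `samtools view -f required`: keep a record only if
    every bit in `required` is set in its flag."""
    return (flag & required) == required


# Flags seen in the delta2sam.pl output: 0 (forward) and 16 (reverse).
delta2sam_flags = [0, 16]
print([f for f in delta2sam_flags if passes_f(f, FLAG_UNMAPPED)])  # prints []

# For comparison, flags that do carry the unmapped bit.
print([f for f in [0, 16, 4, 20] if passes_f(f, FLAG_UNMAPPED)])  # prints [4, 20]
```

Neither 0 nor 16 contains the 4 bit, so a `-f 4` filter keeps nothing from a file in which every record is a mapped alignment.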

LichengXX commented 1 year ago

I got it! Thank you so much! Best wishes to you!