Open arpit20328 opened 2 months ago
Hello,
Thank you for reaching out.
You should NOT sort the bam files by coordinates. This will lead to incorrect results.
Since you are filtering using --besthit
option, the input should be sorted by read name (QNAME
) and not coordinates. Please see last paragraph of Section 4 in the documentation.
For profiling, again it is expected that bam file is sorted by QNAME
. Please see the warning in the 2nd paragraph of Section 5 in the documentation.
Finally, we recommend you to use the filtered bam from msamtools filter
as input to msamtools profile
, not the unfiltered bam file as you have shown. So I would suggest:
msamtools filter -b -l 80 -p 95 -z 80 --besthit bacterial_alignment_normal.bam > sample.bam
msamtools profile --multi=proportional --label=sample1 --unit=rel -o sample1.IGC.profile_normal.txt.gz sample.bam
or even simpler with piping:
msamtools filter -b -l 80 -p 95 -z 80 --besthit bacterial_alignment_normal.bam \
| msamtools profile --multi=proportional --label=sample1 --unit=rel -o sample1.IGC.profile_normal.txt.gz -
Hope this helps. Please let us know if you have any other questions.
@microbiomix I will get back to you once I apply it. thanks for initiating a discussion
So I generated sample.bam file
I also generated its sorted_sample.bam file as well
I then use msamtools
msamtools filter -b -l 80 -p 95 -z 80 --besthit bacterial_alignment_normal.bam > sample.bam msamtools filter -b -l 80 -p 95 -z 80 --besthit bacterial_alignment_sorted.bam > sorted_sample.bam
msamtools profile --multi=proportional --label=sample1 --unit=rel -o sample1.IGC.profile_normal.txt.gz bacterial_alignment_normal.bam
msamtools profile --multi=proportional --label=sample1 --unit=rel -o sample1.IGC.profile_sorted.txt.gz bacterial_alignment_sorted.bam
I am seeing different results in both sample1.IGC.profile_normal.txt.gz and in sample1.IGC.profile_normal.txt.gz
Please let me know what is the correct way ? using normal generated bam file or sorted bam file ?