DiltheyLab / MetaMaps

Long-read metagenomic analysis
Other
96 stars 23 forks source link

plotIdentities_EM.R 'Genome-wide coverage over all contigs' read count inaccurate for pages with multiple mapping units being plotted. #79

Open Kamranaway opened 1 month ago

Kamranaway commented 1 month ago

'Genome-wide coverage over all contigs' (the red & blue plot) read count is wrong for pages that contain multiple mapping units. Verified this issue by manually looking up the total instances of a unique taxon ID and found the instances did not match what the PDF page reports.

for(mappingUnitI in 1:length(mappingUnits)) { reads_count <- reads_count + countsPerUnit[[mappingUnit]] mappingUnit <- mappingUnits[[mappingUnitI]] # reads_count <- reads_count + countsPerUnit[[mappingUnit]]

I noticed that the variable mappingUnit is used in this for loop before being updated to the correct mappingUnit. As such, mappingUnit is defined based on earlier lines in the script, which causes read_count to be updated to an inaccurate value.