We noticed SARS-CoV-2 hits in the Kraken report table of the taxonomy report. Kraken2, which generates this table, uses only the unaligned reads, so ideally these hits should not be there. To get a sense of how many reads are not aligned in the first step and therefore end up in the taxonomy table, we thought it would be good to add more information about this to the report.
This information should include:
total number of reads
number of reads aligned to SARS-CoV-2
number of unaligned reads
number of SARS-CoV-2 hits among the unaligned reads
The format could maybe be a diagram? What matters is the information: how many SARS-CoV-2 reads are we missing in the alignment step, and should that worry us?
This could be done in the markdown file that generates the report, but it needs either the total number of reads or the number of aligned reads as input (the other numbers can be taken from the Kraken report file). This could be done with a small bash script.
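As a rough illustration of the calculation, here is a minimal sketch in Python (rather than bash, purely for readability). It assumes the standard six-column, tab-separated Kraken2 report (percentage, clade reads, direct reads, rank code, NCBI taxid, name) and that the total read count is supplied from elsewhere. The function name `summarize_reads`, the `missed_fraction` field, and the sample numbers are made up for the example and are not part of the pipeline.

```python
import csv

SARS_COV_2_TAXID = "2697049"  # NCBI taxonomy ID for SARS-CoV-2


def summarize_reads(kraken_report_lines, total_reads):
    """Derive the four requested numbers from a Kraken2 report.

    kraken_report_lines: iterable of lines from a standard 6-column report.
    total_reads: total number of reads before alignment (supplied externally).
    """
    clade_reads = {}
    for row in csv.reader(kraken_report_lines, delimiter="\t"):
        if len(row) < 6:
            continue  # skip blank or malformed lines
        # Column 2 is the read count for the clade, column 5 the taxid.
        clade_reads[row[4].strip()] = int(row[1])

    # Kraken2 only sees the unaligned reads, so its input size is
    # unclassified (taxid 0) plus everything classified under root (taxid 1).
    unaligned = clade_reads.get("0", 0) + clade_reads.get("1", 0)
    sars_in_unaligned = clade_reads.get(SARS_COV_2_TAXID, 0)
    aligned = total_reads - unaligned

    # Share of all putative SARS-CoV-2 reads that the alignment step missed.
    sars_total = aligned + sars_in_unaligned
    missed_fraction = sars_in_unaligned / sars_total if sars_total else 0.0

    return {
        "total": total_reads,
        "aligned": aligned,
        "unaligned": unaligned,
        "sars_in_unaligned": sars_in_unaligned,
        "missed_fraction": missed_fraction,
    }


# Invented example data, only to show the expected input shape.
SAMPLE_REPORT = (
    " 50.00\t500\t500\tU\t0\tunclassified\n"
    " 50.00\t500\t10\tR\t1\troot\n"
    "  2.00\t20\t20\tS\t2697049\t"
    "      Severe acute respiratory syndrome coronavirus 2\n"
)

summary = summarize_reads(SAMPLE_REPORT.splitlines(), total_reads=10000)
for key, value in summary.items():
    print(f"{key}\t{value}")
```

If the number of aligned reads is available instead of the total, the total can be recovered as aligned plus unaligned. The `missed_fraction` value is one possible way to answer the "should that worry us?" question, and the resulting counts could feed a simple bar chart in the report.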