mbhall88 / head_to_head_pipeline

Snakemake pipelines to run the analysis for the Illumina vs. Nanopore comparison.
GNU General Public License v3.0

Get coverage info for all samples #73

Closed mbhall88 closed 3 years ago

mbhall88 commented 3 years ago

A key component of the drug resistance prediction will be investigating what part (if any) coverage plays in Nanopore prediction ability.
Given that we are working only with reads we know map to H37Rv, the theoretical coverage (i.e. total number of bases divided by genome size) should be sufficient. This value is already present in the rasusa log files from subsampling in the QC pipeline.
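For reference, a minimal sketch of the theoretical coverage calculation described above, in case it needs to be recomputed outside the rasusa logs. This is not code from the pipeline: the function names are hypothetical, the genome size is assumed to be the length of the H37Rv reference NC_000962.3 (4,411,532 bp), and the FASTQ is assumed to use standard four-line records.

```python
import gzip
from pathlib import Path

# Assumption: length of the H37Rv reference (NC_000962.3) in base pairs.
H37RV_GENOME_SIZE = 4_411_532


def total_bases(fastq_path):
    """Sum the read lengths in a FASTQ file (plain or gzipped).

    Assumes four-line records, i.e. sequences are not wrapped over multiple lines.
    """
    path = Path(fastq_path)
    opener = gzip.open if path.suffix == ".gz" else open
    n_bases = 0
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # the sequence is the second line of each record
                n_bases += len(line.strip())
    return n_bases


def theoretical_coverage(n_bases, genome_size=H37RV_GENOME_SIZE):
    """Theoretical coverage: number of bases divided by genome size."""
    return n_bases / genome_size
```

Usage would look something like `theoretical_coverage(total_bases("sample.fastq.gz"))`, where `sample.fastq.gz` is a placeholder for a filtered read file.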

One question, though, is whether we want to use these "filtered" reads or the "real" (unfiltered) data, as the latter is probably what people will commonly have. I am happy to use the filtered reads; I just wanted to make sure we addressed this.

iqbal-lab commented 3 years ago

People will have easy access to properly filtered data: they can easily map to H37Rv, and will want to remove human reads. I.e. go ahead with filtered.

mbhall88 commented 3 years ago

I had forgotten that this information is produced by the QC pipeline in https://github.com/mbhall88/head_to_head_pipeline/commit/6e05dc588c9dc1d31e744ef98434632471c83aba