nf-core / taxprofiler

Highly parallelised multi-taxonomic profiling of shotgun short- and long-read metagenomic data
https://nf-co.re/taxprofiler
MIT License

Decontamination: add hostile for host removal #346

Open jfy133 opened 1 year ago

jfy133 commented 1 year ago

Description of feature

Add the HOSTILE tool for host decontamination

https://www.biorxiv.org/content/10.1101/2023.07.04.547735v1

https://github.com/bede/hostile

LilyAnderssonLee commented 1 year ago

It's interesting to see the various reference genome options available in HOSTILE, such as combining the human T2T genome with HLA, argos985 and mycob140. This combination can filter out human reads from the MTC region and retain more microbial reads. For additional details, please check: https://github.com/bede/hostile/issues/27

jfy133 commented 1 year ago

I think the reference genome is often the most important factor!

jfy133 commented 1 year ago

So indeed, this will be helpful.

LilyAnderssonLee commented 1 year ago

@jfy133 I agree with you. In our clinical cases, we quite often see false positives caused by human rRNA, so I am considering including human rRNA databases in the reference genome. Even though the human Telomere-to-Telomere (T2T) assembly gives better results than GRCh38, assembling rRNA remains a challenge because it is highly repetitive and varies significantly between individuals, across ages, and even in certain types of cancer.

jfy133 commented 1 year ago

Not a bad idea! But I'm not sure if at that point it is worth investing so much into the host reference, rather than checking whether the microbial reference genomes are clean...