jodyphelan / TBProfiler

Profiling tool for Mycobacterium tuberculosis to detect resistance and strain type from WGS data
GNU General Public License v3.0
105 stars, 43 forks

No response for a long time on some samples #404

Closed: abcdtree closed this issue 1 week ago

abcdtree commented 2 weeks ago

Hi Dear team,

Thank you very much for providing such a good tool.

I was trying to run tb-profiler on long-read data (ONT).

Here is the command I used:

tb-profiler profile -t 20 -1 my.fastq.gz --ram 100 -m nanopore -d myoutput -p prefix --txt

There are two samples of similar size (1.1 GB each). One was analysed in less than 20 minutes. The other has been running for 40 hours and still shows:

Calling variants:   0%|          | 0/73 [00:00<?, ?it/s]

I have 12 samples in total, all around 1 GB in size, but only 5 of them ran smoothly; the others got stuck at the same point. Do you have any idea what makes the analysis stall on the second sample? Have you come across this issue before?

Thanks,

Josh

jodyphelan commented 2 weeks ago

Hi @abcdtree

The default variant caller is currently freebayes, which works very well for Illumina data but struggles with the higher error rate of nanopore reads. Try changing it to bcftools with --caller bcftools and see if that works any better for you. I'm currently in the process of implementing clair3 as a variant caller for Linux systems.

abcdtree commented 2 weeks ago

@jodyphelan Thanks so much.

abcdtree commented 1 week ago

@jodyphelan Just letting you know that --caller bcftools fixed the problem. Thank you for your help.

jodyphelan commented 1 week ago

Great to hear!