CSB5 / lofreq

LoFreq Star: Sensitive variant calling from sequencing data
http://csb5.github.io/lofreq/

Substitution and indel tests #47

Closed zeus19900814 closed 7 years ago

zeus19900814 commented 7 years ago

Hi Andreas,

After a LoFreq run, I saw messages like "Number of substitution tests and indel tests are: ... respectively". What do these tests do? Are they necessary for LoFreq and for the output? If they can be turned off, how can I do that? I didn't find an input option for it.

Thank you very much, Zheng

andreas-wilm commented 7 years ago

Hi Zheng,

the number of tests is recorded for later multiple-testing correction. The Bonferroni-corrected p-values (i.e. quality values) are computed on the fly, and then the default cutoff of 5% is applied. I wouldn't turn this off, but you can, by hard-setting the Bonferroni factor to 1: lofreq call -b 1

Andreas
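
For readers unfamiliar with the correction described above, here is a minimal Python sketch of a plain Bonferroni adjustment. It is not LoFreq's actual code; the function names and the example p-values are made up for illustration. It only shows why the number of tests matters and why a factor of 1 effectively switches the correction off.

```python
def bonferroni_correct(p_value, num_tests):
    """Multiply the raw p-value by the number of tests, capped at 1.0."""
    return min(1.0, p_value * num_tests)


def is_significant(p_value, num_tests, alpha=0.05):
    """Apply the cutoff (5% by default) to the corrected p-value."""
    return bonferroni_correct(p_value, num_tests) < alpha


# Hypothetical raw p-value for a single candidate variant.
raw_p = 1e-3

# With many tests the same raw p-value no longer passes the cutoff ...
print(is_significant(raw_p, num_tests=100))

# ... whereas a factor of 1 leaves the raw p-value unchanged.
print(is_significant(raw_p, num_tests=1))
```

With a factor of 1 every raw p-value below the cutoff is reported, which is why more false positives should be expected when the correction is disabled.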

zeus19900814 commented 7 years ago

Hi Andreas,

Thank you very much for the answer. My current problem is that I ran LoFreq (not call-parallel) on a 4 GB BAM file against the hg38 reference and it took 104 hours to finish. Is this expected? Is there any way to speed it up, either by changing input options or even by tweaking the source code a little?

Thanks again, Zheng

andreas-wilm commented 7 years ago

This sounds strange. Is this from targeted sequencing and ultra high coverage?

zeus19900814 commented 7 years ago

Yes. This is from targeted sequencing. But I am not sure about the coverage situation.

andreas-wilm commented 7 years ago

Can you try running with -d 10000, which will cap the coverage at that value, and see how it does?

zeus19900814 commented 7 years ago

Yeah. I will try with that and let you know how that goes. Thank you!

zeus19900814 commented 7 years ago

I ran it with 16 threads and it took ~44 hours. With -d 10000 in addition, it took ~21 hours.

andreas-wilm commented 7 years ago

Hm... that's still a lot. Could you determine the mean and median coverage, e.g. from the VCF files? One way (assuming you have datamash installed) would be: zgrep -v '^#' your.vcf.gz | cut -f 8 | cut -f 1 -d ';' | cut -f 2 -d= | datamash mean 1 median 1

Thanks
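
If datamash is not available, roughly the same numbers can be obtained with a short Python script. This is only a sketch: it assumes the depth is stored under a DP key in the INFO column (column 8) of each record, and the function names and file name are placeholders. Unlike the shell pipeline, it looks DP up by name instead of assuming it is the first INFO entry. Note that either approach only sees depth at called variant sites, not at every covered position.

```python
import gzip
import statistics


def read_depths(vcf_path):
    """Collect the DP value from the INFO column of every VCF record."""
    opener = gzip.open if vcf_path.endswith(".gz") else open
    depths = []
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 8:
                continue  # not a complete VCF record
            for entry in fields[7].split(";"):
                key, _, value = entry.partition("=")
                if key == "DP":
                    depths.append(int(value))
                    break
    return depths


def coverage_summary(vcf_path):
    """Return (mean, median) depth over all records that carry a DP value."""
    depths = read_depths(vcf_path)
    return statistics.mean(depths), statistics.median(depths)


if __name__ == "__main__":
    # "your.vcf.gz" is a placeholder, as in the shell one-liner above.
    print(coverage_summary("your.vcf.gz"))
```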

zeus19900814 commented 7 years ago

The run with 16 threads gives a mean of 779.67501178689 and a median of 43. The run with 16 threads and -d 10000 gives a mean of 316.02935169793 and a median of 42.

andreas-wilm commented 7 years ago

So you seem to have massive coverage spikes. Am I right in assuming you didn't input a BED file to tell LoFreq about your targeted positions?
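
To illustrate why a mean far above the median points to coverage spikes, here is a small Python sketch with made-up depths (not taken from the data in this thread). A handful of extremely deep positions pulls the mean up sharply while the median stays near the typical depth.

```python
import statistics


def summarize(depths):
    """Return (mean, median) of a list of per-position depths."""
    return statistics.mean(depths), statistics.median(depths)


# Synthetic example: 95 positions at 40x and 5 spike positions at 15000x.
toy_depths = [40] * 95 + [15000] * 5

mean_depth, median_depth = summarize(toy_depths)
print(mean_depth, median_depth)    # mean 788, median 40
print(mean_depth / median_depth)   # a large ratio hints at a skewed depth distribution
```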

zeus19900814 commented 7 years ago

You are right. I didn't use a BED file. Does using a BED file make much of a difference to LoFreq's runtime?

andreas-wilm commented 7 years ago

It does, though not to such an extent. But please do try.