Open jielab opened 7 years ago
I'm working on that. That does seem unreasonably slow. I will follow up on this.
Xiaowei
On Apr 30, 2017, at 12:22 AM, jiehuang001 notifications@github.com wrote:
Please see my comment on #25: it took 43356 seconds to run a regression on 93 SNPs when I used --siteFile. But when I first used bcftools to extract those 93 SNPs into a new VCF, which takes a minute, the same analysis took only 109 seconds.
So I think there is something VERY WRONG with this --siteFile option. I just want to point this out so that others don't run into the same issue.
best regards, Jie
Can you please remind me of the number of sites specified by the --siteFile option? I just want to confirm that with you, as I suspect that a very large number of sites slows down the analysis.
Thanks.
93 sites
From: zhanxw [mailto:notifications@github.com] Sent: May 2, 2017 16:10 To: zhanxw/rvtests rvtests@noreply.github.com Cc: jiehuang001 jiehuang001@gmail.com; Author author@noreply.github.com Subject: Re: [zhanxw/rvtests] unbelievable slow speed for --siteFile (#26)
@jiehuang001 I have optimized the --siteFile option to improve its speed. However, you may want to consider using --rangeFile instead.
Since you have a small number of variants (93) to analyze, I would recommend using --rangeFile. This option lets RVTESTS use the VCF index file, so that RVTESTS reads in only these variants and analyzes them.
When you have lots of variants, --siteFile is more appropriate, as RVTESTS will read in every variant but analyze only the variants specified in --siteFile.
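For anyone who already has a site list and wants to try --rangeFile instead, a small conversion along these lines might help. This is only a sketch under assumptions: it assumes each site line is either "chrom pos" (whitespace-separated) or "chrom:pos", and that a range line looks like "chrom:start-end". Neither format is confirmed in this thread, so please check the RVTESTS documentation first. The function name `sites_to_ranges` is hypothetical.

```python
def sites_to_ranges(site_lines):
    """Turn single-site lines into one-base ranges.

    Assumed input format (not confirmed here): "chrom pos" or "chrom:pos".
    Assumed output format (not confirmed here): "chrom:pos-pos".
    """
    ranges = []
    for line in site_lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Accept either "1 12345" or "1:12345".
        fields = line.replace(":", " ").split()
        chrom, pos = fields[0], int(fields[1])
        # A single site becomes a range that starts and ends at the same position.
        ranges.append(f"{chrom}:{pos}-{pos}")
    return ranges


if __name__ == "__main__":
    example = ["1 12345", "1:67890", "", "X\t555"]
    for r in sites_to_ranges(example):
        print(r)
```

Writing the returned lines to a file, one per line, would give something to pass to --rangeFile, provided the VCF is indexed and the assumed range format matches what RVTESTS expects.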