sherman5 / RVS

Computes estimates of the probability of related individuals sharing a rare variant.

Expected Run Time for RVS function multipleVariantPValue #5

Open klmartinez opened 5 years ago

klmartinez commented 5 years ago

When I ran the RVS function multipleVariantPValue on 2 exomes, it took approximately 30 minutes. However, when I try to run the same function on 5 genome families, it has been running for at least 10 days.

What is an expected run time for 5 genome families with this function?

I am trying to run this on my university's HPC, but we have maximum wall times of 10 days. When the job is terminated, there are never any errors, just a message saying that it needed more time. I just want to make sure that this length of time is somewhat expected and doesn't instead point to another issue.

Thank you!

sherman5 commented 5 years ago

That's not expected at all; the running time should be on the scale of minutes, not days. What is the size of the SnpMatrix that you are passing to the function?

klmartinez commented 5 years ago

The dimensions of my SnpMatrix are [13, 3496630].

sherman5 commented 5 years ago

We haven't previously tested RVS on data that big, so we were unaware of this bottleneck. I'm currently working on a solution (see PR #6).
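
For anyone trying to tell a slow-but-expected run from a scaling bottleneck like the one discussed above, one rough approach is to time the analysis on two small column subsets, estimate how the run time grows with the number of variants, and extrapolate to the full matrix before submitting a long HPC job. The sketch below is not part of RVS and makes no claim about how multipleVariantPValue actually scales: the helper names are hypothetical, the timings are supplied by the caller, and the power-law growth model is an assumption. Only the column count (3496630) comes from the thread.

```python
import math
import time

# Number of variants (columns) reported in this thread.
N_VARIANTS = 3_496_630


def time_call(func, n, clock=time.perf_counter):
    """Return the wall-clock seconds taken by func(n).

    `func` is a hypothetical stand-in for "run the analysis on the
    first n variants"; it is not an RVS function.
    """
    start = clock()
    func(n)
    return clock() - start


def scaling_exponent(t_small, t_large, n_small, n_large):
    """Estimate k in t ~ n**k from two timings (slope on a log-log plot).

    k close to 1 suggests linear growth; k well above 1 suggests a
    super-linear bottleneck. Assumes a power-law model, which is only
    an approximation.
    """
    return math.log(t_large / t_small) / math.log(n_large / n_small)


def extrapolate(t_measured, n_measured, n_total, exponent=1.0):
    """Project the run time at n_total variants from one measurement,
    under the assumed power-law exponent."""
    return t_measured * (n_total / n_measured) ** exponent


if __name__ == "__main__":
    # Made-up timings for illustration only: 1.0 s at 1,000 variants
    # and 4.0 s at 2,000 variants.
    k = scaling_exponent(1.0, 4.0, 1_000, 2_000)
    projected = extrapolate(4.0, 2_000, N_VARIANTS, exponent=k)
    print(f"estimated exponent: {k:.2f}")
    print(f"projected run time: {projected / 86_400:.1f} days")
```

In practice the two timings would come from running the real function on column subsets of the SnpMatrix (for example under R's system.time) and passing the measured seconds to scaling_exponent and extrapolate. A projected run time far beyond the wall-time limit, or an exponent well above 1, is a sign to report the problem rather than resubmit the job.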