JiscaH / sequoia

R package for pedigree inference based on SNP data

Sequoia running forever #16

Closed by vunguyen1907 1 year ago

vunguyen1907 commented 2 years ago

Hi.

I am running sequoia on 4702 genotyped animals, using SNP sets of 30K, 500, and 200 SNPs. All 4702 animals were born in the same year and have no pedigree; they come from fish cages and were produced by m x n parents, but I do not know how many. I would like to test whether a pedigree can be reconstructed from these data. With 200 SNPs, sequoia has been running non-stop for almost a week without any output, both on my Mac (16 GB) and on an HPC (5 cores, 124 GB). I am using Sequoia_notR.

Could you please help me speed up the run and/or advise on parameter settings for my data?

Much appreciated!

Vu Nguyen

JiscaH commented 2 years ago

Hello Vu,

What I think is happening is that your 200 SNPs do not contain enough information to infer relationships, so it keeps 'wobbling' between alternative solutions. It should be possible to check whether this is going on by looking at how the total likelihood changes over time. Adding the flag --verbose may give extra information about this.
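As a rough illustration of that check, here is a minimal Python sketch that classifies the tail of a series of total likelihood values, one per iteration. It assumes you have already collected those values yourself; it does not parse sequoia output, and the function name `classify_trace`, the `window` and `tol` parameters, and the numbers in the example are all hypothetical.

```python
from typing import Sequence


def classify_trace(loglik: Sequence[float], window: int = 6, tol: float = 1e-6) -> str:
    """Classify the last `window` steps of a total-likelihood trace.

    Returns one of "too short", "converged", "improving",
    "worsening", or "wobbling".
    """
    if len(loglik) < window + 1:
        return "too short"

    # Keep only the most recent iterations.
    tail = list(loglik[-(window + 1):])
    # Change in total likelihood between consecutive iterations.
    steps = [b - a for a, b in zip(tail, tail[1:])]

    ups = sum(s > tol for s in steps)
    downs = sum(s < -tol for s in steps)

    if ups == 0 and downs == 0:
        return "converged"   # no meaningful change any more
    if downs == 0:
        return "improving"   # still climbing steadily
    if ups == 0:
        return "worsening"   # steadily dropping
    return "wobbling"        # alternating up and down


# Made-up values, purely for illustration.
trace = [-5200.0, -5100.0, -5080.0, -5095.0, -5079.0, -5096.0, -5080.0, -5095.0]
print(classify_trace(trace))
```

A result of "wobbling" over many iterations would be consistent with the run switching back and forth between alternative solutions rather than settling on one.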

One thing that may help is to use an ageprior specifying that generations do not overlap, i.e. that any 2nd degree relatives present in the sample are half-siblings, and not aunts/uncles. This may perhaps not be strictly true, but is a common and perhaps in your case necessary assumption to make. See https://jiscah.github.io/articles/vignette_age/book/pedigree-based-ageprior.html#sec:Discrete for more details.

Another parameter that will strongly affect performance and runtime is the assumed error rate: as the assumed and/or actual error rate increases, runtime increases a lot; see https://jiscah.github.io/articles/vignette_main/book/key-points.html#effect-on-runtime

I hope this helps! And if it does work OK with 30K and 500 SNPs (I hope?), it may simply not be feasible with 200 SNPs.

Best, Jisca

vunguyen1907 commented 2 years ago

Dear Jisca,

Thanks for your suggestions.

Regards,

Vu