DivyaratanPopli / Kinship_Inference

This is a tool to estimate pairwise relatedness from ancient DNA, taking into account contamination, ROH, and ascertainment bias.
GNU General Public License v3.0

KinGAROO freezes when estimating hapProbs #16

Closed. pierrespc closed this issue 5 months ago

pierrespc commented 9 months ago

Hi,

I can't get KinGAROO to finish the hapProbs estimation. I am running it on 6 test samples. The chrm{chr}.bam and chrm$chr.sorted.bam{,.bai} files were created by the software, and it then moved on to estimating the hapProbs. It was running well, as it created the chrm$chr.{diffs,probs} files for 5 samples and all 22 chromosomes. But for the last sample in the list it got stuck and did not create these files for chromosomes 4 and 14. Note that the program did not exit and is still running, but there is no error in stderr and stdout only shows "Indexing ". Have you noticed this behaviour before? What can be done to keep the software from freezing?

Thanks.

Best

DivyaratanPopli commented 9 months ago

Hi, I have not seen this behavior. Can you run the script with 5 samples to see if there is still a problem? Is the 6th file much bigger than the rest (is it shotgun data)? You can check with htop whether the program is doing something or is just idle; maybe it just requires more time. If it is shotgun data, you can filter out all the monomorphic sites from the bed file to make things faster.

pierrespc commented 9 months ago

Hi,

Thanks for your answer. I have now started a run on 5 samples, as you suggested. Note that the bed file I am giving as input for this test is the bed file for the 1240K SNP panel. Also, the program was not doing anything and stayed in the same state over the whole weekend. I'll let you know how it goes with the 5-sample test. Best

pierrespc commented 9 months ago

Ok, it went through the hapProbs process. I got an error much later ("hbd_hmm_functions.py:355: RuntimeWarning: invalid value encountered in long_scalars goodprop=np.sum((harr>=0.7) & (harr<1.1))/np.sum(harr<=1.1)"), but I guess it is because one genome is at very low coverage. I am now trying with genomes at higher coverage. Thanks!
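
For context on the warning quoted above: in NumPy, dividing one integer scalar by another when both are zero yields NaN and a RuntimeWarning rather than an exception (the exact wording of the message differs between NumPy versions). The quoted expression would do that if no value in harr is <= 1.1, for example if harr is empty. Whether that is what happened for the low-coverage genome is an assumption, not something confirmed in this thread. The sketch below is not the code from hbd_hmm_functions.py; good_proportion is a hypothetical helper that computes the same ratio with an explicit guard.

```python
import numpy as np


def good_proportion(harr):
    """Share of values <= 1.1 that also fall in [0.7, 1.1).

    Hypothetical helper mirroring the expression in the warning.
    Returns NaN explicitly when no value is <= 1.1, so the 0/0
    case is handled without triggering a RuntimeWarning.
    """
    harr = np.asarray(harr, dtype=float)
    denom = np.sum(harr <= 1.1)
    if denom == 0:
        # Nothing to normalise by (e.g. an empty array).
        return float("nan")
    numer = np.sum((harr >= 0.7) & (harr < 1.1))
    return float(numer / denom)
```

A caller can then test the result with np.isnan and decide how to treat a sample with no usable values, instead of letting NaN propagate silently.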

pierrespc commented 9 months ago

Ok. Now it ran OK, and with 5 samples I got results consistent with other methods. It looks like the frozen state was caused by insufficient memory. Today I had the same issue as before with samples at higher coverage. Although I didn't get any error, I cancelled the job and resubmitted it with more memory, and it ran OK. It is weird that when memory is not sufficient, the process freezes instead of returning an error. Now I'll scale up the analyses. Thanks for your help. Best
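
One generic way to turn a silent freeze into a visible failure is to launch the job from a small wrapper with a memory cap and a wall-clock timeout. This is a minimal sketch, assuming Linux and Python 3.7+; it is not part of KinGAROO, and run_with_limits and its parameters are hypothetical names. The command passed in is whatever you would normally run.

```python
import resource
import subprocess
import sys


def run_with_limits(cmd, max_bytes, timeout_s):
    """Run cmd with a cap on its address space and a wall-clock timeout.

    Hypothetical wrapper, Linux only. With the cap in place, an
    oversized allocation fails inside the child (a MemoryError in
    Python code). subprocess.TimeoutExpired is raised if the child
    runs longer than timeout_s seconds.
    """
    def cap_memory():
        # Executed in the child just before exec; lowers only the soft limit.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        limit = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (limit, hard))

    return subprocess.run(
        cmd,
        preexec_fn=cap_memory,
        timeout=timeout_s,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Demo: a child that tries to allocate 4 GiB under a 1 GiB cap.
    demo = [sys.executable, "-c", "x = bytearray(4 * 1024**3)"]
    result = run_with_limits(demo, max_bytes=1024**3, timeout_s=60)
    print(result.returncode, result.stderr.strip().splitlines()[-1:])
```

On a cluster, the scheduler's own memory request and time limit serve the same purpose; the wrapper is mainly useful for local test runs.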