DivyaratanPopli / Kinship_Inference

This is a tool to estimate pairwise relatedness from ancient DNA, taking into account contamination, ROH, and ascertainment bias.
GNU General Public License v3.0

Missing likfiles after KINgaroo #17

Closed (batelz closed 6 months ago)

batelz commented 8 months ago

Hi, I've installed KINgaroo and KIN, and it seems I've successfully run KINgaroo. When running KIN (KIN -I . -O .), I get the following output:

HMM run for pair I3970_v41.0_._I28593 # these are the correct samples

Merging files...
Traceback (most recent call last):
  File "/home/batel/Kinship_Inference/bin/KIN", line 8, in <module>
    sys.exit(__main__.main())
  File "/home/batel/Kinship_Inference/lib/python3.9/site-packages/KIN/__main__.py", line 74, in main
    helpers.hmm_all(
  File "/home/batel/Kinship_Inference/lib/python3.9/site-packages/KIN/hmm_scripts/helpers.py", line 79, in hmm_all
    hmm_results(outfolder, listf, allrel)
  File "/home/batel/Kinship_Inference/lib/python3.9/site-packages/KIN/hmm_scripts/helpers.py", line 66, in hmm_results
    relatable=getRelatable(filist=likfile, outfolder=outfolder, pairs=listf, rels=allrel)
  File "/home/batel/Kinship_Inference/lib/python3.9/site-packages/KIN/hmm_scripts/hmm_functions.py", line 431, in getRelatable
    lik=np.loadtxt(fi,dtype='float', delimiter = ",")
  File "/home/batel/Kinship_Inference/lib/python3.9/site-packages/numpy/lib/npyio.py", line 1067, in loadtxt
    fh = np.lib._datasource.open(fname, 'rt', encoding=encoding)
  File "/home/batel/Kinship_Inference/lib/python3.9/site-packages/numpy/lib/_datasource.py", line 193, in open
    return ds.open(path, mode, encoding=encoding, newline=newline)
  File "/home/batel/Kinship_Inference/lib/python3.9/site-packages/numpy/lib/_datasource.py", line 533, in open
    raise IOError("%s not found." % path)
OSError: ./likfiles/I3970_v41.0_._I28593.csv not found.

I do see output from KINgaroo, as well as the directory ./likfiles, but it's empty.

Any idea what could be the issue?

DivyaratanPopli commented 8 months ago

Can you check that all the input files are as described in the README on GitHub? If everything is OK there, can you then check the output files from KINgaroo (input_diffs_hmm.csv, input_total_hmm.csv)? These files contain the number of differences and the total number of overlapping sites between each pair of individuals, in windows along the genome.
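A quick way to run that check is sketched below. It is only a sketch under assumptions: it assumes the two KINgaroo files and the likfiles directory sit directly in the folder passed to KIN, and the helper name check_kingaroo_outputs is made up for illustration; it is not part of KIN or KINgaroo.

```python
from pathlib import Path


def check_kingaroo_outputs(folder="."):
    """Report whether the KINgaroo output files exist and contain data.

    Assumption: the files live directly in `folder`; adjust if your layout differs.
    """
    names = ["input_diffs_hmm.csv", "input_total_hmm.csv"]
    report = {}
    for name in names:
        path = Path(folder) / name
        if not path.is_file():
            report[name] = "missing"
        elif path.stat().st_size == 0:
            report[name] = "empty"
        else:
            # Count non-blank lines as a rough indication that data was written.
            with path.open() as handle:
                n_lines = sum(1 for line in handle if line.strip())
            report[name] = f"{n_lines} non-blank lines"
    # Count the per-pair likelihood files that KIN later tries to load.
    likdir = Path(folder) / "likfiles"
    n_lik = len(list(likdir.glob("*.csv"))) if likdir.is_dir() else 0
    report["likfiles"] = f"{n_lik} csv files"
    return report


if __name__ == "__main__":
    for name, status in check_kingaroo_outputs(".").items():
        print(f"{name}: {status}")
```

If either CSV is reported as missing or empty, the problem is likely upstream of KIN, in the KINgaroo inputs.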

Mvwestbury commented 8 months ago

Hi, I have the same problem. KINgaroo seems to run, but then I get the same error. Just for background: I have a non-human dataset, so I had some trouble converting the GenBank ID scaffold headers to numbers, but I am pretty sure I managed to figure it out. I have just been testing it on 2 ancient samples of ~0.2x coverage before using KIN on a larger dataset.
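For readers facing the same scaffold-renaming step, one possible approach is sketched below. It is an illustration only: the function name number_scaffolds and the example IDs are hypothetical, and the same mapping would have to be applied consistently to every input file.

```python
def number_scaffolds(names):
    """Map each distinct scaffold name to 1, 2, 3, ... in order of first appearance."""
    mapping = {}
    for name in names:
        if name not in mapping:
            mapping[name] = len(mapping) + 1
    return mapping


if __name__ == "__main__":
    # Placeholder IDs, not taken from any real assembly.
    headers = ["scaffoldA.1", "scaffoldB.1", "scaffoldA.1", "scaffoldC.1"]
    for original, number in number_scaffolds(headers).items():
        print(f"{original}\t{number}")
```

Keeping the printed table makes it possible to translate results back to the original scaffold names afterwards.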

Also, how did you infer the reference and alternative alleles? I am not sure I did it correctly. With only 2 low-coverage samples that seems difficult, so I used a population dataset including some modern individuals and called genotypes, then obtained the sites and alleles from that. Also, since these are just single sites, I was not sure whether I should add 1 to or subtract 1 from the SNP position to turn it into a region.
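On the +1/-1 question: in the standard BED convention the start coordinate is 0-based and the end is exclusive, so a SNP at 1-based position p becomes the interval (p - 1, p). The sketch below shows that conversion. Whether KINgaroo expects exactly this convention, and the five-column layout used in bed_line, are assumptions to verify against the README; both function names are made up for illustration.

```python
def snp_to_bed_interval(pos_1based):
    """Convert a 1-based SNP position to a 0-based, half-open BED interval."""
    if pos_1based < 1:
        raise ValueError("1-based positions start at 1")
    return pos_1based - 1, pos_1based


def bed_line(chrom, pos_1based, ref, alt):
    """Format one tab-separated line: chrom, start, end, ref, alt.

    Assumption: this column order is illustrative, not confirmed for KINgaroo.
    """
    start, end = snp_to_bed_interval(pos_1based)
    return "\t".join([str(chrom), str(start), str(end), ref, alt])


if __name__ == "__main__":
    print(bed_line(1, 100, "G", "A"))
```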

Anyway, if you have the time to check whether I have made some obvious mistake, or could help get it running, I have uploaded all the relevant files as a tar.gz file here: https://sid.erda.dk/share_redirect/E1Z4G3gSJO (note that the download does not keep the .tar.gz extension, so you may need to rename the file and add it).

I hope that was clear enough; if not, please let me know.

Thanks, Mick

batelz commented 7 months ago

Hi @DivyaratanPopli, thanks for your help! As you suspected, the problem was trying to run too few samples. I re-ran with 4 samples from the same population and it worked fine.

Thank you again, Batel