hringbauer / ancIBD

Detecting IBD within low coverage ancient DNA data. Development Repository for software package that contains code for manuscript.
GNU General Public License v3.0

Converting example vcf to HDF5: Error at interpolation step #13

Closed by dorozco1504 1 year ago

dorozco1504 commented 1 year ago

Hello,

I am attempting to convert the example VCF file from Dropbox to HDF5, but I am facing an error during the interpolation step. Do you have any insights on what might be causing this issue?

Thank you!

Print downsampling to 1240K...
Running bash command: 
bcftools view -Ov -o test/example_hazelton.ch20.1240k.vcf -T ./data/filters/snps_bcftools_ch20.csv -M2 -v snps ./data/vcf.raw/example_hazelton_chr20.vcf
Finished BCF tools filtering to target markers.
Converting to HDF5...
Finished conversion to hdf5!
Merging in LD Map..
Lifting LD Map from eigenstrat to HDF5...
Loaded 28940 variants.
Loaded 6 individuals.
Loaded 0 Chr.20 1240K SNPs.
Intersection 0 out of 28940 HDF5 SNPs
Interpolating 28940 variants.
Traceback (most recent call last):
  File "/data/users/dorozco/.conda/envs/gnomix/bin/ancIBD-run", line 8, in <module>
    sys.exit(main())
  File "/data/users/dorozco/.local/lib/python3.8/site-packages/ancIBD/run_ancIBD.py", line 63, in main
    vcf_to_1240K_hdf(in_vcf_path = args.vcf,
  File "/data/users/dorozco/.local/lib/python3.8/site-packages/ancIBD/IO/prepare_h5.py", line 135, in vcf_to_1240K_hdf
    merge_in_ld_map(path_h5=path_h5, 
  File "/data/users/dorozco/.local/lib/python3.8/site-packages/ancIBD/IO/h5_modify.py", line 154, in merge_in_ld_map
    rec_ch = np.interp(x1, x, y) 
  File "<__array_function__ internals>", line 180, in interp
  File "/data/users/dorozco/.conda/envs/gnomix/lib/python3.8/site-packages/numpy/lib/function_base.py", line 1594, in interp
    return interp_func(x, xp, fp, left, right)
ValueError: array of sample points is empty

hringbauer commented 1 year ago

The problem starts here (see your output):

Loaded 0 Chr.20 1240K SNPs.

That means the .snp file you provided for the map yielded no SNPs on chromosome 20, so there is nothing to interpolate from. What is your input, in particular for map_path?
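
To illustrate why an empty map leads to the traceback above: np.interp raises "array of sample points is empty" when its sample-point array has no entries. The sketch below is not ancIBD's code. `interpolate_map` is a hypothetical helper, and its argument names are assumptions chosen for illustration. It only shows how an explicit check could turn the NumPy error into a message that points at the map file.

```python
import numpy as np


def interpolate_map(target_pos, map_pos, map_cm):
    """Hypothetical helper (not part of ancIBD).

    Linearly interpolates genetic map values (map_cm) given at physical
    positions (map_pos) onto the positions in target_pos.
    """
    map_pos = np.asarray(map_pos, dtype=float)
    map_cm = np.asarray(map_cm, dtype=float)
    if map_pos.size == 0:
        # Without this check, np.interp raises
        # "ValueError: array of sample points is empty".
        raise ValueError(
            "No map SNPs found for this chromosome; "
            "check the .snp file passed as map_path."
        )
    return np.interp(target_pos, map_pos, map_cm)
```

With a non-empty map the helper behaves like a plain np.interp call. With an empty one it fails early with a message that names the likely cause.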

dorozco1504 commented 1 year ago

My input is ./data/map/v51.1_1240k.snp from the vignette directory. This is my full bash command:

ancIBD-run --vcf ./data/vcf.raw/example_hazelton_chr20.vcf --ch 20 --out test --marker_path ./data/filters/snps_bcftools_ch20.csv --map_path ./data/map/v51.1_1240k.snp --af_path ./data/afs/v51.1_1240k_AF_ch20.tsv --prefix example_hazelton

hyl317 commented 1 year ago

Hi Daniela, feel free to send your input files (everything passed to --vcf, --marker_path, --map_path, and --af_path) to my email (yilei_huang@eva.mpg.de) and I can run some checks. I just ran the command myself and it completed normally.

hyl317 commented 1 year ago

Hi Daniela, there is a problem with the v51.1_1240k.snp file you provided: it only has data up to chromosome 8. I am not sure why this happened. I just downloaded a fresh copy from Dropbox and it looks fine. Please re-download the file and try again.
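
A truncated download like this can be spotted before running the conversion by counting SNPs per chromosome in the .snp file. The sketch below assumes the standard EIGENSTRAT .snp layout (whitespace-separated columns: SNP ID, chromosome, genetic position, physical position, then alleles). `count_snps_per_chromosome` is a hypothetical helper and not part of ancIBD.

```python
from collections import Counter


def count_snps_per_chromosome(snp_path):
    """Hypothetical helper (not part of ancIBD).

    Counts lines per chromosome in an EIGENSTRAT .snp file, assuming the
    chromosome label is the second whitespace-separated column.
    """
    counts = Counter()
    with open(snp_path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) < 4:
                # Skip blank or incomplete lines, e.g. a cut-off last line.
                continue
            counts[fields[1]] += 1
    return counts


# Example usage (path taken from the command in this thread):
# counts = count_snps_per_chromosome("./data/map/v51.1_1240k.snp")
# if counts["20"] == 0:
#     print("No chromosome 20 SNPs found; the file may be incomplete.")
```

If the chromosome of interest is missing from the result, or the highest chromosome present is lower than expected, the file is probably incomplete and should be downloaded again.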

dorozco1504 commented 1 year ago

Thank you so much! It worked properly!

Transformation complete! Find new hdf5 file at: test/example_hazelton.ch20.h5