Atkinson-Lab / Tractor

Scripts for implementing the Tractor pipeline
MIT License

Getting errors after running extract_tracts.py #28

Closed by EfraMP 7 months ago

EfraMP commented 7 months ago

Hi,

I get the following error after running extract_tracts.py:

$ python3 Tractor/scripts/extract_tracts.py --vcf subset1/query_file_phased.vcf --msp subset1/query_results.msp --output-dir output/ --num-ancs 8
INFO (__main__ 91): # VCF File                    : subset1/query_file_phased.vcf
INFO (__main__ 92): # Prefix of output file names : query_file_phased
INFO (__main__ 93): # VCF File is compressed?     : False
INFO (__main__ 94): # Number of Ancestries in VCF : 8
INFO (__main__ 95): # Output Directory            : output/
INFO (__main__ 101): Creating output files for 8 ancestries
INFO (__main__ 116): Iterating through VCF file
Traceback (most recent call last):
  File "/path/Tractor/scripts/extract_tracts.py", line 240, in <module>
    extract_tracts(**vars(args))
  File "/path/Tractor/scripts/extract_tracts.py", line 170, in extract_tracts
    window = (ancs_entry[0], int(ancs_entry[1]), int(ancs_entry[2]))
ValueError: invalid literal for int() with base 10: '0.0'

Both the VCF and the MSP file are outputs of the LAI tool G-Nomix, run with the pre-trained model and 8 ancestries. The VCF looks complete, so I suspect the issue is with the MSP file, which starts with the following header and first data row:

#Subpopulation order/codes: EUR=0       EAS=1   NAT=2   AFR=3   SAS=4   AHG=5   OCE=6   WAS=7
#chm    spos    epos    sgpos   egpos   n snps  sample_1 sample_2 ... sample_n
        13273   779322  0.0     2.02544 696     5       5       5
...

Maybe the issue is with the EUR=0 tag, the empty chromosome column, or the 0.0 in the centimorgan position columns.

Any help will be appreciated. Thank you.

JasonTan-code commented 7 months ago

It seems the #chm column has no value indicating the chromosome number. Does this cause the problem?
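
For readers hitting the same traceback, here is a minimal sketch of why an empty chromosome field would produce exactly this error. It assumes the row is split on arbitrary whitespace (as Python's `str.split()` with no arguments does); that is an assumption about how the script tokenizes the line, not something confirmed in this thread. `parse_window` is a hypothetical helper that mirrors the failing line from the traceback, and the chromosome value `"1"` is made up for illustration.

```python
# First data row from the MSP excerpt above, with the chromosome field empty.
row_missing_chm = "\t13273\t779322\t0.0\t2.02544\t696\t5\t5\t5"
# Same row with a chromosome value filled in ("1" is a hypothetical value).
row_with_chm = "1\t13273\t779322\t0.0\t2.02544\t696\t5\t5\t5"


def parse_window(line):
    """Hypothetical stand-in for the failing line in the traceback."""
    # Assumption: fields are split on any whitespace, so an empty leading
    # field disappears and every later column shifts one position left.
    fields = line.split()
    return (fields[0], int(fields[1]), int(fields[2]))


print(parse_window(row_with_chm))  # chm, spos, epos parse as expected

try:
    parse_window(row_missing_chm)
except ValueError as err:
    # fields[2] is now sgpos ('0.0') instead of epos, so int() fails.
    print(err)
```

Under that assumption, the `0.0` in the error message is the sgpos value read where epos was expected, which matches the reported `ValueError`.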

EfraMP commented 7 months ago

> Seems like #chm does not have a number to indicate chromosome number; Does this cause the problem?

Ah yes, I forgot to update. Indeed, that was the issue.
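
The thread does not record how the MSP file was repaired. One possible approach, sketched below, is to fill the empty first column with the chromosome label before running extract_tracts.py. This assumes the file is tab-separated with an empty leading field on each data row and that all rows belong to a single chromosome; `fill_missing_chm` and the label `"1"` are hypothetical, and the label would need to match the CHROM value used in the VCF.

```python
def fill_missing_chm(lines, chrom):
    """Return MSP lines with an empty first column replaced by `chrom`.

    Header lines (starting with '#') and blank lines are passed through
    unchanged. Assumes tab-separated fields.
    """
    fixed = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            fixed.append(line)
            continue
        newline = "\n" if line.endswith("\n") else ""
        fields = line.rstrip("\n").split("\t")
        if fields[0].strip() == "":
            fields[0] = chrom  # fill in the missing chromosome label
        fixed.append("\t".join(fields) + newline)
    return fixed


# Usage on a small in-memory example (values taken from the excerpt above).
example = [
    "#chm\tspos\tepos\tsgpos\tegpos\tn snps\n",
    "\t13273\t779322\t0.0\t2.02544\t696\n",
]
for fixed_line in fill_missing_chm(example, "1"):
    print(fixed_line, end="")
```

Rows that already carry a chromosome value are left untouched, so the function is safe to run on a file that is only partly affected.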