ksamuk / pixy

Software for painlessly estimating average nucleotide diversity within and between populations
https://pixy.readthedocs.io/
MIT License

Error message - TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType #68

Closed by aliceatikesse 1 year ago

aliceatikesse commented 1 year ago

Hello! I am very new to pixy and not familiar with the VCF format, and I get various error messages when I try to run the command to obtain the pi value. I am originally working with a DArTseq genomic file and had to convert my pre-filtered data to VCF format, which I did in R with a function in the dartR package. Somehow, the conversion made it impossible to transfer the chromosome numbers, which means I had to assign "1" to all my SNPs.

Question 1: Is the chromosome number a big issue? Question 2: Is my VCF file format the source of the error message I get? Question 3: Is there a pixy tutorial available, other than the "pixy --help" section and the GUIDE webpage, where I can find detailed meanings of options such as "zarr_path", "window_size", etc.?

Thank you so much for your help,

Alice

(1) The command I used:

--vcf glchloro_vcf.vcf \
--populations glchloro5_popfile_pixy.txt \
--window_size 10000 \
--zarr_path \
--variant_filter_expression \
--invariant_filter_expression \
--chromosome "1"

The error message:

Traceback (most recent call last):
  File "/Users/aliceatikesse/opt/anaconda3/envs/pixy2/bin/pixy", line 11, in <module>
    sys.exit(main())
  File "/Users/aliceatikesse/opt/anaconda3/envs/pixy2/lib/python3.6/site-packages/pixy/main.py", line 72, in main
    if os.path.exists(args.zarr_path) is not True:
  File "/Users/aliceatikesse/opt/anaconda3/envs/pixy2/lib/python3.6/genericpath.py", line 19, in exists
    os.stat(path)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
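The last frames of the traceback show `os.path.exists(args.zarr_path)` receiving `None`, which means the parser held no value for `--zarr_path` when that check ran. The sketch below reproduces that situation with a stand-in `argparse` parser and adds a guard that fails with a readable message instead. The parser, the `require_path` helper, and the error wording are hypothetical illustrations, not pixy's actual code.

```python
import argparse
import os


def build_parser():
    # Hypothetical stand-in parser; pixy's real argument definitions may differ.
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument("--vcf")
    parser.add_argument("--zarr_path")  # optional, so it defaults to None
    return parser


def require_path(value, flag):
    """Return value if it names an existing path, else raise a clear error."""
    if value is None:
        raise ValueError(f"{flag} was not given a value")
    if not os.path.exists(value):
        raise ValueError(f"{flag} points to a missing path: {value}")
    return value


if __name__ == "__main__":
    # Only --vcf is supplied, so zarr_path keeps its default of None.
    args = build_parser().parse_args(["--vcf", "glchloro_vcf.vcf"])
    print(args.zarr_path)
    try:
        require_path(args.zarr_path, "--zarr_path")
    except ValueError as err:
        print(f"error: {err}")
```

Checking for `None` before touching the filesystem turns the opaque `TypeError` into a message that names the flag at fault.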

(2)VCF FILE:

fileformat=VCFv4.2

fileDate=20230108

source=PLINKv1.90

contig=

INFO=

FORMAT=

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT P2_02-01 P9_09-01 P10_10-01 P11_11-01 P2_02-02 P9_09-02 P10_10-02 P2_02-03 P9_09-03 P10_10-03 P11_11-03 P2_02-04 P10_10-04 P11_11-04 P2_02-05 P10_10-05 P11_11-05 P2_02-06 P9_09-06 P10_10-06 P11_11-06 P9_09-07 P10_10-07 P11_11-07 P9_09-08 P10_10-08 P11_11-08 P10_10-09 P11_11-09 P10_10-10 P11_11-10 P7_07-X-C01 P7_07-X-C02 P7_07-X-C03 P7_07-X-C04 P7_07-X-C05 P7_07-X-C06 P7_07-X-C07 P7_07-X-C08 P7_07-X-C09 P7_07-X-C10 P8_08-X-C01 P8_08-X-C02 P8_08-X-C03 P8_08-X-C04 P8_08-X-C05 P8_08-X-C06 P8_08-X-C07 P8_08-X-C08 P8_08-X-C09 P8_08-X-C10

1 5 100112682-5-G/A G A . . PR GT 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0
1 5 14538103-5-G/A G A . . PR GT 1/1 1/1 1/1 0/1 1/1 1/1 1/1 0/1 1/1 1/1 0/1 0/1 1/1 0/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1 1/1 1/1 0/1 1/1 1/1 1/1 0/0 1/1 1/1 1/1 1/1 0/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1
1 5 3335069-5-T/G T G . . PR GT 0/0 0/0 0/0 0/0 0/0 0/0 0/0 ./. 0/0 0/0 ./. ./. 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 ./. 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 ./. 0/0 0/0 0/0 0/0 0/0 0/0 0/0 1/1 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/0
1 5 3314305-5-C/T C T . . PR GT 0/1 0/0 0/0 0/1 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0
1 5 14533956-5-G/A G A . . PR GT 0/0 0/0 ./. 0/0 0/0 0/1 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 ./. 0/0 0/1 0/0 0/0 0/0 0/0 1/1 0/0 0/0 0/0 0/0 0/0 0/0 ./. 0/1 0/0 0/0 0/0 1/1 0/0 0/0 0/1 0/0 0/0 0/0 ./. 0/1 0/0 1/1 0/0 0/0 0/0 0/0 ./. ./.
1 5 3317257-5-G/A G A . . PR GT 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/1 0/0 0/0 0/0 0/1 1/1 0/1 0/1
1 5 14557305-5-A/G A G . . PR GT 0/0 0/0 0/1 0/0 0/0 0/0 0/0 ./. 0/0 0/0 0/0 ./. 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 1/1 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 ./. 0/1 0/0 1/1 1/1 0/0 0/0 0/1
1 5 3311492-5-T/C T C . . PR GT ./. 0/1 1/1 0/0 0/0 1/1 0/1 0/0 0/1 0/1 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/0 1/1 0/0 0/0 0/0 0/1 0/0 0/0 1/1 0/0 1/1 0/0 0/0 0/0 0/0 0/0 0/1 0/1 0/0 ./. 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 1/1 0/1
1 5 100238399-5-G/C G C . . PR GT 1/1 0/1 0/0 0/0 0/0 0/0 1/1 0/1 0/0 0/1 0/1 0/1 0/0 1/1 0/1 0/1 1/1 0/0 0/0 1/1 0/1 1/1 1/1 0/0 0/0 0/1 0/0 0/1 0/1 0/0 0/1 0/1 1/1 0/1 0/1 0/1 0/1 0/0 0/1 0/1 0/1 0/0 0/1 0/1 0/1 0/0 1/1 0/1 0/1 0/1 0/0
1 5 100265797-5-A/G A G . . PR GT 1/1 1/1 0/0 0/0 0/0 0/1 1/1 0/0 0/0 0/0 0/0 0/0 0/0 ./. 0/0 0/0 0/1 0/1 0/0 0/1 0/0 1/1 0/1 0/1 0/0 0/0 0/0 0/0 1/1 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 0/0 1/1 0/0 1/1 0/1 0/1

(3) POP FILE:

02-01 P2
09-01 P9
10-01 P10
11-01 P11
02-02 P2
09-02 P9
10-02 P10
02-03 P2
09-03 P9
10-03 P10
11-03 P11
02-04 P2
10-04 P10
11-04 P11
02-05 P2
10-05 P10
11-05 P11
02-06 P2
09-06 P9
10-06 P10
11-06 P11
09-07 P9
10-07 P10
11-07 P11
09-08 P9
10-08 P10
11-08 P11
10-09 P10
11-09 P11
10-10 P10
11-10 P11
07-X-C01 P7
07-X-C02 P7
07-X-C03 P7
07-X-C04 P7
07-X-C05 P7
07-X-C06 P7
07-X-C07 P7
07-X-C08 P7
07-X-C09 P7
07-X-C10 P7
08-X-C01 P8
08-X-C02 P8
08-X-C03 P8
08-X-C04 P8
08-X-C05 P8
08-X-C06 P8
08-X-C07 P8
08-X-C08 P8
08-X-C09 P8
08-X-C10 P8
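One thing that may be worth checking in the two files above: the VCF sample columns look like `P2_02-01`, while the populations file lists `02-01`. As far as I understand pixy's documentation, the first column of the populations file should give sample names exactly as they appear in the VCF header. A minimal sketch of that comparison, assuming whitespace-separated columns; the function names are made up for illustration.

```python
def vcf_samples(header_line):
    """Sample names are every column after the nine fixed VCF columns."""
    return header_line.split()[9:]


def popfile_samples(lines):
    """First column of each non-empty populations-file line."""
    return [line.split()[0] for line in lines if line.strip()]


def unmatched(header_line, pop_lines):
    """Populations-file samples that are absent from the VCF header."""
    in_vcf = set(vcf_samples(header_line))
    return [s for s in popfile_samples(pop_lines) if s not in in_vcf]


if __name__ == "__main__":
    # Shortened excerpts of the header and populations file from this report.
    header = "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT P2_02-01 P9_09-01 P10_10-01"
    pops = ["02-01 P2", "09-01 P9", "10-01 P10"]
    print(unmatched(header, pops))
```

For these excerpts all three names come back as unmatched, which suggests the populations file would need the `P2_`, `P9_`, ... prefixes (or the VCF would need renaming) before the two files line up.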

ksamuk commented 1 year ago

Hi There,

It looks like you may be referring to an older version of pixy -- all the filtration steps and use of zarr files were removed a while ago. So, for starters, I'd update to the latest version.

Secondly, it looks like you haven't compressed and indexed your vcf using bgzip and tabix. To do that, once htslib is installed, just type: bgzip [your.file.vcf], followed by tabix [your.file.vcf.gz].
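The two steps can also be scripted. The sketch below builds the same `bgzip` and `tabix` commands and can run them through `subprocess`; it assumes both htslib tools are on `PATH`, and the `-p vcf` preset is added here so that tabix treats the file as VCF. The helper names are hypothetical.

```python
import shutil
import subprocess


def compression_commands(vcf_path):
    """Commands that bgzip-compress a VCF and then tabix-index the result."""
    gz_path = vcf_path + ".gz"  # bgzip writes <input>.gz by default
    return [
        ["bgzip", vcf_path],
        ["tabix", "-p", "vcf", gz_path],
    ]


def compress_and_index(vcf_path):
    """Run both commands in order; check=True stops at the first failure."""
    for tool in ("bgzip", "tabix"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH; install htslib first")
    for cmd in compression_commands(vcf_path):
        subprocess.run(cmd, check=True)
    return vcf_path + ".gz"


if __name__ == "__main__":
    # Print the commands for the file named in the report without running them.
    for cmd in compression_commands("glchloro_vcf.vcf"):
        print(" ".join(cmd))
```

Note that `bgzip` replaces the input with the `.gz` file by default, so keep a copy of the uncompressed VCF if it is still needed.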

The chromosome issue will likely cause problems if there are sites with the same chromosome and position (this seems likely). Otherwise it shouldn't, but you'll need a way to translate the sites back to their correct coordinates for your later analyses.
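Whether any sites share a chromosome and position can be checked directly. A minimal sketch that counts repeated (CHROM, POS) pairs among the data lines of an uncompressed VCF; the function name is hypothetical, and the three records are abbreviated from the report (only the first genotype column is kept).

```python
from collections import Counter


def duplicated_sites(vcf_lines):
    """Return {(chrom, pos): count} for sites that occur more than once."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos = line.split()[:2]
        counts[(chrom, int(pos))] += 1
    return {site: n for site, n in counts.items() if n > 1}


if __name__ == "__main__":
    records = [
        "1 5 100112682-5-G/A G A . . PR GT 0/0",
        "1 5 14538103-5-G/A G A . . PR GT 1/1",
        "1 5 3335069-5-T/G T G . . PR GT 0/0",
    ]
    print(duplicated_sites(records))
```

All three excerpted records report chromosome 1, position 5, so they collapse onto a single site, which is the situation described above.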

Hope that helps!

Kieran