ksamuk / pixy

Software for painlessly estimating average nucleotide diversity within and between populations
https://pixy.readthedocs.io/
MIT License

IndexError: list index out of range #67

Closed jiangqiuqiuu closed 1 year ago

jiangqiuqiuu commented 1 year ago

Hi, I am trying to run pixy on my laptop with the command: `pixy --stats pi dxy fst --vcf populations.snps_sorted.vcf.gz --population sel_ariz.list --window_size 10000 --bypass_invariant_check yes`

The error msg was:

`[pixy] pixy 1.2.7.beta1
[pixy] See documentation at https://pixy.readthedocs.io/en/latest/

[pixy] Validating VCF and input parameters...
[pixy] Checking write access...OK
[pixy] Checking CPU configuration...OK
[pixy] Checking for invariant sites...WARNING
[pixy] EXTREME WARNING: --bypass_invariant_check is set to 'yes'. Note that a lack of invariant sites will result in incorrect estimates.
[pixy] Checking chromosome data...OK
[pixy] Checking intervals/sites...OK
[pixy] Checking sample data...OK
[pixy] All initial checks past!

[pixy] Preparing for calculation of summary statistics: pi, dxy, fst
[pixy] Using Weir and Cockerham (1984)'s estimator of FST.
[pixy] Data set contains 11 population(s), 2913 chromosome(s), and 100 sample(s)
[pixy] Window size: 10000 bp

[pixy] Started calculations at 23:34:26 on 2022-12-09
[pixy] Using 1 out of 8 available CPU cores

[pixy] Processing chromosome/contig maker-Contig264|quiver|pilon-snap-gene-0.22-mRNA-1...
/bin/sh: quiver: command not found
/bin/sh: pilon-snap-gene-0.22-mRNA-1: command not found
IndexError: list index out of range`

I am not sure what's going on here as I ran pixy successfully last time...

Here is a part of the VCF file I was trying to input:
maker-Contig264|quiver|pilon-snap-gene-0.22-mRNA-1 861 7:111:- C T . PASS NS=80;AF=0.288 GT:DP:AD:GQ:GL ./.:.:.:.:. ./.:.:.:.:. ./.:.:.:.:. ./.:.:.:.:. ./.:.:.:.:. 0/1:69:45,24:40:-46.13,0,-105.31 1/1:117:0,117:40:-326.73,-34.94,0 ./.:.:.:.:. 0/0:161:161,0:40:0,-48.61,-450.69 0/0:22:22,0:40:-0,-6.87,-61.98 1/1:20:0,20:40:-55.47,-5.81,-0 0/0:26:25,1:40:-0,-5.27,-67.57 0/1:111:49,62:40:-139.78,0,-103.89 0/0:2:2,0:13:-0.06,-0.92,-6.1 0/1:43:20,23:40:-51.15,0,-43.21 1/1:11:0,11:38:-30.31,-3.11,-0 0/1:21:3,18:29:-43.77,-0,-2.28 0/1:70:30,40:40:-90.58,0,-63.07 1/1:78:0,78:40:-217.67,-23.23,0 0/1:94:43,51:40:-114.13,0,-92.21 ./.:.:.:.:. 0/1:5:2,3:40:-6.63,-0,-4.29 1/1:17:0,17:40:-47.09,-4.91,-0 1/1:18:0,18:40:-49.88,-5.21,-0 0/1:23:5,18:40:-43.17,-0,-7.27 ./.:.:.:.:. 0/1:26:8,18:40:-42.27,0,-14.76 0/1:15:6,9:40:-20.4,-0,-12.47 1/1:83:0,83:40:-231.65,-24.73,0 1/1:13:0,12:40:-33.1,-3.41,-0 0/1:143:55,87:40:-200.39,0,-111.35 1/1:124:0,124:40:-346.31,-37.05,0 1/1:221:0,220:40:-614.77,-65.88,0 ./.:.:.:.:. 0/1:103:57,45:40:-94.95,0,-128.96 0/0:19:19,0:40:-0,-5.97,-53.59 0/0:20:20,0:40:-0,-6.27,-56.38 0/0:15:15,0:40:-0,-4.76,-42.4 0/0:29:29,0:40:-0,-8.97,-81.55 0/0:14:14,0:40:-0,-4.46,-39.61 0/0:88:87,0:40:-0,-26.39,-243.75 0/0:127:127,0:40:0,-38.4,-355.61 0/0:20:19,0:40:-0,-5.97,-53.59 0/0:14:13,1:22:-0.01,-1.68,-34.02 0/0:11:11,0:40:-0,-3.56,-31.22 ./.:.:.:.:. ./.:.:.:.:. ./.:.:.:.:. ./.:.:.:.:. ./.:.:.:.:. 0/0:67:67,0:40:0,-20.38,-187.82 0/0:199:199,0:40:-0,-60.03,-556.95 0/0:54:54,0:40:-0,-16.48,-151.46 0/0:25:25,0:40:-0,-7.77,-70.37 0/0:307:307,0:40:0,-92.46,-858.97 0/0:616:611,4:40:0,-173.78,-1697.91 0/0:89:88,0:40:0,-26.69,-246.54 0/0:189:189,0:40:0,-57.02,-528.99 0/0:58:57,0:40:0,-17.38,-159.85 0/0:107:106,0:40:0,-32.09,-296.88 0/0:2:2,0:13:-0.06,-0.92,-6.1 0/0:140:140,0:40:0,-42.31,-391.96 0/1:73:34,38:40:-84.38,0,-73.65 ./.:.:.:.:. ./.:.:.:.:. 0/1:137:66,71:40:-157.14,0,-143.62 ./.:.:.:.:. ./.:.:.:.:. 
0/0:4:4,0:20:-0.01,-1.48,-11.66 0/0:29:29,0:40:-0,-8.97,-81.55 0/0:97:96,0:40:0,-29.09,-268.92 0/0:55:55,0:40:0,-16.78,-154.26 0/0:131:131,0:40:0,-39.6,-366.79 0/0:60:60,0:40:0,-18.28,-168.24 0/0:357:354,0:40:0,-106.58,-990.41 0/0:258:255,0:40:0,-76.84,-713.56 0/0:57:57,0:40:0,-17.38,-159.85 0/0:11:11,0:40:-0,-3.56,-31.22 0/0:263:263,0:40:0,-79.25,-735.93 0/0:167:167,0:40:0,-50.42,-467.47 ./.:.:.:.:. 0/0:56:56,0:40:0,-17.08,-157.06 0/0:45:44,0:40:-0,-13.47,-123.5 0/0:32:32,0:40:-0,-9.87,-89.94 0/0:27:27,0:40:-0,-8.37,-75.96 ./.:.:.:.:. 0/0:55:54,0:40:0,-16.48,-151.46 0/0:48:48,0:40:-0,-14.68,-134.69 0/0:36:36,0:40:-0,-11.07,-101.13 0/0:17:17,0:40:-0,-5.36,-47.99 0/1:16:3,13:40:-31.29,-0,-3.78 0/1:79:62,17:40:-23.55,0,-149.85 0/0:4:4,0:20:-0.01,-1.48,-11.66 0/0:7:7,0:30:-0,-2.36,-20.03 1/1:25:2,23:23:-58.28,-1.73,-0.01 1/1:26:0,26:40:-72.25,-7.61,-0 1/1:154:0,154:40:-430.2,-46.06,0 1/1:295:0,295:40:-824.5,-88.4,-0 1/1:100:0,100:40:-279.19,-29.84,0 0/0:131:131,0:40:0,-39.6,-366.79

And here is the pop file: sel_ariz.txt

Thank you for your help! Qiuyu

ksamuk commented 1 year ago

Hi Qiuyu,

Not sure what has changed, but my guess is that pixy is having trouble with your chromosome column (the contig names contain | characters). That might not be supported; I'll have to look into that.
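
The `/bin/sh: quiver: command not found` lines in the log are consistent with the contig name reaching a shell without quoting, so that each `|` is read as a pipe. That pixy builds a shell command from the contig name is an assumption based on the log only, not something confirmed in this thread. A minimal sketch of the effect, using the contig name from the log:

```shell
# Contig name copied from the log in this issue.
contig='maker-Contig264|quiver|pilon-snap-gene-0.22-mRNA-1'

# Unquoted inside a command string: sh treats each "|" as a pipe and tries to
# run "quiver" and "pilon-snap-gene-0.22-mRNA-1" as commands.
# "|| true" keeps the script going even though the pipeline fails.
unquoted=$(sh -c "echo $contig" 2>&1 || true)

# Single-quoted inside the command string: the name stays one literal argument.
quoted=$(sh -c "echo '$contig'" 2>&1)

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

The unquoted call should report that `quiver` and `pilon-snap-gene-0.22-mRNA-1` cannot be found (the exact wording depends on the shell), while the quoted call prints the name unchanged.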

Also, just FYI, the --bypass_invariant_check option should basically never be used for real data (hence the warning). The calculations will be way off in most cases.
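
One rough way to see whether a VCF carries invariant sites at all is to count records whose ALT column is `.`. This is only a sketch: `count_site_types` is a hypothetical helper, the sample records are made up, and it assumes invariant sites are written with ALT set to `.` (some callers encode them differently, and pixy's own check may work another way).

```shell
# Hypothetical helper: tally records by whether ALT (column 5) is ".".
count_site_types() {
  awk -F'\t' '
    /^#/ { next }                       # skip header lines
    $5 == "." { invariant++; next }     # no alternate allele recorded
    { variant++ }                       # anything else counts as variant
    END { printf "invariant=%d variant=%d\n", invariant, variant }
  '
}

# Made-up three-record example (not taken from the VCF in this issue).
counts=$(printf '#CHROM\tPOS\tID\tREF\tALT\nchr1\t10\t.\tA\t.\nchr1\t11\t.\tC\tT\nchr1\t12\t.\tG\tA\n' | count_site_types)
printf '%s\n' "$counts"
```

On the real file this could be run as `zcat populations.snps_sorted.vcf.gz | count_site_types`; a result of `invariant=0` would suggest the file contains variant sites only.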

Cheers,

Kieran

jiangqiuqiuu commented 1 year ago

Hi Kieran,

Thank you for your response! Could you suggest any solutions to this problem here? I appreciate your help!

Cheers Qiuyu

ksamuk commented 1 year ago

Hi Qiuyu,

I won't be able to change anything on my end until after Christmas, but you could try stripping the "|quiver|pilon-snap-gene-0.22-mRNA-1" from each of your contig identifiers.

Perhaps something like:

zcat your.vcf.gz | sed -E 's/^(##contig=<ID=)?([^|[:space:]]+)\|[^[:space:],>]*/\1\2/' | bgzip -c > your.new.vcf.gz
tabix your.new.vcf.gz
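
A possible variation, in case several contigs share the part of the name before the first `|` (not known from this thread): replace each `|` with `_` instead of dropping the suffix, so the names stay distinct. The sketch below only touches the contig name at the start of a record or inside a `##contig=<ID=...>` header line, so a `|` elsewhere (for example in phased genotypes such as `0|1`) is left alone. `rename_contigs` is a hypothetical helper and the `length=5000` header value is made up for the example.

```shell
# Hypothetical helper: rewrite "|" to "_" in contig names only.
# The loop (:a ... ta) repeats the substitution until no "|" is left
# in the leading contig name.
rename_contigs() {
  sed -E -e ':a' -e 's/^(##contig=<ID=)?([^|[:space:],>]*)\|/\1\2_/' -e 'ta'
}

# Contig name from this issue; the surrounding records are illustrative.
name='maker-Contig264|quiver|pilon-snap-gene-0.22-mRNA-1'
sample=$(printf '##contig=<ID=%s,length=5000>\n#CHROM\tPOS\tID\tREF\tALT\n%s\t861\t7:111:-\tC\tT\n' "$name" "$name")

renamed=$(printf '%s\n' "$sample" | rename_contigs)
printf '%s\n' "$renamed"
```

On the real file this would slot into the same pipeline, for example `zcat your.vcf.gz | rename_contigs | bgzip -c > your.new.vcf.gz`, followed by `tabix your.new.vcf.gz`.
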

ksamuk commented 1 year ago

Hi Qiuyu, did you find a solution for your issue? If not, feel free to reopen this issue.