Closed littletiger311 closed 2 years ago
Hello littletiger311,
The genotypes of the variants in your VCF file are diploid, so you need to change the parameter "-p 4" to "-p 2".
Is this still the Z. japonica data? If so, it is an allotetraploid, so it is fine to run it in diploid mode, which is what we did in the paper. If you want to run it in tetraploid mode, you need to re-call the genotypes as tetraploid using the allele depths. One option is the R package "updog" by Gerard et al.
Best, Chenxi
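To check which ploidy a VCF's genotype calls actually use before choosing "-p", one can count the alleles in each sample's GT subfield. The following is a minimal sketch, not part of polyGembler: the function names are hypothetical, the example line is shortened from the VCF excerpt posted in this thread, and it assumes GT is the first FORMAT key and that columns are whitespace-separated (a real VCF is tab-delimited, which this split also handles).

```python
import re
from collections import Counter

def gt_ploidy(sample_field):
    """Count the alleles in the GT subfield of one sample column (hypothetical helper)."""
    gt = sample_field.split(":", 1)[0]      # GT is assumed to be the first FORMAT key
    return len(re.split(r"[/|]", gt))       # alleles are separated by '/' or '|'

def ploidy_counts(vcf_line):
    """Tally the ploidy implied by every sample genotype on one VCF data line."""
    cols = vcf_line.split()
    return Counter(gt_ploidy(s) for s in cols[9:])  # samples start at column 10

# Shortened from the first record posted in this thread (four samples only).
line = ("ctg000000 78710 48:85:+ C G . PASS NS=46;AF=0.304 GT:DP:AD:GQ:GL "
        "0/0:57:57,0:40:0.00,-17.76,-244.33 "
        "1/1:119:0,119:40:-508.89,-36.03,-0.00 "
        "0/1:87:41,46:40:-170.07,0.00,-149.06 ./.")

print(ploidy_counts(line))  # Counter({2: 4})
```

Every genotype here has two alleles, which matches the diploid calls described above; tetraploid calls would look like "0/0/0/1" and be counted as 4.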
Dear Dr. Zhou, Thank you for your timely reply. Yes, it is still the allotetraploid Z. japonica, so I changed the ploidy to diploid (-p 2). The ArrayIndexOutOfBoundsException error persists (see below). Is there anything wrong with my VCF file, which was produced by Stacks 2.59?
The running log also said "[WARN ] 2021-10-15 11:25:18.536 [main] Executor - No DP field in VCF file. Filtering by SNP allele depth disabled.". I don't quite understand this, as the VCF file provides "GT:DP:AD:GQ:GL" for every sample.
Thank you for your time and instruction.
[INFO ] 2021-10-15 11:25:18.531 [main] Executor - STEP 01 prepare data
[WARN ] 2021-10-15 11:25:18.536 [main] Executor - No DP field in VCF file. Filtering by SNP allele depth disabled.
[INFO ] 2021-10-15 11:25:23.408 [main] Executor - #Filtered by Multi-allelic: 0
[INFO ] 2021-10-15 11:25:23.408 [main] Executor - #Filtered by Quality : 0
[INFO ] 2021-10-15 11:25:23.408 [main] Executor - #Filtered by MAF : 4344
[INFO ] 2021-10-15 11:25:23.408 [main] Executor - #Filtered by Allele Depth : 0
[INFO ] 2021-10-15 11:25:23.409 [main] Executor - #Filtered by Missing : 0
[INFO ] 2021-10-15 11:25:23.409 [main] Executor - ---------------------------
[INFO ] 2021-10-15 11:25:23.409 [main] Executor - #Filtered Total : 4344
[INFO ] 2021-10-15 11:25:30.648 [main] Executor - STEP 02 infer single-point haplotypes
[INFO ] 2021-10-15 11:25:30.670 [pool-2-thread-1] Haplotyper - Random seed - 1899593212805299
[INFO ] 2021-10-15 11:25:30.670 [pool-2-thread-4] Haplotyper - Random seed - 1899593212805299
[INFO ] 2021-10-15 11:25:30.670 [pool-2-thread-7] Haplotyper - Random seed - 1899593212805299
[INFO ] 2021-10-15 11:25:30.670 [pool-2-thread-2] Haplotyper - Random seed - 1899593212805299
[INFO ] 2021-10-15 11:25:30.670 [pool-2-thread-6] Haplotyper - Random seed - 1899593212805299
LYoutGT3/out1.zip
LYoutGT3/out1.zip
LYoutGT3/out1.zip
LYoutGT3/out1.zip
[INFO ] 2021-10-15 11:25:30.670 [pool-2-thread-3] Haplotyper - Random seed - 1899593212805299
LYoutGT3/out1.zip
LYoutGT3/out1.zip
[INFO ] 2021-10-15 11:25:30.670 [pool-2-thread-5] Haplotyper - Random seed - 1899593212805299
[INFO ] 2021-10-15 11:25:30.670 [pool-2-thread-8] Haplotyper - Random seed - 1899593212805299
LYoutGT3/out1.zip
LYoutGT3/out1.zip
[INFO ] 2021-10-15 11:25:30.828 [pool-2-thread-7] Haplotyper - => STAGE I. training emission model with no transitions allowed.
[INFO ] 2021-10-15 11:25:30.829 [pool-2-thread-6] Haplotyper - => STAGE I. training emission model with no transitions allowed.
[INFO ] 2021-10-15 11:25:30.831 [pool-2-thread-3] Haplotyper - => STAGE I. training emission model with no transitions allowed.
[INFO ] 2021-10-15 11:25:30.834 [pool-2-thread-2] Haplotyper - => STAGE I. training emission model with no transitions allowed.
[INFO ] 2021-10-15 11:25:30.837 [pool-2-thread-1] Haplotyper - => STAGE I. training emission model with no transitions allowed.
Exception in thread "pool-2-thread-7" Exception in thread "pool-2-thread-2" Exception in thread "pool-2-thread-3" Exception in thread "pool-2-thread-6" Exception in thread "pool-2-thread-1" java.lang.ArrayIndexOutOfBoundsException: 1
[INFO ] 2021-10-15 11:25:30.838 [pool-2-thread-5] Haplotyper - => STAGE I. training emission model with no transitions allowed.
at cz1.hmm.model.EmissionModel.makeObUnits(EmissionModel.java:238)
Hello littletiger311,
The warning message "No DP field in VCF file" appears because the program expects a "DP" field in the INFO column. It is harmless: the filtering by total allele depth is simply skipped.
As for the error, I am not sure what went wrong. I am happy to take a look if you could share the output files, either in this thread or by email at chnx.zhou@gmail.com.
Chenxi
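The difference between a per-sample DP (a FORMAT key) and a site-level DP (an INFO key) can be checked directly on a record. The sketch below is only illustrative: the function names are hypothetical, the example line is shortened from the VCF excerpt posted in this thread, and summing the per-sample DP values is an assumption about how a site-level depth could be derived. Nothing in this thread confirms that polyGembler would accept a DP value added this way.

```python
def dp_locations(vcf_line):
    """Report whether DP is present as an INFO key and as a FORMAT key."""
    cols = vcf_line.split()
    info_keys = {kv.split("=", 1)[0] for kv in cols[7].split(";")}  # column 8 = INFO
    format_keys = cols[8].split(":")                                # column 9 = FORMAT
    return {"info": "DP" in info_keys, "format": "DP" in format_keys}

def total_depth(vcf_line):
    """Sum the per-sample DP values; missing calls such as './.' are skipped."""
    cols = vcf_line.split()
    dp_index = cols[8].split(":").index("DP")
    total = 0
    for sample in cols[9:]:
        parts = sample.split(":")
        if len(parts) > dp_index and parts[dp_index].isdigit():
            total += int(parts[dp_index])
    return total

# Shortened from the first record posted in this thread (four samples only).
line = ("ctg000000 78710 48:85:+ C G . PASS NS=46;AF=0.304 GT:DP:AD:GQ:GL "
        "0/0:57:57,0:40:0.00,-17.76,-244.33 "
        "1/1:119:0,119:40:-508.89,-36.03,-0.00 "
        "0/0:84:84,0:40:0.00,-25.89,-359.88 ./.")

print(dp_locations(line))  # {'info': False, 'format': True}
print(total_depth(line))   # 260
```

For this record the INFO column holds only NS and AF, which is consistent with the warning, while each called sample still carries its own DP.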
Dr. Chen, I have sent you the files by email. Thank you very much for your help. LT
Dear Dr. Zhou,
I got the error "java.lang.ArrayIndexOutOfBoundsException: 1" when I used gembler or haplotyper with the '-G' option. The error message is below:
[INFO ] 2021-10-14 16:35:13.373 [main] Haplotyper - Random seed - 1831775920809742
dataprepare/populations.snps.recode.zip
[INFO ] 2021-10-14 16:35:13.412 [main] Haplotyper - => STAGE I. training emission model with no transitions allowed.
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
at cz1.hmm.model.EmissionModel.makeObUnits(EmissionModel.java:238)
at cz1.hmm.model.EmissionModel.initialise(EmissionModel.java:193)
at cz1.hmm.model.EmissionModel.<init>(EmissionModel.java:92)
at cz1.hmm.model.ModelTrainer.<init>(ModelTrainer.java:26)
at cz1.hmm.tools.Haplotyper.run(Haplotyper.java:249)
at cz1.appl.PolyGembler.main(PolyGembler.java:50)
I would really appreciate any help with this.
The command running gembler is "java -jar dist/polyGembler-1.1-jar-with-dependencies.jar gembler -i populationout/populations.snps.vcf -l 10 -f 0.1 -m 0.5 -G -a scaf/PA.fasta -o PAoutGT -p 4 -parent Sample1:Sample2 -t 8"
The command running haplotyper is "java -jar dist/polyGembler-1.1-jar-with-dependencies.jar haplotyper -i dataprepare/populations.snps.recode.zip -o haplo -G -c ctg000020 -ex test --parent Sample1:Sample2"
I have also attached a few lines of the VCF file below.
fileformat=VCFv4.2
fileDate=20211001
source="Stacks v2.59"
INFO=
INFO=
INFO=
INFO=
FORMAT=
FORMAT=
FORMAT=
FORMAT=
FORMAT=
FORMAT=
INFO=
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1 Sample46 Sample42 Sample2 Sample22 Sample44 Sample56 Sample19 Sample39 Sample57 Sample15 Sample8 Sample55 Sample21 Sample52 Sample36 Sample61 Sample24 Sample18 Sample60 Sample14 Sample5 Sample58 Sample4 Sample35 Sample10 Sample28 Sample20
ctg000000 78710 48:85:+ C G . PASS NS=46;AF=0.304 GT:DP:AD:GQ:GL 0/0:57:57,0:40:0.00,-17.76,-244.33 1/1:119:0,119:40:-508.89,-36.03,-0.00 0/0:84:84,0:40:0.00,-25.89,-359.88 1/1:14:0,14:40:-59.53,-4.43,-0.00 0/0:34:34,0:40:-0.00,-10.84,-145.90 0/1:87:41,46:40:-170.07,0.00,-149.06 0/0:45:45,0:40:-0.00,-14.15,-192.97 0/0:29:29,0:40:-0.00,-9.33,-124.50 0/0:8:8,0:37:-0.00,-3.01,-34.63 0/0:15:15,0:40:-0.00,-5.12,-64.58 0/0:23:23,0:40:-0.00,-7.53,-98.82 0/0:38:38,0:40:-0.00,-12.04,-163.02 0/0:1:1,0:13:-0.05,-0.96,-4.72 1/1:26:0,26:40:-110.88,-8.04,-0.00 ./. 1/1:20:0,20:40:-85.20,-6.23,-0.00 0/0:22:22,0:40:-0.00,-7.23,-94.54 0/0:19:19,0:40:-0.00,-6.32,-81.70 1/1:19:1,18:22:-72.37,-1.66,-0.01 0/0:17:17,0:40:-0.00,-5.72,-73.14 1/1:23:0,23:40:-98.04,-7.14,-0.00 0/1:34:17,17:40:-61.92,0.00,-62.31 ./. 0/0:17:17,0:40:-0.00,-5.72,-73.14 0/0:7:7,0:33:-0.00,-2.71,-30.35 0/0:13:13,0:40:-0.00,-4.52,-56.02 0/0:9:9,0:40:-0.00,-3.31,-38.91 1/1:11:0,11:40:-46.69,-3.53,-0.00 1/1:11:0,11:40:-46.69,-3.53,-0.00 1/1:14:0,14:40:-59.53,-4.43,-0.00 0/0:9:9,0:40:-0.00,-3.31,-38.91 ./. 0/1:22:8,14:40:-52.69,0.00,-27.40 ./. ./. 0/0:7:7,0:33:-0.00,-2.71,-30.35 ./. 0/1:12:10,2:40:-4.34,-0.00,-38.97 0/0:8:8,0:37:-0.00,-3.01,-34.63 ./. ./. ./. ./. 0/0:8:8,0:37:-0.00,-3.01,-34.63 0/0:5:5,0:27:-0.00,-2.11,-21.79 0/1:19:7,12:40:-45.03,0.00,-24.02 0/0:8:8,0:37:-0.00,-3.01,-34.63 ./. 1/1:8:0,8:32:-33.85,-2.62,-0.00 ./. 0/0:1:1,0:13:-0.05,-0.96,-4.72 ./. 0/0:3:3,0:20:-0.01,-1.52,-13.24 0/1:8:7,1:18:-1.29,-0.02,-27.36 1/1:3:0,3:16:-12.48,-1.15,-0.03 0/0:10:10,0:40:-0.00,-3.61,-43.19 ./. 0/0:1:1,0:13:-0.05,-0.96,-4.72 0/0:3:3,0:20:-0.01,-1.52,-13.24 0/0:1:1,0:13:-0.05,-0.96,-4.72 ./. ./.
ctg000000 78711 48:86:+ A G . PASS NS=46;AF=0.304 GT:DP:AD:GQ:GL 0/0:57:57,0:40:0.00,-17.76,-244.33 1/1:119:0,119:40:-508.89,-36.03,-0.00 0/0:84:84,0:40:0.00,-25.89,-359.88 1/1:14:0,14:40:-59.53,-4.43,-0.00 0/0:34:34,0:40:-0.00,-10.84,-145.90 0/1:87:41,46:40:-170.07,0.00,-149.06 0/0:45:45,0:40:-0.00,-14.15,-192.97 0/0:29:29,0:40:-0.00,-9.33,-124.50 0/0:8:8,0:37:-0.00,-3.01,-34.63 0/0:15:15,0:40:-0.00,-5.12,-64.58 0/0:23:23,0:40:-0.00,-7.53,-98.82 0/0:38:38,0:40:-0.00,-12.04,-163.02 0/0:1:1,0:13:-0.05,-0.96,-4.72 1/1:26:0,26:40:-110.88,-8.04,-0.00 ./. 1/1:20:0,20:40:-85.20,-6.23,-0.00 0/0:22:22,0:40:-0.00,-7.23,-94.54 0/0:19:19,0:40:-0.00,-6.32,-81.70 1/1:19:1,18:22:-72.37,-1.66,-0.01 0/0:17:17,0:40:-0.00,-5.72,-73.14 1/1:23:0,23:40:-98.04,-7.14,-0.00 0/1:34:17,17:40:-61.92,0.00,-62.31 ./. 0/0:17:17,0:40:-0.00,-5.72,-73.14 0/0:7:7,0:33:-0.00,-2.71,-30.35 0/0:13:13,0:40:-0.00,-4.52,-56.02 0/0:9:9,0:40:-0.00,-3.31,-38.91 1/1:11:0,11:40:-46.69,-3.53,-0.00 1/1:11:0,11:40:-46.69,-3.53,-0.00 1/1:14:0,14:40:-59.53,-4.43,-0.00 0/0:9:9,0:40:-0.00,-3.31,-38.91 ./. 0/1:22:8,14:40:-52.69,0.00,-27.40 ./. ./. 0/0:7:7,0:33:-0.00,-2.71,-30.35 ./. 0/1:12:10,2:40:-4.34,-0.00,-38.97 0/0:8:8,0:37:-0.00,-3.01,-34.63 ./. ./. ./. ./. 0/0:8:8,0:37:-0.00,-3.01,-34.63 0/0:5:5,0:27:-0.00,-2.11,-21.79 0/1:19:7,12:40:-45.03,0.00,-24.02 0/0:8:8,0:37:-0.00,-3.01,-34.63 ./. 1/1:8:0,8:32:-33.85,-2.62,-0.00 ./. 0/0:1:1,0:13:-0.05,-0.96,-4.72 ./. 0/0:3:3,0:20:-0.01,-1.52,-13.24 0/1:8:7,1:18:-1.29,-0.02,-27.36 1/1:3:0,3:16:-12.48,-1.15,-0.03 0/0:10:10,0:40:-0.00,-3.61,-43.19 ./. 0/0:1:1,0:13:-0.05,-0.96,-4.72 0/0:3:3,0:20:-0.01,-1.52,-13.24 0/0:1:1,0:13:-0.05,-0.96,-4.72 ./. ./.
ctg000000 123700 84:94:- T C . PASS NS=43;AF=0.105 GT:DP:AD:GQ:GL 0/0:56:56,0:40:0.00,-17.32,-228.08 0/1:51:40,11:40:-28.65,0.00,-147.58 0/0:43:43,0:40:-0.00,-13.41,-175.53 ./. 0/0:13:13,0:40:-0.00,-4.38,-54.24 ./. 0/0:32:32,0:40:-0.00,-10.10,-131.05 0/0:15:15,0:40:-0.00,-4.98,-62.32 0/0:15:14,1:14:-0.05,-0.99,-54.28 0/0:23:23,0:40:-0.00,-7.39,-94.67 0/0:13:13,0:40:-0.00,-4.38,-54.24 0/0:26:26,0:40:-0.00,-8.29,-106.80 0/0:2:2,0:15:-0.04,-1.11,-9.80 0/1:14:8,6:40:-19.57,0.00,-29.34 ./. 0/1:7:3,4:40:-13.60,-0.00,-11.23 0/0:7:7,0:32:-0.00,-2.58,-29.98 0/0:4:4,0:22:-0.01,-1.68,-17.86 ./. 0/0:21:21,0:40:-0.00,-6.79,-86.58 0/1:7:4,3:40:-9.55,-0.00,-15.28 0/1:9:7,2:40:-4.91,-0.00,-26.80 0/0:4:4,0:22:-0.01,-1.68,-17.86 0/0:6:6,0:29:-0.00,-2.28,-25.94 0/0:12:12,0:40:-0.00,-4.08,-50.19 0/0:5:5,0:25:-0.00,-1.98,-21.90 0/0:12:12,0:40:-0.00,-4.08,-50.19 0/1:4:2,2:40:-6.41,-0.00,-8.09 0/0:3:3,0:19:-0.02,-1.39,-13.83 0/1:3:1,2:40:-6.71,-0.00,-4.35 0/0:5:5,0:25:-0.00,-1.98,-21.90 ./. 0/0:6:6,0:29:-0.00,-2.28,-25.94 0/0:3:3,0:19:-0.02,-1.39,-13.83 ./. 0/0:2:2,0:15:-0.04,-1.11,-9.80 ./. 0/1:6:4,2:40:-5.81,-0.00,-15.58 0/0:3:3,0:19:-0.02,-1.39,-13.83 0/0:6:6,0:29:-0.00,-2.28,-25.94 ./. ./. ./. ./. 0/0:4:4,0:22:-0.01,-1.68,-17.86 0/1:11:8,3:40:-8.35,-0.00,-30.24 0/0:10:10,0:40:-0.00,-3.48,-42.11 ./. ./. ./. 0/0:4:4,0:22:-0.01,-1.68,-17.86 ./. 0/0:2:2,0:15:-0.04,-1.11,-9.80 0/0:4:4,0:22:-0.01,-1.68,-17.86 ./. 0/0:3:3,0:19:-0.02,-1.39,-13.83 ./. ./. 0/0:5:5,0:25:-0.00,-1.98,-21.90 0/0:2:2,0:15:-0.04,-1.11,-9.80 ./. 0/0:2:2,0:15:-0.04,-1.11,-9.80