Closed evyamor closed 7 months ago
Hi @evyamor,
Yes, you can run PyPGx with your own VCF, such as one created by DRAGEN.
You only need 'depth-of-coverage' and 'control-statistics' files if you want to perform structural variation detection for complex genes such as CYP2D6.
I will list below some resources that might be useful to you:
Let me know if you have further questions.
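As a rough illustration of the reply above, the sketch below assembles a `pypgx run-ngs-pipeline` command with only a VCF, and again with the two extra files for a gene such as CYP2D6. The helper `build_pypgx_command`, the file names, and the output directory names are hypothetical. The option names (`--variants`, `--depth-of-coverage`, `--control-statistics`) follow the PyPGx tutorial as I understand it, so confirm them with `pypgx run-ngs-pipeline -h` before relying on them. The helper only builds the argument list; it does not run PyPGx.

```python
from typing import List, Optional


def build_pypgx_command(
    gene: str,
    output_dir: str,
    variants: str,
    depth_of_coverage: Optional[str] = None,
    control_statistics: Optional[str] = None,
) -> List[str]:
    """Assemble (but do not execute) a `pypgx run-ngs-pipeline` command line.

    Hypothetical helper for illustration; option names are assumed from the
    PyPGx tutorial and should be checked against the installed version.
    """
    # Assumption of this sketch: the two SV-related inputs are given together.
    if (depth_of_coverage is None) != (control_statistics is None):
        raise ValueError(
            "provide both depth-of-coverage and control-statistics, or neither"
        )

    cmd = ["pypgx", "run-ngs-pipeline", gene, output_dir, "--variants", variants]

    # Only needed when structural variation detection is wanted.
    if depth_of_coverage is not None and control_statistics is not None:
        cmd += [
            "--depth-of-coverage", depth_of_coverage,
            "--control-statistics", control_statistics,
        ]
    return cmd


if __name__ == "__main__":
    # VCF only (no structural variation detection).
    print(" ".join(build_pypgx_command("CYP3A5", "CYP3A5-pipeline", "dragen.vcf.gz")))

    # VCF plus the two extra files, for a complex gene such as CYP2D6.
    print(" ".join(build_pypgx_command(
        "CYP2D6",
        "CYP2D6-pipeline",
        "dragen.vcf.gz",
        depth_of_coverage="depth-of-coverage.zip",
        control_statistics="control-statistics.zip",
    )))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.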
Thank you for the fast response!
I will test PyPGx on VCFs created via DRAGEN (and separately on BAM files I create). I wanted to know whether a GVCF is needed, or whether a VCF containing only the SNVs/indels is enough.
Also, as stated here: 'users had been instructed to create input VCF file from BAM files on their own using a variant caller of their choice (e.g. GATK4, bcftools, DRAGEN, DeepVariant). This can raise several potential problems such as decreased reproducibility of PyPGx results and users providing incorrectly formatted VCF to PyPGx.' I wanted to know whether PyPGx has a specific structure requirement for the VCF that would reduce these potential problems. (Here I refer mostly to the fields under the FORMAT column: DB, VDB, SGB, ...) For example:
chr1 47261780 . T C 235.707 PASS DP=1519;VDB=0.326231;SGB=-40.8249;RPBZ=0.398415;MQBZ=-15.2308;MQSBZ=0.889911;BQBZ=-10.8447;SCBZ=0.105486;FS=0;MQ0F=0;AC=120;AN=140;DP4=205,13,1153,122;MQ=49 GT:PL:AD 0/0:0,57,255:19,0 0/1:204,0,172:10,11 1/1:240,45,0:0,15 0/1:147,0,165:11,10 1/1:246,54,0:0,18 1/1:255,66,0:0,22 0/1:134,0,182:15,9 1/1:255,87,0:0,29 1/1:231,54,0:0,18 1/1:224,57,0:0,19 1/1:248,36,0:0,12 0/1:120,0,176:9,7 1/1:255,54,0:0,18 1/1:198,75,0:0,25 0/1:168,0,127:7,12 1/1:255,57,0:0,19 0/1:105,0,183:9,5 1/1:223,51,0:0,17 1/1:255,63,0:0,21 1/1:255,80,0:1,31 1/1:189,60,0:0,20 0/1:148,0,214:10,12 1/1:191,45,0:0,15 0/1:98,0,175:15,6 1/1:255,69,0:0,23 0/1:158,0,100:7,16 0/1:161,0,114:5,12 0/1:255,0,138:9,14 1/1:247,81,0:0,27 1/1:227,57,0:0,19 1/1:255,63,0:0,21 1/1:255,69,0:0,23 1/1:255,75,0:0,25 1/1:255,84,0:0,28 0/1:202,0,190:14,15 1/1:224,69,0:0,23 1/1:255,66,0:0,22 1/1:255,63,0:0,21 1/1:255,39,0:0,13 1/1:255,51,0:0,17 1/1:255,72,0:0,24 1/1:231,63,0:0,21 1/1:255,78,0:0,26 1/1:255,75,0:0,25 0/1:145,0,227:16,10 1/1:200,72,0:0,24 1/1:205,72,0:0,24 1/1:207,66,0:0,22 0/1:109,0,172:12,8 0/1:174,0,135:9,14 1/1:255,66,0:0,22 1/1:255,45,0:0,15 1/1:249,54,0:0,18 1/1:255,54,0:0,18 1/1:230,72,0:0,24 1/1:247,63,0:0,21 1/1:211,81,0:0,27 1/1:255,54,0:0,18 0/1:167,0,193:13,13 1/1:255,72,0:0,24 0/1:76,0,159:11,4 1/1:236,66,0:0,22 1/1:255,78,0:0,26 1/1:218,45,0:0,15 1/1:255,60,0:0,20 1/1:255,66,0:0,22 1/1:202,78,0:0,26 1/1:255,81,0:0,27 0/1:181,0,176:16,11 1/1:231,33,0:0,11
I have also completed the tutorial with the given files successfully (I only needed to re-index the files, which was easy). Thank you for the comprehensive tutorial as well. Much appreciated. Have a wonderful day, Evyatar
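To make the question about VCF structure concrete, here is a minimal sketch that lists the INFO keys and FORMAT keys of a single VCF data line and reports which FORMAT keys from a chosen set are missing. The record in the example is a shortened copy of the one posted above (fewer INFO entries, two samples). The function names are hypothetical, and the choice of `GT` and `AD` as the keys to check is an assumption made for illustration, not a documented PyPGx requirement.

```python
from typing import Dict, List, Sequence


def summarize_vcf_record(line: str) -> Dict[str, List[str]]:
    """Return the INFO keys and FORMAT keys of one VCF data line."""
    # VCF columns are tab-delimited; splitting on any whitespace also copes
    # with records pasted into a forum post, where tabs became spaces.
    fields = line.split()
    if len(fields) < 8:
        raise ValueError("a VCF data line needs at least 8 columns")

    # Column 8 (index 7) is INFO: semicolon-separated KEY=VALUE pairs or flags.
    info_keys = [
        item.split("=", 1)[0] for item in fields[7].split(";") if item != "."
    ]
    # Column 9 (index 8) is FORMAT: colon-separated keys, present only when
    # the file carries genotype columns.
    format_keys = fields[8].split(":") if len(fields) > 8 else []
    return {"INFO": info_keys, "FORMAT": format_keys}


def missing_format_keys(line: str, required: Sequence[str] = ("GT", "AD")) -> List[str]:
    """List the keys in `required` that are absent from the FORMAT column.

    The default `required` set is an assumption for this example only.
    """
    present = set(summarize_vcf_record(line)["FORMAT"])
    return [key for key in required if key not in present]


if __name__ == "__main__":
    # Shortened version of the record quoted in the post.
    record = (
        "chr1 47261780 . T C 235.707 PASS "
        "DP=1519;VDB=0.326231;SGB=-40.8249;MQ=49 "
        "GT:PL:AD 0/0:0,57,255:19,0 0/1:204,0,172:10,11"
    )
    print(summarize_vcf_record(record))
    print(missing_format_keys(record))
```

In this record, `DP`, `VDB`, and `SGB` sit in the INFO column, while the FORMAT column is `GT:PL:AD`.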
Hi, Can I use PyPGx with a VCF created outside of it, such as one generated by DRAGEN? Do I still need the 'depth-of-coverage' and 'control-statistics' files? Specifically, I'm curious about running the pipeline directly on a non-PyPGx VCF. Are additional files necessary, or can I work with just the VCF? Also, are there specific requirements for the VCF itself (its structure)? Thanks for your help, Evyatar