grenaud / glactools

command-line tools for the management of genotype likelihoods and allele counts
http://grenaud.github.io/glactools/
GNU General Public License v3.0

VCF to GRoSS conversion #23

Closed mariels closed 3 years ago

mariels commented 3 years ago

Hello,

Thank you for a very useful program. I have been trying to generate a GRoSS input file using glactools and the following tutorial: https://github.com/FerRacimo/GRoSS/blob/master/VCFtoGRoSS.md

I have tried the following command:

glactools vcfm2acf --onlyGT --fai ref.fasta.fai input.vcf - | glactools meld -f panel3.txt - | glactools acf2gross --noroot - | gzip > output.gross.gz

And I received the following error messages:

Error: GlacParser tried to read 4 bytes but got 0
Error: GlacParser tried to read 4 bytes but got 0

I have also tried this and received the same error messages:

tabix -h input.vcf | glactools vcfm2acf --onlyGT --fai ref.fasta.fai - | glactools meld -f panel3.txt - | glactools acf2gross --noroot - | gzip > output.gross.gz

I've tried the following command to convert the VCF file to an ACF file and it worked fine:

glactools vcfm2acf --onlyGT --fai ref.fasta.fai input.vcf > output.acf.gz

The file looks as follows:

glactools view -h check.acf.gz | head

Warning: No EOF marker, likely due to an I/O error

chr coord REF,ALT root anc SP1060 KW IR23 IR24 IR30 IR33 IR34 IR36 S10 S12 S27 S31 S37 S38 S39 S40 S41 S42 S43 S53 S54 S56 14Tt001 14Tt004 14Tt006 14Tt021 14Tt022 14Tt023 14Tt025 7Tt156 7Tt161 7Tt182 7Tt193 7Tt270 7Tt278 7Tt282 7Tt284 7Tt287 7Tt350

MRVK01001050.1 1309 G,A 0,0:0 0,0:0 0,0:0 2,0:0 0,2:0 0,2:0 1,1:0 2,0:0 2,0:0 2,0:0 2,0:0 1,1:0 1,1:0 2,0:0 1,1:0 2,0:0 2,0:0 1,1:0 2,0:0 2,0:0 0,0:0 2,0:0 1,1:0 2,0:0 2,0:0 0,0:0 2,0:0 2,0:0 2,0:0 2,0:0 2,0:0 0,0:0 2,0:0 0,0:0 0,2:0 1,1:0 2,0:0 2,0:0 2,0:0 1,1:0 2,0:0

The VCF file is as follows for the same SNP:

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SP1060 KW IR23 IR24 IR30 IR33 IR34 IR36 S10 S12 S27 S31 S37 S38 S39 S40 S41 S42 S43 S53 S54 S56 14Tt001 14Tt004 14Tt006 14Tt021 14Tt022 14Tt023 14Tt025 7Tt156 7Tt161 7Tt182 7Tt193 7Tt270 7Tt278 7Tt282 7Tt284 7Tt287 7Tt350

MRVK01001050.1 1309 . G A 999 . DP=803;VDB=0.353011;SGB=99.994;RPB=0.358603;MQB=1;MQSB=1;BQB=0.917396;MQ0F=0;ICB=0.411676;HOB=0.0799516;AC=14;AN=68;DP4=299,323,78,72;MQ=60 GT:PL:DP:GP:GQ ./.:0,0,0:0:0,0,0:0 0/0:0,27,242:9:0,30,254:30 1/1:255,54,0:18:242,44,0:44 1/1:255,33,0:11:242,23,0:23 0/1:105,0,224:11:101,0,233:99 0/0:0,54,255:18:0,57,267:57 0/0:0,36,255:12:0,39,267:39 0/0:0,57,255:19:0,60,267:60 0/0:0,69,255:23:0,72,267:72 0/1:194,0,201:15:190,0,210:99 0/1:196,0,255:18:192,0,264:99 0/0:0,33,255:11:0,36,267:36 0/1:50,0,178:8:46,0,187:46 0/0:0,48,255:16:0,51,267:51 0/0:0,33,255:11:0,36,267:36 0/1:233,0,255:19:229,0,264:99 0/0:0,48,255:16:0,51,267:51 0/0:0,36,255:12:0,39,267:39 ./.:229,27,0:9:216,17,0:17 0/0:0,42,255:14:0,45,267:45 0/1:121,0,175:11:117,0,184:99 0/0:0,54,255:18:0,57,267:57 0/0:0,48,255:16:0,51,267:51 ./.:0,9,255:23:0,12,267:12 0/0:0,45,255:15:0,48,267:48 0/0:0,42,255:14:0,45,267:45 0/0:0,51,255:17:0,54,267:54 0/0:0,45,255:15:0,48,267:48 0/0:0,60,255:20:0,63,267:63 ./.:255,20,0:14:242,11,0:11 0/0:0,36,237:12:0,39,249:39 ./.:241,0,8:11:237,0,17:17 1/1:255,39,0:13:242,29,0:29 0/1:163,0,207:14:159,0,216:99 0/0:0,51,255:17:0,54,267:54 0/0:0,33,255:11:0,36,267:36 0/0:0,66,255:22:0,69,267:69 0/1:242,0,178:16:238,0,187:99 0/0:0,33,255:11:0,36,267:36

The first sample is ancient and has more missing genotypes than the others.

Could you please help? Thank you very much,

Marie
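As a reading aid for the two records above: comparing the first few samples suggests that, with --onlyGT, each diploid genotype becomes a "ref count,alt count" pair (./. gives 0,0; 0/0 gives 2,0; 0/1 gives 1,1; 1/1 gives 0,2). The sketch below only restates that observed mapping in shell; the function name gt_to_counts is hypothetical and this is not glactools code.

```shell
# Hypothetical helper: map a VCF GT value to the "ref,alt" count pair
# seen in the ACF record above. Only the four genotypes that occur in
# the posted record are handled; anything else is reported as unhandled.
gt_to_counts() {
    case $1 in
        ./.) echo "0,0" ;;   # missing genotype: no alleles counted
        0/0) echo "2,0" ;;   # homozygous reference
        0/1) echo "1,1" ;;   # heterozygous
        1/1) echo "0,2" ;;   # homozygous alternative
        *)   echo "unhandled genotype: $1" >&2; return 1 ;;
    esac
}

# First five sample fields of the posted VCF record (SP1060, KW, IR23, IR24, IR30).
for field in './.:0,0,0:0:0,0,0:0' \
             '0/0:0,27,242:9:0,30,254:30' \
             '1/1:255,54,0:18:242,44,0:44' \
             '1/1:255,33,0:11:242,23,0:23' \
             '0/1:105,0,224:11:101,0,233:99'; do
    gt=${field%%:*}          # keep only the GT subfield (text before the first ":")
    printf '%s -> %s\n' "$gt" "$(gt_to_counts "$gt")"
done
```

The printed pairs can be compared against the SP1060, KW, IR23, IR24 and IR30 columns of the ACF line.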

grenaud commented 3 years ago

Dear Marie,

Maybe I am missing something, but "glactools vcfm2acf --onlyGT --fai ref.fasta.fai input.vcf -" cannot be executed on its own: the "-" means /dev/stdin, which comes from the tabix output.
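To illustrate the point about "-": many command-line tools treat a "-" argument as "read from standard input", so a stage whose upstream pipe delivers nothing reaches end-of-file immediately. The sketch below uses cat as a stand-in for glactools (an assumption made so that the example runs anywhere; it does not call glactools or reproduce its parser). A reader that gets zero bytes from its pipe is consistent with, though not proof of, a message such as "tried to read 4 bytes but got 0".

```shell
# cat follows the "-" = standard input convention, so it stands in for a
# downstream glactools stage in this sketch.

# Upstream stage writes 4 bytes ("ACF" plus a newline): the reader sees all of them.
with_data=$(printf 'ACF\n' | cat - | wc -c | tr -d ' ')

# Upstream stage writes nothing (":" is the shell no-op): the reader gets 0 bytes.
without_data=$(: | cat - | wc -c | tr -d ' ')

echo "bytes read with upstream data:    $with_data"
echo "bytes read without upstream data: $without_data"
```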

mariels commented 3 years ago

Dear Gabriel,

Thanks for your answer. Sorry, I got confused. I ran the commands as follows and they worked fine:

glactools vcfm2acf --onlyGT --fai ref.fasta.fai input.vcf > check.acf.gz

glactools meld -f panel3.txt check.acf.gz | glactools acf2gross --noroot - | gzip > output.gross.gz

I will close the issue. Best wishes,

Marie
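The two-step workflow that resolved this issue can be wrapped in a small function that checks its inputs and the intermediate ACF file before running the second step. This is a minimal sketch: the function name vcf_to_gross, its argument order, and the way the intermediate file name is derived are all hypothetical, while the glactools commands and flags are the ones quoted in this thread.

```shell
# Hypothetical wrapper around the two commands from this thread.
# Arguments: VCF file, .fai index, panel file for meld, output .gross.gz path.
# Note: plain sh has no function-local variables, so these names are global.
vcf_to_gross() {
    vcf=$1; fai=$2; panel=$3; out=$4
    acf="${out%.gross.gz}.acf.gz"    # intermediate file next to the output (assumption)

    # Stop before calling glactools if any input is missing or empty.
    for f in "$vcf" "$fai" "$panel"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            return 1
        fi
    done

    # Step 1: VCF to ACF, written to a file instead of a pipe.
    glactools vcfm2acf --onlyGT --fai "$fai" "$vcf" > "$acf" || return 1
    if [ ! -s "$acf" ]; then
        echo "step 1 produced an empty ACF file: $acf" >&2
        return 1
    fi

    # Step 2: meld populations, convert to GRoSS, compress.
    # The function's exit status is that of the last stage (gzip).
    glactools meld -f "$panel" "$acf" | glactools acf2gross --noroot - | gzip > "$out"
}
```

With the file names used above, a call would look like: vcf_to_gross input.vcf ref.fasta.fai panel3.txt output.gross.gz. Writing the intermediate file to disk makes it easier to see which step failed than a single long pipeline does.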