mfumagalli / ngsTools

Programs to analyse NGS data for population genetics purposes
GNU General Public License v3.0

cannot convert vcf file to beagle binary format using the -vcf-gl option #32

Closed marce-sarrias closed 1 year ago

marce-sarrias commented 1 year ago

Hello,

I did SNP calling with GATK (HaplotypeCaller, CombineGVCFs and GenotypeGVCFs) and, after applying several filters, I have a final VCF file with all SNPs.

Now I would like to use this matrix of snps to study population structure (do a PCA with ngsCovar) and individual admixture proportions (with NGSadmix). It is my understanding that in order to do this I need to convert my vcf file into beagle binary format, which I could do in angsd using the -vcf-gl option.

I am able to run the command (see below) without getting errors but it seems it is not generating the correct output file.

ANGSD=/path to angsd

REF=genome.fasta.fai

$ANGSD -vcf-gl ./snps.vcf.gz -GL 0 -out ./snps.beagle -fai $REF -doGlf 2 -doMajorMinor 1 -doMaf 1 -nind 43

This is what I see when I check the run log file:

BAD SITE chr24:21289418. return code:-1 while fetching GL tag rec->rid:23
BAD SITE chr24:21289424. return code:-1 while fetching GL tag rec->rid:23
BAD SITE chr24:21289425. return code:-1 while fetching GL tag rec->rid:23
BAD SITE chr24:21289431. return code:-1 while fetching GL tag rec->rid:23
BAD SITE chr24:21289433. return code:-1 while fetching GL tag rec->rid:23
BAD SITE chr24:21289470. return code:-1 while fetching GL tag rec->rid:23
BAD SITE chr24:21289475. return code:-1 while fetching GL tag rec->rid:23
BAD SITE chr24:21289525. return code:-1 while fetching GL tag rec->rid:23
-> Done reading data waiting for calculations to finish
-> Done waiting for threads
-> Output filenames:
->"./snps.beagle.arg"
->"./snps.beagle.beagle.gz"
->"./snps.beagle.mafs.gz"
-> Mon May 15 15:43:02 2023
-> Arguments and parameters for all analysis are located in .arg file
-> Total number of sites analyzed: 0
-> Number of sites retained after filtering: 0
[ALL done] cpu-time used = 134.35 sec
[ALL done] walltime used = 135.00 sec

I'm not really sure what the problem is and how to fix it. Any advice on this?

Thank you!!

mfumagalli commented 1 year ago

Hello,

This looks like a problem with ANGSD rather than ngsTools, so I can't provide much support. My guess is that ANGSD is failing to parse your input files.

Also, we recommend using PCAngsd for PCA-related analysis as it has been shown to outperform ngsCovar under many scenarios.

Matteo
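
A note on the parsing point: the log reports a failure "while fetching GL tag", so one thing worth checking is whether the VCF carries a GL entry in its FORMAT column at all. This is an assumption about the cause, not something confirmed in this thread; GATK genotype records commonly carry PL rather than GL. The sketch below uses a hypothetical helper name, `vcf_format_tags`, and lists the distinct FORMAT keys found in a plain or gzip/bgzip-compressed VCF. It scans the whole file, so it can take a while on large inputs.

```shell
# Hypothetical helper: print the distinct FORMAT keys used in a VCF.
# Accepts plain-text or gzip/bgzip-compressed input.
vcf_format_tags() {
    # gzip -cdf decompresses compressed input and passes plain text through unchanged
    gzip -cdf < "$1" |
        awk -F'\t' '
            # skip header lines; column 9 of a data line is the FORMAT field
            !/^#/ && NF >= 9 {
                n = split($9, keys, ":")
                for (i = 1; i <= n; i++) seen[keys[i]] = 1
            }
            END { for (k in seen) print k }
        ' |
        sort
}
```

For example, `vcf_format_tags ./snps.vcf.gz` prints one key per line. If PL appears in the output but GL does not, an input option that looks up the GL tag would have nothing to read at any site, which would be consistent with zero sites being analyzed.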


marce-sarrias commented 1 year ago

I understand, thank you. Is it possible to run ngsTools using a VCF file? If not, could you advise on how to convert my VCF file into a beagle binary file? Thanks again!

mfumagalli commented 1 year ago

It should be possible to use VCF files in ngsTools as long as they are properly converted to a suitable file containing genotype likelihoods, but I don't have much experience doing that and thus I cannot provide much support.
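
On what "genotype likelihoods" means for a VCF: in the VCF specification, GL holds log10-scaled genotype likelihoods, while PL holds the same quantities phred-scaled and rounded to integers, so GL is approximately -PL/10. The sketch below illustrates only that arithmetic for one sample's PL string; `pl_to_gl` is a hypothetical helper name, and this is not a full VCF converter.

```shell
# Hypothetical helper: convert a comma-separated PL string to approximate GL values.
# Relation used: GL = -PL / 10 (log10 likelihood from a phred-scaled value).
pl_to_gl() {
    printf '%s\n' "$1" |
        awk -F',' '{
            out = ""
            for (i = 1; i <= NF; i++) {
                # keep missing values as "."; (0 - PL) / 10 avoids printing "-0"
                v = ($i == "." ? "." : sprintf("%g", (0 - $i) / 10))
                out = out (i > 1 ? "," : "") v
            }
            print out
        }'
}
```

For example, `pl_to_gl 120,0,130` prints `-12,0,-13`. ANGSD's documentation also lists a `-vcf-pl` input option alongside `-vcf-gl`; whether an installed version supports it is something to confirm from the program's own help output.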


mfumagalli commented 1 year ago

Closing, as this is related to ANGSD rather than ngsTools.