bodkan closed this issue 6 years ago
Is it too crazy and dirty to just write a small function that calls bcftools to do all the parsing and formatting? The query subcommand is quite powerful, so it should be possible to generate the "geno" and "snp" files with it alone.
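A rough sketch of what such a wrapper could look like, written in Python purely for illustration (the package itself is in R). It assumes bcftools is on the PATH, that the VCF contains only biallelic sites with diploid genotypes, and that the EIGENSTRAT genotype character is the number of reference-allele copies (9 for missing). All function names and the chrom_pos SNP identifier are hypothetical choices, not anything taken from the package.

```python
import subprocess

# Format string for `bcftools query`: one line per site with CHROM, POS,
# REF and ALT, followed by one tab-separated GT field per sample.
# bcftools expands the \t and \n escapes itself, hence the raw string.
QUERY_FORMAT = r"%CHROM\t%POS\t%REF\t%ALT[\t%GT]\n"


def bcftools_query_cmd(vcf_path):
    """Build the bcftools command line (hypothetical helper)."""
    return ["bcftools", "query", "-f", QUERY_FORMAT, vcf_path]


def gt_to_eigenstrat(gt):
    """Map a diploid VCF GT string to an EIGENSTRAT genotype character.

    Returns the number of reference-allele copies ("0", "1" or "2").
    Anything else (missing, haploid, multiallelic) becomes "9".
    """
    alleles = gt.replace("|", "/").split("/")
    if len(alleles) != 2 or any(a not in ("0", "1") for a in alleles):
        return "9"
    return str(alleles.count("0"))


def query_line_to_eigenstrat(line):
    """Turn one line of query output into a (snp_line, geno_line) pair."""
    chrom, pos, ref, alt, *gts = line.rstrip("\n").split("\t")
    # snp columns: ID, chromosome, genetic position, physical position,
    # reference allele, alternative allele. The ID scheme is an assumption
    # and the genetic position is left at 0.0.
    snp = "\t".join([f"{chrom}_{pos}", chrom, "0.0", pos, ref, alt])
    geno = "".join(gt_to_eigenstrat(gt) for gt in gts)
    return snp, geno


def vcf_to_geno_snp(vcf_path, prefix):
    """Run bcftools and write <prefix>.snp and <prefix>.geno."""
    proc = subprocess.run(bcftools_query_cmd(vcf_path), check=True,
                          capture_output=True, text=True)
    with open(prefix + ".snp", "w") as snp_out, \
            open(prefix + ".geno", "w") as geno_out:
        for line in proc.stdout.splitlines():
            snp, geno = query_line_to_eigenstrat(line)
            snp_out.write(snp + "\n")
            geno_out.write(geno + "\n")
```

The conversion step is kept separate from the bcftools call so it can be checked on a hand-written line without bcftools installed. The "ind" file would still need the sample names, which bcftools query -l can list.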
I think the best approach is to use readr's or data.table's fast tab-separated file processing functionality. That way, I won't have to rely on bcftools.
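To make the tab-separated approach concrete, here is a small sketch of the parsing logic with no external tools, in Python for illustration only (in the package this would be done with readr or data.table). It assumes an already filtered VCF with biallelic sites and diploid genotypes, codes each genotype as the number of reference-allele copies (9 for missing), and writes "U" for sex and the sample name as the population label in the "ind" lines. Those choices and all function names are assumptions for the example.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def parse_vcf(lines):
    """Return (samples, snp_rows, geno_rows) from an iterable of VCF lines."""
    samples, snp_rows, geno_rows = [], [], []
    for line in lines:
        # Skip meta-information lines and blank lines.
        if line.startswith("##") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            # Sample names start after the nine fixed VCF columns.
            samples = fields[9:]
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        # Locate GT within the colon-separated FORMAT column.
        gt_idx = fields[8].split(":").index("GT")
        codes = []
        for sample_field in fields[9:]:
            gt = sample_field.split(":")[gt_idx]
            alleles = gt.replace("|", "/").split("/")
            ok = len(alleles) == 2 and all(a in ("0", "1") for a in alleles)
            # Number of reference-allele copies, or 9 if not a clean
            # diploid biallelic call.
            codes.append(str(alleles.count("0")) if ok else "9")
        # SNP ID scheme (chrom_pos) and genetic position 0.0 are assumptions.
        snp_rows.append((f"{chrom}_{pos}", chrom, "0.0", pos, ref, alt))
        geno_rows.append("".join(codes))
    return samples, snp_rows, geno_rows


def ind_lines(samples):
    """One 'ind' line per sample: name, sex (unknown), population label."""
    return [f"{s}\tU\t{s}" for s in samples]
```

A call such as parse_vcf(open_vcf("input.vcf.gz")) (with a hypothetical file name) returns everything needed to write the three EIGENSTRAT files, using all samples in the order they appear in the VCF header.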
Closed by 9291b694e42af384bfba0b618ebc11b77d1b6556
There is no standardised way to convert VCF data into the EIGENSTRAT format in the ADMIXTOOLS package. It would be helpful to have that.
Is there a reasonably fast and efficient VCF parsing library for R that I could use?
The least I could do is load the genotype matrix and use all samples. This would rely on the user providing an already filtered VCF file, but I don't think that is a problem. This package should only do the things related to ADMIXTOOLS and nothing else.