bodkan / admixr

An R package for reproducible and automated ADMIXTOOLS analyses
https://bodkan.net/admixr

Import from VCF into EIGENSTRAT #6

Closed bodkan closed 6 years ago

bodkan commented 7 years ago

There is no standardised way to convert VCF data into the EIGENSTRAT format used by the ADMIXTOOLS package. It would be helpful to have one.

Is there a reasonably fast and efficient VCF parsing library for R that I could use?

The least I could do is load the genotype matrix and use all samples. This would rely on the user providing an already filtered VCF file, but I don't think that is a problem. This package should only do things related to ADMIXTOOLS and nothing else.
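To make the "load the genotype matrix and use all samples" idea concrete, here is a minimal sketch in Python (not the package's R code, and not what was eventually committed). It assumes an uncompressed, already filtered VCF with biallelic sites and diploid calls. The function names `gt_to_code`, `load_genotype_matrix` and `ind_lines` are hypothetical. EIGENSTRAT "geno" rows hold one character per sample: the number of reference-allele copies (0, 1 or 2), or 9 for a missing genotype.

```python
from typing import Iterable, List, Tuple


def gt_to_code(gt: str) -> str:
    """Map a VCF GT string such as '0/1' or '1|1' to an EIGENSTRAT code."""
    alleles = gt.replace("|", "/").split("/")
    # Assumption: anything that is not a diploid 0/1 call (missing, haploid,
    # multiallelic) is written out as missing (9).
    if len(alleles) != 2 or any(a not in ("0", "1") for a in alleles):
        return "9"
    return str(alleles.count("0"))  # copies of the reference allele


def load_genotype_matrix(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (sample names, one 'geno' row per site) using all samples."""
    samples: List[str] = []
    rows: List[str] = []
    for line in lines:
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information headers
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns start after FORMAT
            continue
        # GT is the first colon-separated subfield of each sample column
        rows.append("".join(gt_to_code(f.split(":")[0]) for f in fields[9:]))
    return samples, rows


def ind_lines(samples: List[str]) -> List[str]:
    """Build 'ind' rows: sample ID, sex (U = unknown), population label.

    Assumption: each sample is used as its own population label.
    """
    return [f"{s}\tU\t{s}" for s in samples]
```

Passing the lines of an open file handle to `load_genotype_matrix` gives the rows for the "geno" file, and `ind_lines` gives matching rows for the "ind" file. All filtering is left to the user, as proposed above.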

bodkan commented 7 years ago

Is it too crazy and dirty to just write a small function that calls out to bcftools to do all the parsing and formatting? The query subcommand is quite powerful, so it should be possible to generate the "geno" and "snp" files with it alone.
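A rough sketch of what such a wrapper could look like, written in Python rather than R purely for illustration. It assumes `bcftools` is on the PATH and that the VCF contains only biallelic, diploid sites. The function names and the fallback `chrom_pos` SNP ID are hypothetical choices, not anything from the package. The `-f` format string uses standard `bcftools query` fields, with `[\t%GT]` expanding once per sample.

```python
import subprocess
from typing import List, Tuple

# bcftools interprets the \t and \n escapes itself, hence the raw string.
QUERY_FORMAT = r"%CHROM\t%POS\t%ID\t%REF\t%ALT[\t%GT]\n"


def bcftools_query_cmd(vcf_path: str) -> List[str]:
    """Build the bcftools command that prints one tab-separated line per site."""
    return ["bcftools", "query", "-f", QUERY_FORMAT, vcf_path]


def query_line_to_eigenstrat(line: str) -> Tuple[str, str]:
    """Turn one line of query output into a ('snp' row, 'geno' row) pair."""
    chrom, pos, snp_id, ref, alt, *gts = line.rstrip("\n").split("\t")
    codes = []
    for gt in gts:
        alleles = gt.replace("|", "/").split("/")
        ok = len(alleles) == 2 and all(a in ("0", "1") for a in alleles)
        # EIGENSTRAT: number of reference-allele copies, 9 for missing
        codes.append(str(alleles.count("0")) if ok else "9")
    if snp_id == ".":
        snp_id = f"{chrom}_{pos}"  # assumption: synthesise an ID when absent
    # snp columns: ID, chromosome, genetic position, physical position, ref, alt
    snp = "\t".join([snp_id, chrom, "0.0", pos, ref, alt])
    return snp, "".join(codes)


def vcf_to_eigenstrat(vcf_path: str, prefix: str) -> None:
    """Stream bcftools output into <prefix>.snp and <prefix>.geno."""
    proc = subprocess.Popen(
        bcftools_query_cmd(vcf_path), stdout=subprocess.PIPE, text=True
    )
    with open(prefix + ".snp", "w") as snp_out, \
            open(prefix + ".geno", "w") as geno_out:
        for line in proc.stdout:
            snp, geno = query_line_to_eigenstrat(line)
            snp_out.write(snp + "\n")
            geno_out.write(geno + "\n")
    if proc.wait() != 0:
        raise RuntimeError("bcftools query failed")
```

The obvious downside is the external dependency: anyone using the function would need a working bcftools installation.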

bodkan commented 7 years ago

I think the best approach is to use readr's or data.table's fast tab-separated file processing functionality.

That way, I won't have to rely on:

bodkan commented 6 years ago

Closed by 9291b694e42af384bfba0b618ebc11b77d1b6556