aehrc / VariantSpark

machine learning for genomic variants
http://bioinformatics.csiro.au/variantspark
Other
141 stars 45 forks

How to use vcf file? #216

Closed cmorris2945 closed 2 years ago

cmorris2945 commented 2 years ago

I am trying to use a .vcf file in Databricks, but I don't see anything in the README on how to use the VCF file.

How do you use the vcf file specifically?

rocreguant commented 2 years ago

We use Hail to load and manipulate the data. To pass the data to VariantSpark, we transform it into a genotype matrix using mt.GT.n_alt_alleles().
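
For readers unfamiliar with the term, here is a rough sketch of what such a genotype matrix holds: one row per variant, one column per sample, and each cell the number of non-reference alleles in that sample's genotype call. The code below is plain Python written only for illustration. It is not Hail or VariantSpark code, and the toy VCF content, the sample names, and the helper names count_alt_alleles and genotype_matrix are all hypothetical. In Hail itself the file would normally be loaded with hl.import_vcf, and mt.GT.n_alt_alleles() would produce the counts.

```python
# Illustrative stand-in for the alt-allele count; NOT Hail or VariantSpark code.
from typing import List, Optional, Tuple

# Hypothetical toy VCF (made-up variants and samples), tab-separated.
TOY_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\n"
    "1\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0/0\t0/1\t1/1\n"
    "1\t200\trs2\tC\tT\t.\tPASS\t.\tGT\t0|1\t./.\t0/0\n"
)


def count_alt_alleles(gt: str) -> Optional[int]:
    """Count non-reference alleles in a GT string such as '0/1' or '1|1'.

    Returns None when any allele is missing ('.').
    """
    alleles = gt.replace("|", "/").split("/")
    if any(a == "." for a in alleles):
        return None
    return sum(1 for a in alleles if a != "0")


def genotype_matrix(
    vcf_text: str,
) -> Tuple[List[str], List[str], List[List[Optional[int]]]]:
    """Return (sample names, variant labels, variants-by-samples count matrix)."""
    samples: List[str] = []
    variants: List[str] = []
    rows: List[List[Optional[int]]] = []
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns start after FORMAT
            continue
        gt_idx = fields[8].split(":").index("GT")  # position of GT in FORMAT
        variants.append(f"{fields[0]}:{fields[1]}")
        rows.append(
            [count_alt_alleles(call.split(":")[gt_idx]) for call in fields[9:]]
        )
    return samples, variants, rows


samples, variants, matrix = genotype_matrix(TOY_VCF)
for label, row in zip(variants, matrix):
    print(label, row)
```

Each cell is 0, 1, or 2 for a diploid call, and a missing call stays missing (None here) rather than being counted as zero.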

cmorris2945 commented 2 years ago

Hello. Can you send over the documentation for that function in Python? Or is it part of Hail?

rocreguant commented 2 years ago

That function is part of Hail.

cmorris2945 commented 2 years ago

OK, understood.