clingen-data-model / clingen-hail-reports

Performs filtering against gnomAD and ClinVar datasets. Uses Hail to report records with a population FAF above certain thresholds by gene.

Add GA4GH GKS normalization notebook #10

Closed by theferrit32 1 year ago

theferrit32 commented 1 year ago

My current plan is a Jupyter notebook that automates loading and starting a seqrepo REST server inside a Hail Dataproc cluster and configures vrs-python to use it. This will enable live normalization of simple alleles from a Hail table. The vrs-python calls on the Hail workers will communicate with the seqrepo REST service on the Dataproc master node. Depending on which alleles are in the Hail table and how big the table is, doing this through Hail's parallel computation of expression fields might run into synchronization-related exceptions in the seqrepo REST service.

https://github.com/biocommons/seqrepo-rest-service/issues/12
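
As a rough illustration of the worker-to-master access pattern described above, here is a minimal standard-library sketch. It is not the notebook's actual code and does not use vrs-python or Hail. The `/1/sequence/{alias}` path layout, the host name `master-node`, and port 5000 are assumptions about how the service would be deployed, and `sequence_url` and `fetch_with_retry` are hypothetical helper names. The retry wrapper is one possible way to tolerate transient failures from concurrent requests, not a confirmed fix for the linked issue.

```python
import time
import urllib.parse
import urllib.request
from typing import Callable, Optional


def sequence_url(base_url: str, alias: str,
                 start: Optional[int] = None, end: Optional[int] = None) -> str:
    """Build a sequence-lookup URL (endpoint layout is an assumption)."""
    query = {k: v for k, v in (("start", start), ("end", end)) if v is not None}
    # Keep ':' unescaped so namespaced aliases such as "refseq:NC_..." stay readable.
    url = f"{base_url.rstrip('/')}/1/sequence/{urllib.parse.quote(alias, safe=':')}"
    return f"{url}?{urllib.parse.urlencode(query)}" if query else url


def http_get(url: str) -> str:
    """Plain HTTP GET returning the response body as text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def fetch_with_retry(url: str, get: Callable[[str], str] = http_get,
                     attempts: int = 3, backoff_s: float = 0.5) -> str:
    """Call get(url), retrying on network-level errors with exponential backoff."""
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return get(url)
        except OSError as err:  # urllib.error.URLError is a subclass of OSError
            last_error = err
            if attempt < attempts - 1:
                time.sleep(backoff_s * (2 ** attempt))
    raise RuntimeError(f"giving up on {url} after {attempts} attempts") from last_error


if __name__ == "__main__":
    # Hypothetical master-node address; adjust to the actual cluster.
    print(sequence_url("http://master-node:5000/seqrepo", "refseq:NC_000001.11", 10, 20))
```

The `get` parameter is injectable so the retry logic can be exercised without a running service. On a real cluster each Hail worker would call something like `fetch_with_retry(sequence_url(...))` against the master node.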

larrybabb commented 1 year ago

This is a duplicate of https://app.zenhub.com/workspaces/genegraphdxclinvar-60340fb9898dae001107e94e/issues/gh/ga4gh/va-spec/102