clingen-data-model / clinvar-ingest

Produce VRS 2.0 output for all 2023-10-15 clinvar variants #7

Closed larrybabb closed 12 months ago

larrybabb commented 1 year ago

I need someone to produce the VRS 2.0-alpha records for the vi.json file stored in GCS at clinvar-gk-pilot/2023-10-15/all.

Please upload the output to the same bucket folder with the name output-vi.json. Compressed or not, it doesn't matter.

And notify me when it is available.

This is needed to support the MGB group, which is working on an AI pilot that Heidi and I are supporting. They will be using GA4GH VRS IDs in their ML and AI datasets. I was asked to give them a mapping between the VRS IDs and the ClinVar Variation IDs. We do not need to support 100% of the variants (i.e. Genotypes, Haplotypes, complex SVs and wacky ClinVar variants can be left out).

larrybabb commented 1 year ago

@toneillbroad I put this in your "Selected For Development" queue. I'm hopeful this is pretty straightforward. If not, let me know before you dig in. This isn't urgent, but I would like to simply throw it over to them so we don't have to think about it.

toneillbroad commented 1 year ago

@larrybabb as per your request, two files have been uploaded to the gs://clinvar-gk-pilot/2023-10-15/all/ bucket:

output-2023-10-15_all_vi-vrs13.json.gz - VRS 1.3 output
output-2023-10-15_all_vi-vrs20.json.gz - VRS 2.0 output

Please let me know if you see any issues.
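
For anyone picking up the output files, here is a minimal sketch of how the VRS ID to ClinVar Variation ID mapping requested above could be built from one of them. The record layout is an assumption: the thread does not describe the file contents, so this assumes newline-delimited JSON, and the field names `variation_id` and `vrs_id` are hypothetical placeholders to be swapped for whatever the real records use. Records without a VRS ID (for example, variant types that were not supported) are skipped.

```python
import gzip
import json
from typing import Dict, Iterable


def build_mapping(
    lines: Iterable[str],
    id_field: str = "variation_id",  # hypothetical field name (assumption)
    vrs_field: str = "vrs_id",       # hypothetical field name (assumption)
) -> Dict[str, str]:
    """Map VRS ID -> ClinVar Variation ID from newline-delimited JSON records."""
    mapping: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        vrs_id = record.get(vrs_field)
        clinvar_id = record.get(id_field)
        if vrs_id is None or clinvar_id is None:
            continue  # no VRS ID was produced for this variant; skip it
        # If two records share a VRS ID, the later record wins.
        mapping[vrs_id] = str(clinvar_id)
    return mapping


def load_mapping(path: str) -> Dict[str, str]:
    """Read a gzip-compressed file (e.g. a local copy of the .json.gz output)."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return build_mapping(handle)
```

Calling `load_mapping` on a local copy of output-2023-10-15_all_vi-vrs20.json.gz would return a plain dictionary, provided the assumed layout and field names are adjusted to match the actual file.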