From Kent:
we're not loading from VCF files, although we talked about doing this ages ago. ClinVar publishes VCF, gnomAD does, and I'm sure many others do
right now variants are pretty basic: there's an identifier and sometimes a type (probably structural vs functional)
we link to the gene(s) that the source indicates are related, but there are some nuances with ClinVar and GWAS Catalog around this
for example, we use a different relationship for upstream/downstream for GWAS Catalog; intron/exon are treated the same, though
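For reference, a minimal sketch of what reading a VCF into the kind of basic variant record described above could look like. It relies only on the standard VCF fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO); the `Variant` class and the `parse_info` / `parse_vcf` helpers are hypothetical names for illustration, not part of the existing loader.

```python
from typing import Dict, Iterable, Iterator, NamedTuple, Optional


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not the pipeline's real model)."""
    chrom: str
    pos: int
    variant_id: Optional[str]
    ref: str
    alt: str
    info: Dict[str, str]


def parse_info(field: str) -> Dict[str, str]:
    """Split a VCF INFO column into a dict; flag entries map to ''."""
    if field == ".":
        return {}
    info: Dict[str, str] = {}
    for item in field.split(";"):
        if not item:
            continue
        key, _, value = item.partition("=")
        info[key] = value
    return info


def parse_vcf(lines: Iterable[str]) -> Iterator[Variant]:
    """Yield one Variant per data line, skipping '#' header lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        chrom, pos, vid, ref, alt = cols[:5]
        # INFO is the 8th fixed column; tolerate truncated lines.
        info = parse_info(cols[7]) if len(cols) > 7 else {}
        yield Variant(chrom, int(pos), None if vid == "." else vid, ref, alt, info)
```

Any gene link or variant type would have to come from source-specific INFO keys, so the sketch leaves INFO as a plain dict rather than guessing at them.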
there's some background to modelling locations and coordinates as RDF and OWL (FALDO, monochrom). I think it's a little over-modelled for many of our use cases
https://github.com/OBF/FALDO
https://github.com/monarch-initiative/monochrom
we do have a feature location index in Solr, https://solr-dev.monarchinitiative.org/solr/feature-location/select/?q=*:*&wt=json
we even had an undergrad student build a front-end widget around this as a summer project, but all this work fell off
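A rough sketch of querying that feature-location index from Python, in case it helps anyone pick this back up. It uses only the standard Solr select parameters (`q`, `wt`, `rows`) and the URL quoted above; the helper names are hypothetical, the index's field names aren't known here so the query stays at `*:*`, and the dev server may no longer be reachable.

```python
import json
from typing import Any, Dict, List
from urllib.parse import urlencode
from urllib.request import urlopen

# Base select endpoint taken from the URL quoted in the notes.
SOLR_SELECT = "https://solr-dev.monarchinitiative.org/solr/feature-location/select/"


def build_select_url(q: str = "*:*", rows: int = 10, base: str = SOLR_SELECT) -> str:
    """Build a Solr select URL that asks for a JSON response."""
    params = {"q": q, "wt": "json", "rows": rows}
    # Keep '*' and ':' unescaped so the match-all query stays readable.
    return base + "?" + urlencode(params, safe="*:")


def fetch_docs(url: str, timeout: float = 10.0) -> List[Dict[str, Any]]:
    """Fetch a select URL and return the documents from Solr's JSON response."""
    with urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["response"]["docs"]
```

Usage would be something like `fetch_docs(build_select_url(rows=5))`, then inspecting the keys of the first document to see which fields the index actually has.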