Open cmungall opened 9 years ago
we already have basically built this for genomic features (which includes genes and variants), and it's already loaded into a special golr schema to include faldo positional information and properties. would this suffice? should we just add to this? @ccondit can you list what we have in that schema? good idea to add description and synonyms.
for genes, they also need strand (which we haven't put in yet).
schema is here: http://geoffrey.crbs.ucsd.edu:8080/solr/feature-location/admin/file/?contentType=text/xml;charset=utf-8&file=schema.xml
i actually didn't generate it using golr - just made it by hand.
On Tue, Jun 9, 2015 at 2:30 PM Nicole Washington notifications@github.com wrote:
for genes, they also need strand (which we haven't put in yet).
I think features can remain a special case with special code for now.
The current loader is association-centric; i.e. each document is a relationship between two objects. This is useful for the majority of queries.
It would also be useful to have an object-centric loader. (TBD: define yaml in monarch repo). E.g. for https://github.com/monarch-initiative/monarch-app/issues/756 (one row per variant).
Here we would have only one document per object. Relationships would be loaded into a multi-valued field named after the property. This means we have a less generic schema than oban.
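A minimal sketch of the pivot described above, assuming the input is a list of (subject, property, object) rows like those behind the association-centric documents. The function name, the `id` field, the property names and the `EX:` identifiers are all made up for illustration; none of them come from the existing loader or schema.

```python
def pivot_to_object_docs(associations):
    """Group (subject, property, object) rows into one document per subject.

    Each property becomes a multi-valued field (a list) named after the
    property. The "id" field name is an assumption, not the real schema.
    """
    docs = {}
    for subject, prop, obj in associations:
        # One document per object, created on first sight of the subject.
        doc = docs.setdefault(subject, {"id": subject})
        # Multi-valued field named after the property; skip repeated values.
        values = doc.setdefault(prop, [])
        if obj not in values:
            values.append(obj)
    return list(docs.values())


# Hypothetical association rows (identifiers are placeholders).
associations = [
    ("EX:variant1", "has_phenotype", "EX:phenotype1"),
    ("EX:variant1", "has_phenotype", "EX:phenotype2"),
    ("EX:variant1", "is_allele_of", "EX:gene1"),
    ("EX:variant2", "has_phenotype", "EX:phenotype1"),
]
docs = pivot_to_object_docs(associations)
```

Four association rows collapse to two documents here, one per variant. The cost is the one noted above: each new property adds a new field, so the schema is less generic than oban.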
Example fields (core):
This would be extended depending on the object type. E.g. for genomic features like variants we may have chrom, start, end (for simplicity we would flatten to a single reference; for more complexity use cypher). For variants we may have a pathogenicity score. Etc.
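A rough sketch of the type-specific extension for variants, assuming "flatten to a single reference" means keeping only the location on one chosen reference and dropping the rest. The function, the `reference` key, the `pathogenicity_score` field name and the sample values are hypothetical; only chrom, start and end are taken from the text above.

```python
def add_feature_fields(doc, locations, preferred_reference, pathogenicity=None):
    """Return a copy of a core document extended with flattened location fields.

    `locations` is a list of dicts with keys reference, chrom, start, end
    (an assumed shape). Only the first location on `preferred_reference`
    is kept, so the document has a single chrom/start/end.
    """
    extended = dict(doc)
    for loc in locations:
        if loc["reference"] == preferred_reference:
            extended["chrom"] = loc["chrom"]
            extended["start"] = loc["start"]
            extended["end"] = loc["end"]
            break
    if pathogenicity is not None:
        # Variant-specific extension; the field name is an assumption.
        extended["pathogenicity_score"] = pathogenicity
    return extended


core_doc = {"id": "EX:variant1"}
locations = [
    {"reference": "refA", "chrom": "1", "start": 100, "end": 101},
    {"reference": "refB", "chrom": "1", "start": 250, "end": 251},
]
variant_doc = add_feature_fields(core_doc, locations, "refB", pathogenicity=0.9)
```

Anything that needs every placement of a feature would still go through cypher, as noted above. The flattened fields only cover the common one-row-per-variant case.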
@cmungall will define core schema, cc @nlwashington
Note that the mechanism here could be used to load ontology classes; but may as well just use owltools loader for this (cc @hdietze @kltm)