OpenBioLink / OpenBioLink

OpenBioLink is a resource and evaluation framework for evaluating link prediction models on heterogeneous biomedical graph data.
MIT License

Improving storage of graph metadata #66

Open nomisto opened 3 years ago

nomisto commented 3 years ago

Currently, the format for storing graph metadata (database, infiles, edges, ...) in the graph creation module is a little confusing. Storing the metadata of sources, edges, etc. in a more centralized format (e.g. JSON) could improve readability. Furthermore, since sources can change at any time (separator, URLs, ...), this would help us adapt to such changes faster and would make it easier to include new sources and edges.
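To make the idea concrete, here is a minimal sketch of what a centralized JSON description of one source could look like and how it could be loaded. The field names (`name`, `url`, `separator`, `columns`, `edges`), the source `EXAMPLE_DB`, and the helper `load_sources` are all hypothetical and are not taken from the OpenBioLink code base.

```python
import json

# Hypothetical metadata layout; field names and values are illustrative
# and do not come from the OpenBioLink code base.
METADATA_JSON = """
{
  "sources": [
    {
      "name": "EXAMPLE_DB",
      "url": "https://example.org/download/edges.tsv",
      "separator": "\\t",
      "columns": ["source_id", "target_id", "score"],
      "edges": ["GENE_DRUG"]
    }
  ]
}
"""


def load_sources(text):
    """Parse the JSON document and index the source entries by name."""
    data = json.loads(text)
    return {entry["name"]: entry for entry in data["sources"]}


sources = load_sources(METADATA_JSON)
example = sources["EXAMPLE_DB"]
print(example["url"], repr(example["separator"]))
```

With a layout like this, a changed separator or URL would be a one-line edit in the JSON file rather than a change to a Python class.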

matthias-samwald commented 3 years ago

Do you think we could standardize everything to RDF?

nomisto commented 3 years ago

Yes, that might be even better than JSON. With RDF we could provide more readability than the current method (storing metadata in separate static Python classes) while keeping its modularity.

matthias-samwald commented 3 years ago

I guess then we should try to transition to RDF! One potential problem, of course, is that triples can make it a bit cumbersome to capture N-ary relationships with N>3. But that is probably not a problem for the dataset descriptions.
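For illustration, the usual workaround for a statement with more participants than a single triple can hold is to introduce an intermediate node and attach each participant to it with its own triple. The sketch below uses plain Python tuples rather than an RDF library; the `ex:` names and the helper `nary_to_triples` are hypothetical.

```python
from itertools import count

_ids = count(1)  # counter used to label intermediate nodes


def nary_to_triples(relation, roles):
    """Decompose one n-ary statement into triples via an intermediate node.

    `relation` names the statement type; `roles` maps role names to values.
    """
    node = f"_:stmt{next(_ids)}"  # blank-node-style label for the statement
    triples = [(node, "rdf:type", relation)]
    triples.extend((node, role, value) for role, value in roles.items())
    return triples


# Hypothetical description of one input file with three attributes.
triples = nary_to_triples(
    "ex:SourceFile",
    {
        "ex:fromDatabase": "ex:EXAMPLE_DB",
        "ex:separator": "tab",
        "ex:idColumn": "0",
    },
)
for triple in triples:
    print(triple)
```

Each attribute becomes one triple on the shared node, so the cost grows with the number of participants, which is the cumbersome part mentioned above.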