RobokopU24 / Feedback

Feedback on the ROBOKOP project
https://robokop.renci.org

Metadata to be displayed #126

Closed EvanDietzMorris closed 1 year ago

EvanDietzMorris commented 1 year ago

Opening a discussion about which pieces of information we want to extract and display from the Plater /metadata endpoints.

The metadata file is structured with top-level properties describing the KG itself:

- Graph version: { "graph_version": 123123123 }
- Build time: { "build_time": "03-17-23 08:06:41" }
- Final node and edge counts: { "final_node_count": 8660793, "final_edge_count": 132150678 }

And lists of "sources" and "subgraphs" within the KG.
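As a rough illustration of how a front end might read these top-level properties, here is a minimal Python sketch. It uses only the keys quoted above; the `summarize_graph` helper, the output field names, and the assumption that `build_time` is month-day-year with a two-digit year are mine and not confirmed by Plater or ORION.

```python
from datetime import datetime

# Sample document built from the top-level keys quoted above.
# The empty "sources" and "subgraphs" lists are placeholders.
metadata = {
    "graph_version": 123123123,
    "build_time": "03-17-23 08:06:41",
    "final_node_count": 8660793,
    "final_edge_count": 132150678,
    "sources": [],
    "subgraphs": [],
}


def summarize_graph(meta: dict) -> dict:
    """Pull display-ready values out of a metadata dictionary (hypothetical helper)."""
    # Assumption: build_time is MM-DD-YY HH:MM:SS.
    built = datetime.strptime(meta["build_time"], "%m-%d-%y %H:%M:%S")
    return {
        "version": str(meta["graph_version"]),
        "built": built.isoformat(sep=" "),
        "nodes": f'{meta["final_node_count"]:,}',
        "edges": f'{meta["final_edge_count"]:,}',
        "source_count": len(meta.get("sources", [])),
        "subgraph_count": len(meta.get("subgraphs", [])),
    }


summary = summarize_graph(metadata)
print(summary)
```

Formatting the counts with thousands separators is only a display suggestion; the raw integers stay available in the original dictionary.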

Each source has:

- source_version - the version of the source data provided by the source or a date that represents it
- parsing_version - the version of the parser code used to convert the source data into knowledge graph triplets
- normalization_version - a composite version of the different normalization services and techniques used on this source
- release_version - a hashed id that represents a composite of the previously mentioned attributes
- merge_strategy - the algorithm/strategy used for merging this source into the KG
- provenance - the biolink compliant infores id that best describes the provenance for this source (right now this is a single value but it should really be a list for cases where multiple infores sources are found within one source)
- description, source_data_url, license, attribution - hand curated text from ORION for each source
- a list of the files used in the final KG and node and edge counts - this format isn't great in retrospect and @EvanDietzMorris will be working on cleaning it up
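Since provenance is a single value today but may become a list, display code could normalize it up front so either shape works. A minimal sketch, assuming the field is either absent, a string, or a list of strings; the `provenance_list` helper and the `infores:example-*` ids are hypothetical and used only for illustration.

```python
from typing import List


def provenance_list(source: dict) -> List[str]:
    """Return the source's provenance as a list, whatever shape it arrives in."""
    value = source.get("provenance")
    if value is None:
        return []
    if isinstance(value, str):
        # Current format: a single infores id.
        return [value]
    # Possible future format: a list of infores ids.
    return list(value)


# Hypothetical sources; the ids below are placeholders, not real infores ids.
single = {"provenance": "infores:example-a"}
multiple = {"provenance": ["infores:example-a", "infores:example-b"]}

print(provenance_list(single))
print(provenance_list(multiple))
```

With this in place the UI can always iterate over provenance values and will not need to change if the metadata format does.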

Each subgraph is a JSON dictionary object that shares the same schema/structure as the metadata for the whole KG and contains the same top-level properties mentioned above (graph version, node counts, etc.). We should be able to handle multiple layers of nested subgraphs. The group should decide whether we want to display information about subgraphs, or whether we'd prefer to flatten the subgraphs and extract their sources, showing them alongside the top-level sources.
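To make the flattening option concrete, here is a minimal recursive sketch in Python. It assumes the lists live under the keys "sources" and "subgraphs" at every level and labels each subgraph by its graph_version; the `iter_sources` helper and the `source_id` key in the sample data are hypothetical.

```python
from typing import Iterator, Tuple


def iter_sources(meta: dict, path: Tuple[str, ...] = ()) -> Iterator[Tuple[Tuple[str, ...], dict]]:
    """Yield (subgraph_path, source) pairs for a graph and all nested subgraphs."""
    for source in meta.get("sources", []):
        yield path, source
    for index, sub in enumerate(meta.get("subgraphs", [])):
        # Fall back to the list position if a subgraph has no graph_version.
        label = str(sub.get("graph_version", index))
        yield from iter_sources(sub, path + (label,))


# Made-up nested metadata; "source_id" is a placeholder key for illustration.
example = {
    "graph_version": "top",
    "sources": [{"source_id": "A"}],
    "subgraphs": [
        {
            "graph_version": "sub1",
            "sources": [{"source_id": "B"}],
            "subgraphs": [
                {"graph_version": "sub2", "sources": [{"source_id": "C"}], "subgraphs": []}
            ],
        }
    ],
}

flat = [(path, source["source_id"]) for path, source in iter_sources(example)]
print(flat)
```

Keeping the subgraph path next to each source leaves both options open: the UI can ignore the path and show a flat list, or group by it to show the nesting.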

cbizon commented 1 year ago

We want to see

Woozl commented 1 year ago

This data is now exposed on the service and is viewable in the API docs.[^1]

[^1]: relevant PR