JULIELab / gepi

GePI (GEne - Protein Interactions) is a web portal for quick and convenient access to gene - protein interaction mentions automatically extracted from the biomedical literature, i.e. PubMed and PubMed Central (Open Access Subset).
GNU General Public License v3.0

Update concept manager for gene ontology correction #225

Closed khituras closed 1 year ago

khituras commented 1 year ago

The underlying issue is that we convert the Gene Ontology from its original OBO format to a custom JSON format using the JULIE Lab BioPortal tools. The import algorithm expects that JSON format and cannot read OBO directly right now. The problem is that the BioPortal tools use the OWL API to parse OBO. While that works in principle, for some reason the OBO properties name and xref are both mapped to the OWL property rdfs:label. This results in a mix-up where an xref is sometimes used as the preferred name, leading to wrongly named concepts. The solution for now is to remove the xrefs before parsing. To change the names in an existing database, some small additions to the concept manager and the Neo4j server plugins were necessary. This is reflected in GePI by the use of a newer concept manager version.
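
For illustration, removing the xrefs before parsing could look roughly like the following. This is only a minimal sketch, not the actual preprocessing used for GePI: the function name strip_xrefs and the sample term (ID, name and xref value) are hypothetical, and it assumes that each xref sits on its own line starting with the xref: tag, as is usual in OBO flat files.

```python
from typing import Iterable, Iterator


def strip_xrefs(lines: Iterable[str]) -> Iterator[str]:
    """Yield every line of an OBO document except `xref:` tag lines.

    Hypothetical helper for illustration; not part of the BioPortal tools.
    """
    for line in lines:
        # OBO stores one tag-value pair per line in the form `tag: value`,
        # so dropping lines that start with `xref:` removes the xrefs
        # while leaving `name:` and all other tags untouched.
        if line.lstrip().startswith("xref:"):
            continue
        yield line


# Made-up term stanza; the ID, name and xref are placeholders.
SAMPLE_OBO = """\
[Term]
id: EX:0000001
name: example term
namespace: example_namespace
xref: EXDB:12345
is_a: EX:0000000
"""

cleaned = "".join(strip_xrefs(SAMPLE_OBO.splitlines(keepends=True)))
print(cleaned)
```

With the xref lines gone, name is the only property left that can end up as rdfs:label, so the preferred name can no longer be confused with an xref. The cleaned text would then be passed on to the OBO-to-JSON conversion as before.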