EBISPOT / ols4

Version 4 of the EMBL-EBI Ontology Lookup Service (OLS)
http://www.ebi.ac.uk/ols4/
Apache License 2.0

Label type inconsistency in V2 #649

Open Pooya-Oladazimi opened 3 months ago

Pooya-Oladazimi commented 3 months ago

Describe the bug: The type of the term label metadata is inconsistent. Sometimes it is a string and sometimes it is a list.

To Reproduce: https://www.ebi.ac.uk/ols4/api/v2/ontologies/agro/classes?hasDirectParent=false&size=1000&lang=en&includeObsoleteEntities=false

The term "entity" has its label as a list, but the other terms have it as a string.

Expected behavior: It is not a bug per se. The client can check the type on its own. However, I would say sticking to one type would be better.

Additional context: Possibly related to https://github.com/EBISPOT/ols4/issues/474
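The client-side type check mentioned under expected behavior could be sketched roughly as follows. This is only an illustration, not part of OLS: the helper names `as_list` and `first_label` are hypothetical, and the sketch assumes only that an entity is a parsed JSON object whose "label" value may be a string, a list, or missing.

```python
from typing import Any, Dict, List, Optional


def as_list(value: Any) -> List[Any]:
    """Wrap a scalar in a list; pass lists through; map None to an empty list."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def first_label(entity: Dict[str, Any]) -> Optional[str]:
    """Return the first label of an entity, whether "label" is a string or a list."""
    labels = as_list(entity.get("label"))
    return labels[0] if labels else None


# Hypothetical sample entities, not real API output.
print(first_label({"label": "material entity"}))
print(first_label({"label": ["entity", "Entity"]}))
```

Normalizing to a list at the point where the response is parsed keeps the rest of the client free of per-field type checks.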

rombaum commented 3 months ago

It could be related to this topic:

https://github.com/EBISPOT/ols4/issues/474

haideriqbal commented 3 months ago

Hi @Pooya-Oladazimi! I've just checked and the system is working as expected. The top-level label shows as a list rather than a string because the label is defined twice in the AGRO ontology OWL file. That is not correct and should be fixed on the AGRO side.

[Screenshot 2024-04-18 at 14 42 28]

In the case of the entity label being a string in the linked_entities object, it is because of the way it is handled within our code.

Also, just a note: I noticed you referred to the v2 API in your issue. We use the v2 API for our internal development and it is subject to change, so we don't encourage users to base their pipelines on it.

Pooya-Oladazimi commented 3 months ago

@haideriqbal

Thanks for the response.

Well, in this example, yes, it is an ontology issue. But the same inconsistency exists for other fields. For example, the metadata "directParent" is sometimes a list and sometimes a string.

https://www.ebi.ac.uk/ols4/api/v2/ontologies/agro/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FBFO_0000002

https://www.ebi.ac.uk/ols4/api/v2/ontologies/agro/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FTO_0000599

Likewise, Boolean metadata is sometimes a string and sometimes not (true vs "true"). This creates many headaches for API users and clients.

Again, these types can be checked on the client side, but it would be nice if the response structure and the metadata field types were consistent.
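For the true vs "true" case, a client-side coercion could look something like the sketch below. The function name `as_bool` is hypothetical, and the sketch assumes the only variants are real JSON booleans and the strings "true" and "false" in any letter case. Anything else is rejected rather than guessed.

```python
from typing import Any


def as_bool(value: Any) -> bool:
    """Coerce a JSON boolean or a "true"/"false" string to a Python bool."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered == "true":
            return True
        if lowered == "false":
            return False
    # Refuse anything else so unexpected values surface early.
    raise ValueError(f"not a recognisable boolean: {value!r}")


print(as_bool(True), as_bool("true"), as_bool("False"))
```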

Yes, I am aware of ongoing development for V2. Our pipeline is still on OLS3 but I am gradually changing it to OLS4 as it has many great features. That is the reason I brought this issue up so that it might be helpful.

haideriqbal commented 2 months ago

Hi @Pooya-Oladazimi!

As I mentioned earlier, the v2 API you are referring to is only for internal OLS development at the moment and is not intended for users to base their pipelines on. You will find many inconsistencies like the ones you mentioned in the v2 API responses. The responses to these API calls are also subject to change, so if you base your pipeline on the v2 API, there will be a lot more problems down the line for you.

All OLS users should base their pipelines on the original OLS API, as that API has consistent responses. For instance, in your example, I would use https://www.ebi.ac.uk/ols4/api/ontologies/agro/terms?iri=http://purl.obolibrary.org/obo/TO_0000599 rather than the v2 API call and extract the relevant information from there onwards. For example, I would request https://www.ebi.ac.uk/ols4/api/ontologies/agro/terms/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FTO_0000599/hierarchicalAncestors to get the ancestors of the term. I hope this helps.
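The term IRI in the path of the second URL above is percent-encoded twice (":" becomes "%253A", "/" becomes "%252F"). A small sketch of building such a URL with the Python standard library follows. The helper names `double_encode` and `term_url` are hypothetical, and the base URL is simply copied from the links in this thread.

```python
from urllib.parse import quote

# Base URL taken from the links quoted in this thread.
OLS_API = "https://www.ebi.ac.uk/ols4/api"


def double_encode(iri: str) -> str:
    """Percent-encode an IRI twice, so ':' -> '%253A' and '/' -> '%252F'."""
    return quote(quote(iri, safe=""), safe="")


def term_url(ontology: str, iri: str, relation: str = "") -> str:
    """Build a term URL, optionally followed by a relation such as hierarchicalAncestors."""
    url = f"{OLS_API}/ontologies/{ontology}/terms/{double_encode(iri)}"
    return f"{url}/{relation}" if relation else url


print(term_url("agro", "http://purl.obolibrary.org/obo/TO_0000599", "hierarchicalAncestors"))
```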

henrietteharmse commented 4 weeks ago

@Pooya-Oladazimi If you come across similar inconsistencies, please feel free to add them here. I am looking into this currently and hope to catch all similar issues. However, it never hurts to double check.

rombaum commented 4 weeks ago

@henrietteharmse Are you also interested in differing keys? For example, we found a small inconsistency in a response key of the search API between OLS3 and OLS4.

Differences on OLS3/OLS4 API facet keys:

OLS3: response["facet_counts"]["facet_fields"]["ontology_name"]
OLS4: response["facet_counts"]["facet_fields"]["ontologyId"]
See: https://github.com/ts4nfdi/terminology-service-suite/pull/94#issue-2363710240

@jusa3 mentioned to me that she found similar changes in addition to this one. If you are interested in these as well, maybe she could list some here.
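A client that has to talk to both OLS3 and OLS4 could paper over the facet key difference listed above with something like this sketch. The function name `ontology_facet` is hypothetical and the sample payloads are made up. Only the two key paths come from the comparison above, and the facet value is returned as-is without assuming its shape.

```python
from typing import Any, Dict

# OLS4 key first, then the OLS3 key, as listed in the comparison above.
FACET_KEYS = ("ontologyId", "ontology_name")


def ontology_facet(response: Dict[str, Any]) -> Any:
    """Return the ontology facet from a search response, whichever key is present."""
    facet_fields = response.get("facet_counts", {}).get("facet_fields", {})
    for key in FACET_KEYS:
        if key in facet_fields:
            return facet_fields[key]
    return None


# Made-up payloads for illustration only.
ols3_like = {"facet_counts": {"facet_fields": {"ontology_name": ["agro", 3]}}}
ols4_like = {"facet_counts": {"facet_fields": {"ontologyId": ["agro", 3]}}}
print(ontology_facet(ols3_like) == ontology_facet(ols4_like))
```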

henrietteharmse commented 4 weeks ago

@rombaum The v2 API is intentionally not consistent with v1. However, reports of inconsistencies within the v2 API, irrespective of the key, will be useful as a sanity check.

Briefly, on ontology_name vs ontologyId: in v2 we prefer camel case keys. As for ontologyId rather than ontologyName, we thought ontologyId was more appropriate because it needs to be unique within OLS.