Closed eliarizzetto closed 1 month ago
I have resolved the problem with the ordering of roles in the OC Meta API results. The fix ensures that the authors (as well as editors and publishers) are now correctly ordered according to the oco:hasNext property in the triplestore.
Previously, I was capturing the order by sorting the roles in descending order based on the number of oco:hasNext edges and then aggregating the results. This approach worked correctly with Blazegraph. However, after migrating to Virtuoso, this method stopped working as expected, and I couldn't determine the exact cause of the discrepancy.
To address this, I've shifted the ordering logic from the SPARQL query to Python. The solution now involves modifying the SPARQL query to capture the full chain of oco:hasNext relationships and updating the Python code to process this information correctly. This change allows us to accurately reconstruct the intended author order, regardless of the underlying triplestore implementation.
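As a rough illustration of the idea (not the code in the commit linked below, which remains the authoritative implementation), the following sketch rebuilds an ordered list of roles from a mapping of each role to its oco:hasNext successor. The function name, the input shape, and the role identifiers are hypothetical and chosen only for this example.

```python
def order_roles(has_next):
    """Return the roles in chain order.

    `has_next` maps each role identifier to the role that follows it
    (the object of oco:hasNext), or to None for the last role.
    This input shape is an assumption made for the sketch.
    """
    if not has_next:
        return []

    # The first role is the only one that is never the object of oco:hasNext.
    successors = {nxt for nxt in has_next.values() if nxt is not None}
    heads = [role for role in has_next if role not in successors]
    if len(heads) != 1:
        raise ValueError(f"expected exactly one first role, found {len(heads)}")

    ordered = []
    seen = set()
    current = heads[0]
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current}")
        seen.add(current)
        ordered.append(current)
        current = has_next.get(current)
    return ordered


# Hypothetical role identifiers, deliberately given out of order,
# as a triplestore may return them.
sample = {"ar/3": "ar/4", "ar/1": "ar/2", "ar/4": None, "ar/2": "ar/3"}
ordered_sample = order_roles(sample)
```

Because the order is derived by walking the chain rather than by sorting on a count computed in the query, the result does not depend on the order in which the triplestore returns its rows.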
The fix has been implemented and is now live. You can find the details of the implementation in this commit: https://github.com/opencitations/api/commit/57ce162597a96e396c70e8cd604e1f50b9161a66
In the results of the metadata operation of the OC Meta API, the order of the entities exposed in the author field does not match the correct order of the authors specified in the triplestore.
For example, the call https://opencitations.net/meta/api/v1/metadata/omid:br/0680773548 returns the authors of br/0680773548 in the following order: Bilgin, Hülya [orcid:0000-0001-6639-5533 omid:ra/0622032021]; Bozkurt, Merlin [omid:ra/06802276621]; Korfali, Gülsen [omid:ra/06802276623]; Yilmazlar, Selçuk [omid:ra/06802276622].
The authors of this resource are stored in the triplestore in a different order (specified by the oco:hasNext property): Bilgin, Bozkurt, Yilmazlar, Korfali (the positions of the last two authors are inverted in the API's result).
Examples like that of br/0680773548 can be reproduced by following the procedure below.
First, we use the SPARQL endpoint to retrieve 20 sample BRs that have more than 4 authors, so that we can meaningfully compare the order of the authors in the triplestore with the order in the API results.
We get the following result:
Then we pick any of the BRs in the result (in this instance, the first one, br/0680773548) and retrieve via the SPARQL endpoint the details about its authors: the OMID of the agent role; the OMID of the responsible agent; the surname of the agent; and the object of the oco:hasNext property, which determines the order of the authors, or rather of their roles.
We obtain the following result:
We query the Meta REST API for the same BR as in the previous step (br/0680773548): https://opencitations.net/meta/api/v1/metadata/omid:br/0680773548, getting this result:
As we can see by comparing the result of the SPARQL endpoint with that of the API, the order of the authors differs: in particular, the positions of the last two authors are inverted. Korfali, i.e. ra/06802276623, should be the last one, as its role is not linked to any other role by the oco:hasNext property, and should be preceded by Yilmazlar, i.e. ra/06802276622, since there is a triple specifying that ar/06803250815 oco:hasNext ar/06803250816.
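The comparison above can also be done programmatically. The sketch below extracts the responsible-agent OMIDs from the author string reported in this issue and finds the first position where they diverge from the order implied by oco:hasNext. The helper names are hypothetical, and the regular expression assumes that the author field always uses the "omid:ra/..." form shown in the example.

```python
import re

# Author field as returned by the API for br/0680773548 (copied from this issue).
API_AUTHOR_FIELD = (
    "Bilgin, Hülya [orcid:0000-0001-6639-5533 omid:ra/0622032021]; "
    "Bozkurt, Merlin [omid:ra/06802276621]; "
    "Korfali, Gülsen [omid:ra/06802276623]; "
    "Yilmazlar, Selçuk [omid:ra/06802276622]"
)

# Order implied by oco:hasNext, as described in this issue:
# Bilgin, Bozkurt, Yilmazlar, Korfali.
EXPECTED_ORDER = [
    "ra/0622032021",
    "ra/06802276621",
    "ra/06802276622",
    "ra/06802276623",
]


def omids_in_order(author_field):
    """Extract responsible-agent OMIDs in the order they appear in the field."""
    return re.findall(r"omid:(ra/\d+)", author_field)


def first_mismatch(actual, expected):
    """Return the index of the first differing position, or None if equal."""
    for index, (got, want) in enumerate(zip(actual, expected)):
        if got != want:
            return index
    if len(actual) != len(expected):
        return min(len(actual), len(expected))
    return None


api_order = omids_in_order(API_AUTHOR_FIELD)
mismatch_at = first_mismatch(api_order, EXPECTED_ORDER)
```

For the data in this issue, the first divergence is at the third author (index 2), which matches the inversion of Korfali and Yilmazlar described above.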