SSHOC / sshoc-marketplace-backend

Code for the backend
Apache License 2.0

Search/get result lists of actors are corrupt #373

Closed KlausIllmayer closed 1 year ago

KlausIllmayer commented 1 year ago

More an unexpected behaviour than a bug (and maybe fixed with #368), but it could in theory cause problems:

The default response (without sort) of GET api/actors gives different results depending on the perpage value. I would expect the results to always be returned in the same order, but with a different perpage value, actors that share the same name appear in a different order. It is not easy to spot but very irritating (thanks to @cesareconcordia for finding this issue). One example (which could change once the actor curation is in place):

Calling the actors list with perpage=100 and page=2 should give, for the first 20 entries, the same results as perpage=20 and page=6, but the two results differ at positions 12 and 13 (counting from 0) of the returned list. The actor Alain Colmerauer has two entries (id=2411 and id=564), and they are swapped between the result sets: with perpage=20, id=564 is at position 12 and id=2411 at position 13, whereas with perpage=100, id=564 is at position 13 and id=2411 at position 12.
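To make the comparison above concrete, here is a minimal Python sketch of the check. It assumes 1-based page numbers (as in the calls in this issue). The helper names `page_offsets` and `first_mismatch` are made up for illustration and are not part of the backend, and all ids other than 564 and 2411 are placeholders.

```python
def page_offsets(page: int, perpage: int) -> range:
    """Zero-based offsets of the overall result list covered by a 1-based page."""
    start = (page - 1) * perpage
    return range(start, start + perpage)


def first_mismatch(ids_a, ids_b):
    """Return the first position where two id lists differ, or None if they agree."""
    for pos, (a, b) in enumerate(zip(ids_a, ids_b)):
        if a != b:
            return pos
    return None


# perpage=20, page=6 covers the same offsets as the first 20 entries
# of perpage=100, page=2, so both calls should return the same ids.
small_page = list(page_offsets(6, 20))
large_page_head = list(page_offsets(2, 100))[:20]
same_window = small_page == large_page_head

# Placeholder ids, with the two Alain Colmerauer entries swapped as reported.
ids_perpage_20 = list(range(12)) + [564, 2411] + list(range(14, 20))
ids_perpage_100 = list(range(12)) + [2411, 564] + list(range(14, 20))
mismatch_at = first_mismatch(ids_perpage_20, ids_perpage_100)
```

In a real check, the two id lists would come from the two API responses instead of being built by hand.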

It gets even stranger (and more problematic) when actors with the same name are split across different pages, which can even lead to missing entries. Look for the actor Aracele Torres in this API call: https://marketplace-api.sshopencloud.eu/api/actors/?page=32&perpage=20 - there are three entries with id=3054, id=3094 and id=1127. If you choose perpage so that these three entries are split across different pages, you lose id=1127: here https://marketplace-api.sshopencloud.eu/api/actors/?page=212&perpage=3 you have id=3054 at position 2 (counting from 0), and on the next page https://marketplace-api.sshopencloud.eu/api/actors/?page=213&perpage=3 you again have id=3054 at position 0 and id=3094 at position 1, but at position 2 there is another actor instead of id=1127. I would expect to see id=3094 and id=1127 on the second page. This means that there is no trace of id=1127 if you browse through the actors with perpage=3. [The problem is that this can change when new actors are added, as the URLs given here will then return different results.]
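The effect described above can be reproduced offline with a small Python sketch. Everything here is hypothetical apart from the three Aracele Torres ids: `fake_fetch` stands in for the API call, the ids 10, 11 and 20 are placeholder actors, and the two orderings simulate a backend that resolves name ties differently from one request to the next.

```python
from collections import Counter

# Two orderings that are both valid when sorting by name only:
# the three same-name entries (3054, 3094, 1127) are tied.
ORDER_BY_PAGE = {
    1: [10, 11, 3054, 3094, 1127, 20],
    2: [10, 11, 1127, 3054, 3094, 20],
}
ALL_IDS = set(ORDER_BY_PAGE[1])


def fake_fetch(page: int, perpage: int) -> list:
    """Simulated API call: each request sees its own tie order."""
    start = (page - 1) * perpage
    return ORDER_BY_PAGE[page][start:start + perpage]


def audit_pagination(fetch_page, perpage: int, total: int):
    """Walk every page; return ids seen more than once and the set of ids seen."""
    counts = Counter()
    pages = -(-total // perpage)  # ceiling division
    for page in range(1, pages + 1):
        counts.update(fetch_page(page, perpage))
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    return duplicates, set(counts)


duplicates, seen = audit_pagination(fake_fetch, perpage=3, total=len(ALL_IDS))
missing = ALL_IDS - seen
```

With these two orderings, id=3054 shows up on both pages and id=1127 on neither, which mirrors the observation for perpage=3.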

Long story short: I guess there is no additional sort on id after the name, which leads to random results from the database or from Solr (and, due to caching, these random results persist but can change every time the cache is renewed, which makes this hard to trace). To get predictable result sets, the sort should always have the id as the last sort parameter. Looking at the changes in #368, it seems to me that this is still missing. @tparkola can you add the additional sorting by id for actors and check whether the same issue also applies to other search endpoints?
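A minimal Python sketch of why the id tie-breaker helps. This is an illustration only, not the backend's actual sort code; the dictionary field names `name` and `id` are assumptions. Sorting by name alone leaves the order of same-name actors dependent on the order in which the rows arrive, while sorting by (name, id) yields one fixed order.

```python
actors_a = [
    {"id": 2411, "name": "Alain Colmerauer"},
    {"id": 564, "name": "Alain Colmerauer"},
]
# Same rows, delivered in a different order (e.g. after a cache renewal).
actors_b = list(reversed(actors_a))


def sort_by_name(actors):
    """Name-only sort: ties keep whatever order the input had."""
    return [a["id"] for a in sorted(actors, key=lambda a: a["name"])]


def sort_by_name_then_id(actors):
    """Name sort with id as the last sort key: ties are always resolved the same way."""
    return [a["id"] for a in sorted(actors, key=lambda a: (a["name"], a["id"]))]


unstable = sort_by_name(actors_a) != sort_by_name(actors_b)
stable = sort_by_name_then_id(actors_a) == sort_by_name_then_id(actors_b)
```

The same idea applies to a database or Solr query: with a unique id as the final sort criterion, every page request sees the same total order.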

KlausIllmayer commented 1 year ago

The issue still persists - I have tried it out on stage:

I have created fake actors so that they appear at the end of the first page. I used the name "Aaron Test" (the name should always be the same to make the problem easier to identify, or maybe it only occurs with actors sharing the same name) and created three versions of this actor with three different externalIds so that I can tell them apart. When I then call GET /api/actors?perpage=17&page=1 on stage, I see two actors with externalId=AaronTest2 and externalId=aarontest1 (the third actor has externalId=AaronTest3). If I now go to the second page, GET /api/actors?perpage=17&page=2, I do not see the third actor (externalId=AaronTest3) as expected, but instead see the actor with externalId=aarontest1 from page 1 again. Everything is shown correctly if I choose a page size where all three actors appear on one page, in this case GET /api/actors?perpage=20&page=1. @tparkola You can try it out on stage, but if a new actor is created, the result may be different due to the pagination.

To sum up: it seems to occur if actors share the same name and are split across two pages.

KlausIllmayer commented 1 year ago

Cesare tested it with his script and I also tried it out as described in https://github.com/SSHOC/sshoc-marketplace-backend/issues/373#issuecomment-1477846163, and the problem no longer shows up. Therefore closing this issue - thanks for solving!