Open zcmander opened 1 month ago
area:catalog
would be better label for this.
Initial fix done (thx @drodil), maybe the best to leave this open as it's only improvement to current situation and does not fully fix the issue as it needs more time and effort to do it properly at database level.
📜 Description
Recently we enabled paginated catalog view in our instance of Backstage. This changes the view to fetch entities from catalog using
queryEntities
call instead of battle-testedgetEntities
that does not support pagination at all.However, soon after deploying the change to production we notices that the catalog returns duplicate entities in responses. It was weird as those entities does have the same
uid
in it so we are certain that those are not duplicate in our database but rather somekind of logic error in queries or how data is fetched.The
queryEntities
fetches data by joining two tables:refresh_state
andsearch
table. I did check that therefresh_state
did not contain any duplicate data but thesearch
table does.You can check out your own database if this is the case by running:
I'm suspecting that the following change is the reason why we stated seeing this unexpected result:
https://github.com/backstage/backstage/commit/53cce8663178091f8ff46c40a48db4e06ebf9159#diff-7841edf78cd947efd000b8b63a59d6bfbe1bb55c20e76fb810b2fedb33211aa2
As it joins
search
table in a way that it could produce duplicate entities (before the query was constructed in a way it would hide duplicates if there's any).How this can be fixed??
I'm not too familiar with the catalog codebase but couple of things comes to my mind:
entity_id
,key
) to prevent duplicatesqueryEntitites
endpoint?👍 Expected behavior
There should be no duplicates when calling queryEntities.
👎 Actual Behavior with Screenshots
This is a bug in API but visible if paginated catalog is enabled (as it is by default)
👟 Reproduction steps
Not sure how, I suspect that the root cause for the bug is somekind of race-condition and that's why running multiple nodes is required.
📃 Provide the context for the Bug.
🖥️ Your Environment
No response
👀 Have you spent some time to check if this bug has been raised before?
🏢 Have you read the Code of Conduct?
Are you willing to submit PR?
No, but I'm happy to collaborate on a PR with someone else