Closed guentherhackl-wgs closed 1 month ago
We'll get a PR up to improve the performance of the GraphQL call. At least one possible mitigation for this case, which reduces redundant access checks, already landed in #11327 as part of v0.14.1.
https://github.com/datahub-project/datahub/pull/11434 should hopefully fix this issue.
Describe the bug The symptom of the issue was that the schema of a Kafka topic would not load, and after ~30s the UI displayed "No Data" for the schema although the schemaMetadata aspect was visible in the database.
In the GMS logs I could see that the GraphQL query to retrieve the schemaMetadata caused an async timeout (and still finished after 200s).
No resource limits were hit on any graph in our Grafana. There seemed to be enough CPU and RAM for GMS and OpenSearch. The times for the other requests don't look that bad to me.
I went through the GraphQL query and deleted parts of it to narrow down the issue. I was able to find the following minimal query that reproduces it in my setup, which indicated the involvement of the SchemaFieldEntities:
After setting the feature flag for SchemaFieldEntities to false (SCHEMA_FIELD_ENTITY_FETCH_ENABLED=false), the load time improved significantly.
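For anyone who wants to compare timings before and after toggling the flag, here is a rough sketch of a timing helper. It is not part of DataHub: `time_graphql` is a hypothetical helper name, the URL assumes a local GMS exposing GraphQL at `http://localhost:8080/api/graphql`, and the `{ __typename }` query is only a placeholder to be replaced with the query you want to measure.

```python
import json
import time
import urllib.request


def time_graphql(query, variables=None,
                 url="http://localhost:8080/api/graphql",  # assumed local GMS endpoint
                 token=None, timeout=300.0,
                 opener=urllib.request.urlopen):
    """POST a GraphQL query and return (elapsed_seconds, decoded_json_body).

    `opener` is injectable so the helper can be exercised without a server.
    """
    payload = json.dumps({"query": query, "variables": variables or {}}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the server accepts a bearer token for authentication.
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, data=payload, headers=headers, method="POST")

    start = time.perf_counter()
    with opener(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    elapsed = time.perf_counter() - start
    return elapsed, body


if __name__ == "__main__":
    # Placeholder query; substitute the slow query under investigation.
    seconds, result = time_graphql("{ __typename }")
    print(f"query took {seconds:.2f}s, errors: {result.get('errors')}")
```

Running it once with the flag enabled and once with it disabled, against the same dataset, should show whether the SchemaFieldEntities fetch accounts for the difference. The client timeout is deliberately long (300s) so that a request which outlives the 30s async timeout is still measured.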
There seem to be two factors which have an effect on this:
To Reproduce Just loading the affected large schema into a local DataHub did not reproduce the issue. The number of entities seems to be important too. I have not tried it yet, but I would assume that a significant number of datasets is needed to reproduce it locally.
Expected behavior Show the schema when it is verifiably in the database
Desktop (please complete the following information):