hasura / graphql-engine

Blazing fast, instant realtime GraphQL APIs on your DB with fine grained access control, also trigger webhooks on database events.
https://hasura.io
Apache License 2.0

Uploading metadata to hdb_catalog.hdb_metadata instead of using API #10315

Open JCatrielLopez opened 4 months ago

JCatrielLopez commented 4 months ago

Version Information

Server Version: v2.37.1

Environment

OSS

What is the current behaviour?

When uploading metadata and reloading it through the Hasura API, the process is slow.

Question

Could you provide insights on whether directly updating the hdb_catalog.hdb_metadata table and triggering a metadata reload is a recommended and safe practice? If not, are there alternative approaches to improve the performance of metadata upload and reload operations?
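For concreteness, the direct-update approach we're testing looks roughly like the sketch below. The table and column names (metadata, resource_version) and the single-row assumption come from inspecting our own v2.37 hdb_catalog, not from documented behaviour, and HASURA_URL / METADATA_DB_URL / HASURA_ADMIN_SECRET are placeholders for our environment.

# Write the new metadata JSON straight into the catalog. Sketch only:
# very large metadata files may exceed shell argument limits here.
psql "$METADATA_DB_URL" -v metadata="$(cat metadata.json)" <<'SQL'
UPDATE hdb_catalog.hdb_metadata
SET metadata = :'metadata',
    resource_version = resource_version + 1;
SQL

# Ask Hasura to rebuild its in-memory schema cache from the catalog.
curl -s -X POST "$HASURA_URL/v1/metadata" \
  -H "X-Hasura-Admin-Secret: $HASURA_ADMIN_SECRET" \
  -d '{"type": "reload_metadata", "args": {}}'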

manasag commented 3 months ago

The metadata operations are resource intensive because Hasura needs to compile the changes into an in-memory cached schema. This schema powers the realtime SQL generation; the cost of uploading the metadata itself is negligible. Can you provide more details to help us understand your performance issue? The size of your metadata (number of tables, roles, permissions, etc.) and the CPU/memory characteristics of your Hasura instance would be useful.
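If it helps, something along these lines will export the current metadata so you can check its size and table count. This is only a sketch: it assumes admin access to the /v1/metadata endpoint, jq being available, and the metadata v3 layout with a top-level sources array.

# Export the current metadata and report its size in bytes.
curl -s -X POST "$HASURA_URL/v1/metadata" \
  -H "X-Hasura-Admin-Secret: $HASURA_ADMIN_SECRET" \
  -d '{"type": "export_metadata", "args": {}}' \
  -o metadata.json
wc -c metadata.json

# Count tracked tables across all sources (assumes metadata v3 layout).
jq '[.sources[].tables | length] | add' metadata.json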

JCatrielLopez commented 3 months ago

Around 5k tables, with 11 roles defined.

Current usage: CPU 1057m, MEM 10705Mi

Running as a pod with no resource limits defined, on a node with these allocatable resources:

kubectl get node <node> -o json | jq '.status.allocatable'
{
  "cpu": "19",
  "ephemeral-storage": "150033204516",
  "hugepages-2Mi": "0",
  "memory": "112241968Ki",
  "pods": "300"
}

The metadata operations are resource intensive because Hasura needs to compile the changes into an in-memory cached schema.

Doesn't the metadata reload compile these changes after the upload? Or does uploading the metadata already do this, making a separate reload unnecessary? Just to clarify, we:

  1. Generate a new metadata file
  2. Upload the new metadata through the upload_metadata endpoint
  3. Reload the metadata to generate the new schema

and the change I'm interested in would be:

  1. Generate the new metadata file
  2. Upload the metadata directly to the database (hdb_catalog.hdb_metadata)
  3. Reload the metadata to generate the new schema

So far, we haven't found any issues with this method, and it's faster: roughly 250 seconds down to about 30. A rough sketch of our current API-based flow follows below.
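For reference, this is roughly what the API-based flow looks like. It is only a sketch; I'm assuming that our upload step maps to replace_metadata in the v2 metadata API, and HASURA_URL / HASURA_ADMIN_SECRET are placeholders for our instance.

# Step 2: push the newly generated metadata file through the metadata API.
jq '{type: "replace_metadata", args: .}' metadata.json \
  | curl -s -X POST "$HASURA_URL/v1/metadata" \
      -H "X-Hasura-Admin-Secret: $HASURA_ADMIN_SECRET" \
      -d @-

# Step 3: reload so Hasura rebuilds its schema cache.
curl -s -X POST "$HASURA_URL/v1/metadata" \
  -H "X-Hasura-Admin-Secret: $HASURA_ADMIN_SECRET" \
  -d '{"type": "reload_metadata", "args": {}}'

In the direct-database variant, step 2 is replaced by the UPDATE on hdb_catalog.hdb_metadata sketched in the issue description, and step 3 stays the same.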