datahub-project / datahub

The Metadata Platform for your Data and AI Stack
https://datahubproject.io
Apache License 2.0

Ingestion fails due to 413 Client Error: Payload Too Large for url #11904

Open rospe opened 1 day ago

rospe commented 1 day ago

Describe the bug Various custom and Snowflake ingestion runs fail when sending data to the GMS server. I think this started with v0.14.1, and my guess is that it is caused by ASYNC_BATCH becoming the default sink mode in that release. We have already set client_max_body_size: "100m" in our nginx config and nginx.ingress.kubernetes.io/proxy-body-size: 200m in the Frontend ingress config, but the error persists.

Log message from ingestion (using acryldata/datahub-ingestion:v0.14.1): {'error': 'Unable to emit metadata to DataHub GMS', 'info': {'message': '413 Client Error: Payload Too Large for url: '...

To Reproduce Run our previously working Snowflake ingestion with the datahub-rest sink on v0.14.1.

Expected behavior Ingestion requests should stay below our configured body-size limit.

Additional context Setting the sink config to mode: ASYNC is a workaround.
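
For anyone hitting the same error, here is a minimal sketch of how that workaround might look in a recipe. It assumes the sink type is datahub-rest; the server URL is a placeholder rather than our real endpoint, and the source section of the recipe is left out because it stays unchanged.

```yaml
# Sink section of the ingestion recipe only; the source section is unchanged.
sink:
  type: datahub-rest
  config:
    # Placeholder GMS endpoint (assumption); replace with your own server URL.
    server: "http://localhost:8080"
    # Override the ASYNC_BATCH default so requests are not batched together.
    mode: ASYNC
```

This only avoids the large batched requests. It does not change the body-size limits configured in nginx or the ingress.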