raystack / compass

Compass is an enterprise data catalog that makes it easy to find, understand, and govern data.
https://compass-raystack.vercel.app/
Apache License 2.0

Optimise handling high throughput requests #163

Open StewartJingga opened 1 year ago

StewartJingga commented 1 year ago

Is your feature request related to a problem? Please describe.

In our setup, we have Meteor jobs that run every 10 minutes.

When sending metadata to Compass failed, the requests were retried, which pushed throughput up to 600 req/s from the usual 200 req/s.

This drove CPU utilization on Compass' Postgres to 100%, which in turn caused other incoming requests to be affected and dropped.

Describe the solution you'd like

1. Bulk ingestions

Right now the Compass UpsertPatch API only allows a single asset per request. Ingesting in bulk would help reduce overhead on at least:

  1. Network calls
  2. Open connections on Postgres (yes, bulk insert on postgres too)

This could potentially reduce the load on Postgres, since there would be fewer connections to maintain and fewer network calls to and from Compass.
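To make the bulk idea concrete, here is a minimal sketch in Go of how a batch of assets could be folded into a single multi-row upsert statement instead of one statement per asset. The `Asset` struct, the `assets` table, and its columns are hypothetical placeholders for illustration, not Compass' actual schema or API.

```go
package main

import (
	"fmt"
	"strings"
)

// Asset is a hypothetical, trimmed-down asset record used only in this sketch.
type Asset struct {
	URN     string
	Type    string
	Service string
}

// BuildBulkUpsert returns one multi-row INSERT ... ON CONFLICT statement and
// its positional arguments, so a whole batch costs a single round trip.
// The table and column names are assumptions, not Compass' real schema.
func BuildBulkUpsert(assets []Asset) (string, []any) {
	if len(assets) == 0 {
		return "", nil
	}
	const cols = 3
	rows := make([]string, 0, len(assets))
	args := make([]any, 0, len(assets)*cols)
	for i, a := range assets {
		base := i * cols
		// Postgres placeholders are 1-based: ($1, $2, $3), ($4, $5, $6), ...
		rows = append(rows, fmt.Sprintf("($%d, $%d, $%d)", base+1, base+2, base+3))
		args = append(args, a.URN, a.Type, a.Service)
	}
	query := "INSERT INTO assets (urn, type, service) VALUES " +
		strings.Join(rows, ", ") +
		" ON CONFLICT (urn) DO UPDATE SET type = EXCLUDED.type, service = EXCLUDED.service"
	return query, args
}
```

The resulting query and arguments could be passed to something like `db.ExecContext(ctx, query, args...)`. One caveat: Postgres rejects an `ON CONFLICT DO UPDATE` statement that touches the same row twice, so a batch would need to be de-duplicated by URN before it is sent.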

2. Rate Limiting APIs

Ingestion is mostly carried out at high throughput. Rate limiting might not be the solution here, but it could help prevent unnecessary downstream calls (Postgres, Elasticsearch) that could potentially block the process.

Describe alternatives you've considered

Currently, reducing throughput from Meteor works, but it is not scalable; it would be better to solve this at the Compass level.

StewartJingga commented 1 year ago

@rohilsurana @AkarshSatija @ravisuhag @mabdh

rohilsurana commented 1 year ago

Should we also look into configuring max connections when connecting to Postgres? This would limit concurrent queries on Postgres and make sure that at least some get processed as usual. Other requests would have to wait and could time out, after which the client can retry them.