AllenNeuralDynamics / aind-data-access-api

Library to interface with AIND databases
MIT License

Design public DocDB REST API #47

Closed: jtyoung84 closed this issue 3 months ago

jtyoung84 commented 4 months ago

The current public read-only REST endpoint for DocDB cannot handle the volume of data that DocDB serves. This is primarily because of AWS Lambda and API Gateway request size limits.

Define a public, read-only DocDB query interface that supports the following:

  1. Paginated search requests succeed for all DocDB records.
  2. API does not add significant (more than roughly 5%?) performance overhead compared to querying DocDB directly.
  3. All DocDB query operators are supported.
  4. No authentication is required.
  5. Service autoscales as request load increases.
  6. All requests are logged.
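
As a rough illustration of requirement 1, here is a minimal client-side pagination sketch. It assumes a skip/limit style of paging; `fetch_page`, `iter_records`, `make_in_memory_fetcher`, and the `modality` field are hypothetical names used only for this example, not part of the existing API. The in-memory fetcher stands in for a real HTTP call and only supports exact-match filters.

```python
from typing import Any, Callable, Dict, Iterator, List

Record = Dict[str, Any]
# Hypothetical page fetcher: (filter_query, skip, limit) -> list of records.
FetchPage = Callable[[Dict[str, Any], int, int], List[Record]]


def iter_records(
    fetch_page: FetchPage,
    filter_query: Dict[str, Any],
    page_size: int = 100,
) -> Iterator[Record]:
    """Yield every matching record by requesting fixed-size pages."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    skip = 0
    while True:
        page = fetch_page(filter_query, skip, page_size)
        yield from page
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            return
        skip += len(page)


def make_in_memory_fetcher(records: List[Record]) -> FetchPage:
    """Build a stand-in fetcher over a local list (exact-match filters only)."""

    def fetch_page(filter_query: Dict[str, Any], skip: int, limit: int) -> List[Record]:
        matches = [
            r for r in records
            if all(r.get(key) == value for key, value in filter_query.items())
        ]
        return matches[skip:skip + limit]

    return fetch_page
```

Keeping each page small is what lets every individual response stay under a fixed payload limit, regardless of how many records match in total.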

helen-m-lin commented 4 months ago

I benchmarked the current API's performance against a direct connection (SSH) and analyzed the sizes of the records in our DocDB. Based on this, I proposed several performance improvements we can make to our API Gateway + Lambda function. Please see Section 5 of this document for details.
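
For readers who want to reproduce this kind of comparison, a minimal timing sketch follows. It is not the benchmark used above; `median_seconds` and `relative_overhead` are hypothetical helpers, and the two callables passed in would be whatever issues the same query directly and through the API.

```python
import statistics
import time
from typing import Callable


def median_seconds(fn: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock time of calling fn, over several repeats."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def relative_overhead(direct_seconds: float, api_seconds: float) -> float:
    """Fractional slowdown of the API path relative to the direct path."""
    if direct_seconds <= 0:
        raise ValueError("direct_seconds must be positive")
    return (api_seconds - direct_seconds) / direct_seconds
```

With these helpers, a result of `relative_overhead(direct, api) <= 0.05` would correspond to the roughly 5% target floated in the issue description.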

After discussion with the team, we will do a proof of concept (POC) to enable payload compression in the API Gateway and streaming responses in the Lambda function. Additionally, we plan to add error handling, optimize the Lambda function, and update the batching/pagination.
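
To show why compression helps against payload limits, here is a small sketch that gzips a JSON batch of records and restores it. In the planned POC the compression would be handled by the API Gateway rather than by application code, so this only illustrates the effect; `encode_records`, `decode_records`, and the sample record fields are hypothetical.

```python
import gzip
import json
from typing import Any, Dict, List, Tuple


def encode_records(records: List[Dict[str, Any]]) -> Tuple[bytes, bytes]:
    """Return (raw JSON bytes, gzip-compressed bytes) for a batch of records."""
    raw = json.dumps(records, separators=(",", ":")).encode("utf-8")
    return raw, gzip.compress(raw)


def decode_records(compressed: bytes) -> List[Dict[str, Any]]:
    """Reverse encode_records: decompress and parse the JSON batch."""
    return json.loads(gzip.decompress(compressed).decode("utf-8"))


if __name__ == "__main__":
    # Made-up records with repeated keys and values, as metadata batches often have.
    sample = [
        {"_id": str(i), "name": "example_session", "status": "processed"}
        for i in range(500)
    ]
    raw, compressed = encode_records(sample)
    print(f"raw: {len(raw)} bytes, gzip: {len(compressed)} bytes")
```

The actual ratio depends on the records themselves, so it is worth measuring on real DocDB documents before relying on it.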

jtyoung84 commented 3 months ago

Looks good