argilla-io / argilla

Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
https://docs.argilla.io
Apache License 2.0

Feature request: Support AWS OpenSearch serverless as backend #2394

Closed mnschmit closed 1 month ago

mnschmit commented 1 year ago

After a discussion with @frascuchon on Slack, I'm adding my feature request here so it is easier to track:

AWS recently released a serverless option for their OpenSearch service (which I currently use for my Argilla server). Using the serverless option would be really great for us: it removes the overhead of maintaining an OpenSearch cluster, you only pay for what you use, and it scales automatically at peak times.

From what I see here, you can just use opensearch-py to interact with the serverless variant the same way you would with the server version (see def indexData at the end of the Python example). As far as I know, this is how Argilla talks to OpenSearch at the moment anyway.

My only concern is authentication. As far as I know, Argilla only supports basic authentication right now (I had to disable any other access control to make it work with the server version of AWS OpenSearch). The linked example above uses requests_aws4auth.AWS4Auth to provide AWS credentials with the appropriate rights to access/modify the data. Since this seems to be the only missing piece, and opensearch-py already appears to support it well, it would be great if Argilla supported it in the near future.

It looks like specifying AWS credentials in some environment variable for the server and passing that down to the opensearch-py library could already be enough but, of course, I don't know all the details behind the scenes. Thank you for looking into that!
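A minimal sketch of what this could look like, assuming the credentials come from the standard AWS environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_REGION). The helper names `aws_signing_settings` and `build_serverless_client` are hypothetical and not part of Argilla; how Argilla would actually wire this into its server configuration is an open question.

```python
import os


def aws_signing_settings(env=None):
    """Collect SigV4 signing settings from the standard AWS environment variables.

    Hypothetical helper for illustration only; not an Argilla API.
    """
    env = os.environ if env is None else env
    return {
        "access_key": env["AWS_ACCESS_KEY_ID"],
        "secret_key": env["AWS_SECRET_ACCESS_KEY"],
        # Only set when temporary credentials are used (e.g. an assumed role).
        "session_token": env.get("AWS_SESSION_TOKEN"),
        "region": env["AWS_REGION"],
        # "aoss" is the signing service name for OpenSearch Serverless
        # ("es" is used for the managed, non-serverless service).
        "service": "aoss",
    }


def build_serverless_client(host, settings):
    """Build an opensearch-py client that signs requests with AWS credentials."""
    # Third-party imports are kept local so the settings helper above
    # can be used without opensearch-py / requests-aws4auth installed.
    from opensearchpy import OpenSearch, RequestsHttpConnection
    from requests_aws4auth import AWS4Auth

    auth = AWS4Auth(
        settings["access_key"],
        settings["secret_key"],
        settings["region"],
        settings["service"],
        session_token=settings["session_token"],
    )
    return OpenSearch(
        hosts=[{"host": host, "port": 443}],
        http_auth=auth,
        use_ssl=True,
        verify_certs=True,
        connection_class=RequestsHttpConnection,
    )
```

With something like this in place, the server would only need the collection endpoint (host name without the scheme) plus the environment variables above, e.g. `build_serverless_client(host, aws_signing_settings())`.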

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 90 days with no activity.

mnschmit commented 1 year ago

Hi @frascuchon - should I be worried about the stale label? Is there any update on allowing more flexible OpenSearch backend solutions? I think it was still planned for this year when we last spoke, no? Is that still the plan?

github-actions[bot] commented 10 months ago

This issue is stale because it has been open for 90 days with no activity.

github-actions[bot] commented 9 months ago

This issue was closed because it has been inactive for 30 days since being marked as stale.

github-actions[bot] commented 1 month ago

This issue was closed because it has been inactive for 30 days since being marked as stale.