opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0
9.01k stars 1.67k forks

[RFC] OpenSearch Data Format #8639

Open penghuo opened 11 months ago

penghuo commented 11 months ago

User Pain point

Proposed Solution

[image]

Technical Challenge

penghuo commented 11 months ago

Demo setup

[Screenshot, 2023-06-20 at 7:21:23 AM]

dai-chen commented 11 months ago

Here is a demo video that covers the following topics:

  1. The OpenSearch Data Format proposed in this issue, which removes the hard dependency on an OpenSearch cluster and separates the read and write paths.
  2. Virtual / External Index, which makes a data set on an object store accessible to OpenSearch. Please find more details in https://github.com/opensearch-project/sql/issues/1080
  3. Skipping Index, which avoids unnecessary shard loads and scans. Please find more details in https://github.com/opensearch-project/opensearch-spark/issues/2

https://github.com/opensearch-project/OpenSearch/assets/46505291/b2b71f27-3f55-4f31-815c-ede2df6e5aa2
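
To make the skipping index idea from item 3 concrete, here is a minimal sketch of how per-shard summaries could be used to decide which shards need to be loaded at all. It assumes a simple min/max summary on a timestamp field. The names `ShardStats`, `shards_to_scan`, and the shard ids are hypothetical and for illustration only; the actual design is described in the linked opensearch-spark issue and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardStats:
    """Hypothetical per-shard summary kept in a skipping index."""
    shard_id: str
    min_ts: int  # smallest timestamp stored in the shard
    max_ts: int  # largest timestamp stored in the shard


def shards_to_scan(index, lo, hi):
    """Return ids of shards whose [min_ts, max_ts] range overlaps [lo, hi].

    Shards whose range cannot contain a matching row are skipped,
    so they never have to be loaded or scanned.
    """
    return [s.shard_id for s in index if s.max_ts >= lo and s.min_ts <= hi]


# Assumed example data: three shards with disjoint timestamp ranges.
skipping_index = [
    ShardStats("shard-0", 0, 99),
    ShardStats("shard-1", 100, 199),
    ShardStats("shard-2", 200, 299),
]

# Query for timestamps in [120, 210]: shard-0 cannot match and is skipped.
selected = shards_to_scan(skipping_index, 120, 210)
print(selected)  # ['shard-1', 'shard-2']
```

The summary is deliberately tiny compared to the shard itself, so consulting it before loading data is cheap. Other summary types (value sets, partition keys) would follow the same check-then-load pattern under this assumption.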

schenksj commented 11 months ago

This is very cool! Has any progress been made on the Spark SQL execution data sources side?