Open reta opened 3 months ago
The schema actively relies on google.protobuf.Any to pass freestyle JSON-like structures around (for example, documents or scripts):
I've seen two other options used to pass around documents when using Protobuf in search use-cases: Document (where Document is the type with the fields), or you could have a separate k-v list for nested documents. (To be fair, I think I've only seen the separate list when nested objects were added later.)
I'm still not sure what solution I'd like to see, but wanted to document those options.
Is your feature request related to a problem? Please describe
The bulk HTTP API does not support streaming (neither HTTP/2 nor chunked transfer).
Describe the solution you'd like
Introduce a streaming flavour of the bulk Protobuf API (please see https://github.com/opensearch-project/OpenSearch/issues/9070#issuecomment-2307452157) based on the new experimental transport (https://github.com/opensearch-project/OpenSearch/issues/9067).
Describe alternatives you've considered
N/A
Additional context
Please see https://github.com/opensearch-project/OpenSearch/issues/9067
Introduce efficient (binary?) format for streaming ingestion
An alternative option (to https://github.com/opensearch-project/OpenSearch/issues/9070) is to introduce a new efficient (binary?) format for streaming ingestion (for example, based on Protocol Buffers).
The example message schema may look like this:
The schema actively relies on google.protobuf.Any to pass freestyle JSON-like structures around (for example, documents or scripts):
Risks to consider:
Related component
Indexing
Describe alternatives you've considered
Stay on HTTP APIs only (https://github.com/opensearch-project/OpenSearch/issues/9070)
Additional context
Please see https://github.com/opensearch-project/OpenSearch/issues/9067
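To illustrate what a streamable binary format could mean on the wire, here is a minimal sketch, assuming each message is framed with a varint length prefix (the same idea as Protobuf's delimited encoding). It is not the schema or transport proposed in this issue: the function names `encode_varint`, `write_frame` and `read_frames` are hypothetical, and the payloads are opaque bytes standing in for serialized bulk items.

```python
import io
from typing import BinaryIO, Iterator


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def write_frame(stream: BinaryIO, payload: bytes) -> None:
    """Write one message as <varint length><payload>."""
    stream.write(encode_varint(len(payload)))
    stream.write(payload)


def read_frames(stream: BinaryIO) -> Iterator[bytes]:
    """Yield payloads one at a time until the stream ends cleanly.

    Assumes stream.read(n) returns n bytes unless the stream has ended
    (true for io.BytesIO; a raw socket would need a read loop).
    """
    while True:
        length, shift = 0, 0
        while True:
            b = stream.read(1)
            if not b:
                if shift == 0:
                    return  # end of stream on a frame boundary
                raise EOFError("stream ended inside a length prefix")
            length |= (b[0] & 0x7F) << shift
            if not b[0] & 0x80:
                break
            shift += 7
        payload = stream.read(length)
        if len(payload) != length:
            raise EOFError("stream ended inside a frame")
        yield payload


# Hypothetical payloads; a real client would write serialized messages.
docs = [b'{"index":{"_id":"1"}}', b"x" * 300]
buf = io.BytesIO()
for doc in docs:
    write_frame(buf, doc)
buf.seek(0)
frames = list(read_frames(buf))
```

Because every frame carries its own length, a receiver can process items as they arrive instead of waiting for the whole request body, which is the property the current bulk HTTP API lacks.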