Description
Based on discussion, it's strongly recommended to move the sub-batching logic from the `_bulk` API into each processor. So for the two batch-supporting processors, `text_embedding` and `sparse_encoding`, we make them inherit from a newly introduced `AbstractBatchingProcessor`, so that these two processors support a new optional parameter, `batch_size`, which controls how documents are cut into sub-batches. The default value of this parameter is 1, to be consistent with existing behavior. Also adds more integration tests.
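The sub-batch cutting described above can be sketched as follows. This is a minimal, hypothetical illustration of the idea, not the actual OpenSearch implementation; the method and class names here are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of sub-batch cutting controlled by a batch_size
// parameter. Names are hypothetical, not the real processor code.
public final class SubBatchExample {

    // Split the incoming documents into sub-batches of at most batchSize items.
    static <T> List<List<T>> cutBatches(List<T> docs, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batch_size must be a positive integer");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            batches.add(docs.subList(i, Math.min(i + batchSize, docs.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("d1", "d2", "d3", "d4", "d5");
        // With the default batch_size of 1, every document is its own
        // sub-batch, matching the existing one-at-a-time behavior.
        System.out.println(cutBatches(docs, 1).size()); // 5
        // With batch_size = 2, documents are grouped into sub-batches of up to 2.
        System.out.println(cutBatches(docs, 2)); // [[d1, d2], [d3, d4], [d5]]
    }
}
```

With `batch_size` left at its default of 1, the loop degenerates to one document per sub-batch, which is why the default preserves the pre-existing behavior.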
Issues Resolved
https://github.com/opensearch-project/OpenSearch/issues/14283
Check List
[x] Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following the Developer Certificate of Origin and signing off your commits, please check here.