Closed yuye-aws closed 5 months ago
Hi maintainers. This PR fixes a bug in the text chunking processor. Please attach the backport 2.x and backport 2.13 labels to this PR.
This PR is still a work in progress. Before getting merged, this PR must satisfy the following conditions:
@model-collapse @zane-neo This PR is ready for review now. Please merge this PR once all CI workflows pass.
Shall we add an IT to cover this "configured shard number is less than the number of nodes" scenario? Can be done in a separate issue and PR.
In the current CI, all ITs run on a single node. I think we can enhance the CI framework by adding a build with -PnumNodes=3. This can help us catch bugs in distributed scenarios at an early stage.
Good point. We can follow the same process as ml-commons.
@zane-neo The current Gradle checks are failing due to a model deployment issue. Could this be caused by the latest update in ml-commons, such as the async HTTP client?
We can't merge the PR until the BWC tests pass.
@model-collapse GH workflows are failing. Let's ensure GH Actions are successful before approving PRs.
Even gradle checks are failing @yuye-aws
"Model not deployed yet" error coming from ml-commons https://github.com/opensearch-project/ml-commons/issues/2382
This is another issue related to the ml-commons main branch; we'll track it in a new issue. We'll merge this one for now, as it fixes a critical issue that could impact customers.
The backport to 2.13 failed:
The process '/usr/bin/git' failed with exit code 1
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.13 2.13
# Navigate to the new working tree
cd .worktrees/backport-2.13
# Create a new branch
git switch --create backport/backport-713-to-2.13
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 2d42408c70e01b95825744bea0182ff361090a4e
# Push it to GitHub
git push --set-upstream origin backport/backport-713-to-2.13
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.13
Then, create a pull request where the base branch is 2.13 and the compare/head branch is backport/backport-713-to-2.13.
Description
For a multi-node cluster, the text chunking processor produces a "no such index" error if the configured shard number is less than the number of nodes. This is because some nodes do not contain the shard information. When we get the max token count setting on such a node, indicesService fails to find the index information.
Issues Resolved
Fix ingestion bug on multi-node cluster
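The failure mode in the description can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code in this PR: the class and method names are hypothetical stand-ins rather than OpenSearch APIs, the setting key and default value are assumed, and the cluster-metadata lookup is shown as one possible way to avoid the error, not necessarily the approach taken here.

```java
import java.util.Map;
import java.util.Set;

// Toy model only: hypothetical stand-ins, not OpenSearch classes.
final class MaxTokenCountLookupSketch {
    // Assumed setting key and default value, used for illustration.
    static final String MAX_TOKEN_COUNT_KEY = "index.analyze.max_token_count";
    static final int DEFAULT_MAX_TOKEN_COUNT = 10_000;

    // Node-local lookup: succeeds only if this node hosts a shard of the index.
    static int lookupFromLocalShards(Set<String> indicesHostedOnNode,
                                     Map<String, Map<String, Integer>> clusterMetadata,
                                     String indexName) {
        if (!indicesHostedOnNode.contains(indexName)) {
            // Mirrors the "no such index" error seen on nodes without a shard.
            throw new IllegalStateException("no such index [" + indexName + "]");
        }
        return lookupFromClusterMetadata(clusterMetadata, indexName);
    }

    // Cluster-wide lookup: index metadata is modeled as visible on every node.
    static int lookupFromClusterMetadata(Map<String, Map<String, Integer>> clusterMetadata,
                                         String indexName) {
        Map<String, Integer> settings = clusterMetadata.get(indexName);
        if (settings == null) {
            return DEFAULT_MAX_TOKEN_COUNT;
        }
        return settings.getOrDefault(MAX_TOKEN_COUNT_KEY, DEFAULT_MAX_TOKEN_COUNT);
    }

    public static void main(String[] args) {
        // One index with a single shard, in a cluster of two nodes.
        Map<String, Map<String, Integer>> metadata =
            Map.of("docs", Map.of(MAX_TOKEN_COUNT_KEY, 5_000));
        Set<String> nodeWithShard = Set.of("docs");
        Set<String> nodeWithoutShard = Set.of();

        System.out.println(lookupFromLocalShards(nodeWithShard, metadata, "docs"));
        try {
            lookupFromLocalShards(nodeWithoutShard, metadata, "docs");
        } catch (IllegalStateException e) {
            System.out.println("node without shard: " + e.getMessage());
        }
        // The metadata-based lookup does not depend on which node runs it.
        System.out.println(lookupFromClusterMetadata(metadata, "docs"));
    }
}
```

In this model, ingestion handled by the node without a shard fails only when the lookup depends on node-local shard state, which matches the "configured shard number is less than the number of nodes" condition.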
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.