Open ylwu-amzn opened 1 year ago
The bug was in ml-commons. Ideally, test should be in ml-common imo. We don't expect every ml-common dependency plugin to add reindex integ test to catch a bug in ml-common?
It will be challenging for ml-commons , as run reindex needs Knn and neural-search plugins
The bug happens when running reindex with neural search ingest pipeline.
Understand the difficulty to have reindex test in ml-commons but in a long term, we should consider implementing some very basic ingest pipeline inside ml-commons for integ test purpose.
@heemin32 I would beg to differ little bit here. As the ingest processor is defined in neural search plugin, ideally it is the job of the neural search plugin to validate all the cases in which that processor can be run. Now running the processor on re-index is another way to calling that processor. Hence the integration tests should be in neural search for re-index, because ML Commons cannot cover all the cases in which how the processor can be invoked.
Having said all the things above, one thing we definitely need from ML Commons Plugin or MLCommons client side, in terms of security how the apis can be used. This will ensure that its not the clients who are letting MLCommons know that something is broken.
Having said all the things above, one thing we definitely need from ML Commons Plugin or MLCommons client side, in terms of security how the apis can be used. This will ensure that its not the clients who are letting MLCommons know that something is broken.
My point align with this comment. MLCommons should be able to find if something is broken without relying on integ test in other plugins. Neural search should have re-index integ test to increase test coverage but MLCommons also should have a test case to cover the bug regarding this PR https://github.com/opensearch-project/ml-commons/pull/1418.
@heemin32
implementing some very basic ingest pipeline inside ml-commons for integ test purpose.
My comment was mainly around this.
@heemin32
implementing some very basic ingest pipeline inside ml-commons for integ test purpose.
My comment was mainly around this.
Just one of option to catch the bug inside ml-common itself.
We found some bug when reindex a non-KNN index to KNN index with neural-search ingest pipeline. We fixed this issue https://github.com/opensearch-project/ml-commons/pull/1418.
It will be best if neural-search can have some integration test to cover reindex. So we can catch such issue earlier.