[FEATURE] Introduce the smoke test for OpenSearch distribution + plugins

reta commented 8 months ago

Is your feature request related to a problem? Please describe

The OpenSearch distribution build has extensive support for running integration tests (integtest.sh) per individual component (mostly plugins at the moment), but not for a distribution with all plugins installed.

Describe the solution you'd like

It would be great to run a limited number of smoke tests against a distribution with all plugins installed. The issue basically came out of https://github.com/opensearch-project/security/pull/4003, and in a nutshell is:

- some plugins (like security, performance-analyzer) rely on OpenSearch internals
- some plugins (like telemetry) introduce additional instrumentation / wrapping around these internals
- changing or altering these internals could lead (and has led in the past) to runtime failures

Having a full distribution and running a few tests to make sure key functional parts work would be sufficient to catch such issues early on.

Describe alternatives you've considered

Add integration tests to each plugin repo separately

Additional context

reta commented 8 months ago

@peternied @Gaganjuneja fyi, @peterzhuamazon could you please add a few hints where to look at for implementing such a feature, thank you.

peterzhuamazon commented 7 months ago

Had a discussion with @reta offline and introduced him to the current status of test_workflow in the build repo.

Sent this link: https://github.com/opensearch-project/opensearch-build/tree/main/src/test_workflow

Thanks.

gaiksaya commented 7 months ago

Here is the wiki with some details on the testing: https://github.com/opensearch-project/opensearch-build/wiki/Testing-the-Distribution

@reta Wondering what we are trying to achieve here? Please correct me if I am wrong, do you want to run integration test at the distribution level even in the plugin CI?

reta commented 7 months ago

@reta Wondering what we are trying to achieve here? Please correct me if I am wrong, do you want to run integration test at the distribution level even in the plugin CI?

Thanks @gaiksaya ! No, I was thinking of running integration tests at the distribution level with all (or at least most) of the plugins installed. The reasoning here is that we now have a few plugins, like security and telemetry, that affect OpenSearch core and every other plugin out there (directly or indirectly). Running a few smoke tests against the full distribution would help us catch crosscutting issues (which we ran into in the past). Hope it makes sense!

peterzhuamazon commented 7 months ago

@reta Wondering what we are trying to achieve here? Please correct me if I am wrong, do you want to run integration test at the distribution level even in the plugin CI?

Thanks @gaiksaya ! No, I was thinking of running integration tests at the distribution level with all (or at least most) of the plugins installed. The reasoning here is that we now have a few plugins, like security and telemetry, that affect OpenSearch core and every other plugin out there (directly or indirectly). Running a few smoke tests against the full distribution would help us catch crosscutting issues (which we ran into in the past). Hope it makes sense!

So that is pretty much what we are doing right now, just narrowed down to specific plugins or test cases, right? We are already running each plugin integTest while deploying full bundled distribution.

reta commented 7 months ago

We are already running each plugin integTest while deploying full bundled distribution.

Correct, but the integTest in this case won't be focused on any specific plugin but on the distribution. I was thinking about just 1-2 core flows (like index + search, for example) to make sure the distribution + plugins are working fine.

gaiksaya commented 7 months ago

The TL;DR of the integration test workflow is:

Install the recently created distribution (min + plugins) into the VM. Clone the plugin repository, checkout a specific commit/branch used to build the above distribution and then run this https://github.com/opensearch-project/opensearch-build/blob/main/scripts/default/integtest.sh#L105

I believe you want the same for OpenSearch too? (min+plugins) and then run some gradle command for integTest of OpenSearch ?

Also heads-up we do not have any integration test for the distribution ONLY as such at this point. We do have validation that makes sure basic APIs like health and plugin list runs, but as such no integration tests for "OpenSearch as a Distribution"

reta commented 7 months ago

I believe you want the same for OpenSearch too? (min+plugins) and then run some gradle command for integTest of OpenSearch ?

I think yes, I will look shortly but it sounds about right: install min, install plugins, run 1-2 tests

Also heads-up we do not have any integration test for the distribution ONLY as such at this point. We do have validation that makes sure basic APIs like health and plugin list runs, but as such no integration tests for "OpenSearch as a Distribution"

Correct, this is why this issue was created

gaiksaya commented 7 months ago

Got it now! Thank you for being patient and helping us understand. In that case,

- Would these tests be added in OpenSearch repo or build repo? - Requires considerable effort
- I believe we had received feedback a while back to run gradle check at the distribution level as well (min + plugins instead of just min that runs at PR levels) - just the onboarding so minimum effort required.

reta commented 7 months ago

Would these tests be added in OpenSearch repo or build repo? - Requires considerable effort

I was thinking about the build repo. Maybe we could simplify it by using curl only (no need for Gradle etc.). Would the effort be justifiable?

I believe we had received a feedback a while back to run gradle check at the distribution level as well (min + plugins instead of just min that runs at PR levels) - just the onboarding so minimum effort required.

I think if we want to keep tests in OpenSearch repo (not build repo), we could repurpose some existing tests and run them as gradle check at the distribution level

gaiksaya commented 7 months ago

I was thinking about the build repo. Maybe we could simplify it by using curl only (no need for Gradle etc.). Would the effort be justifiable?

I believe so. We can either expand the validation workflow API test cases here or add new framework for integration testing at the distribution level separately.

I think if we want to keep tests in OpenSearch repo (not build repo), we could repurpose some existing tests and run them as gradle check at the distribution level

Right! I believe this is another problem we are trying to solve. Onboarding OpenSearch to test workflow can be another problem statement. I assumed if this repurposing was to add the integration tests at distribution level then it would be beneficial to get everything in one go.

Adding @prudhvigodithi @bbarani to this conversation to provide some insights and let us know if any parallel efforts are going on.

reta commented 7 months ago

I believe so. We can either expand the validation workflow API test cases here or add new framework for integration testing at the distribution level separately.

👍 will look into it

Onboarding OpenSearch to test workflow can be another problem statement. I assumed if this repurposing was to add the integration tests at distribution level then it would be beneficial to get everything in one go.

👍 will look at what we have now (we have a lot of test types and phases)

prudhvigodithi commented 7 months ago

Reading the above comments, here is what we can do:

- Extend the validation framework (as mentioned by @gaiksaya) to run one query/API/test per plugin (which can be extended as required in future) on the distribution.
- This can be something like get/update the security settings, run the security admin script and test the security index.
- Test the plugin behaviors on the distribution.
- Index the data and run some queries (one query that can test each plugin functionality), test basic aggregations if required.
- Have a common framework for other new plugins to onboard to these global distribution level tests.
- This should be common for OpenSearch and OpenSearch Dashboards.
- Dashboard API operations to create and test visualizations.

If we don't want this in build repo we can even have a new repo for this distribution level testing, something like https://github.com/opensearch-project/opensearch-dashboards-functional-test which should be common for both OpenSearch and OpenSearch Dashboards.

@reta is this something you are looking for ? :)

reta commented 7 months ago

@reta is this something you are looking for ? :)

Thanks @prudhvigodithi , I think your comment incorporates possible directions for improving / evolving the validation part of the build. I currently have no knowledge of it (beyond the hints on where to look, thanks to @gaiksaya ). At a high level, I am looking for a very specific phase at the validation step when we could:

1. install min distribution
2. install all bundled plugins
3. run a few smoke tests

The last step is not specific to any plugin but a distribution as a cohesive bundle.
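To make the last step concrete, here is a minimal sketch of what one such smoke test could look like: index a single document, then search for it. It is an illustration only, not what the build repo does. The function names, the `smoke-test` index name, and the injectable `send` callable are all assumptions made for this example, and the default `http_json` helper assumes a cluster reachable without TLS or authentication (a with-security cluster would need a `send` that handles both).

```python
import json
import urllib.request


def http_json(method, url, body=None):
    """Send a JSON request with the standard library and decode the JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url, data=data, method=method, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def smoke_index_and_search(base_url, send=http_json, index="smoke-test"):
    """Index one document, then check that a match query finds it.

    `send` is an assumed hook so the HTTP layer can be swapped out.
    """
    doc = {"title": "smoke test document"}
    # refresh=true makes the document searchable before the call returns.
    created = send("PUT", f"{base_url}/{index}/_doc/1?refresh=true", doc)
    if created.get("result") not in ("created", "updated"):
        return False
    query = {"query": {"match": {"title": "smoke"}}}
    found = send("POST", f"{base_url}/{index}/_search", query)
    hits = found.get("hits", {}).get("hits", [])
    return any(hit.get("_id") == "1" for hit in hits)
```

Against a local cluster this would be called as `smoke_index_and_search("http://localhost:9200")`. Nothing in the flow touches a plugin API directly, which is the point: it only exercises the core path that plugins wrap or instrument.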

gaiksaya commented 7 months ago

We already do 1-2, so the only part we need to add is running a few smoke tests. Also @prudhvigodithi, IMO keeping it simple (only related to distribution) might be helpful. If plugin-specific tests come in, we would be looking at another functional-test repo and mismatched tests. As @reta said, those tests should be related to the distribution as a cohesive bundle. Core maintainers, community as well as contributors can contribute or let us know what would come under this test umbrella.

seraphjiang commented 7 months ago

If we don't want this in build repo we can even have a new repo for this distribution level testing, something like https://github.com/opensearch-project/opensearch-dashboards-functional-test which should be common for both OpenSearch and OpenSearch Dashboards.

The functional test repo is designed for smoke test purposes (we named it release test) for Dashboards and Dashboards plugins :)

zelinh commented 7 months ago

[Grooming] Acceptance Criteria:

[ ] Create a one pager differentiating smoke tests and validation.
[ ] Find out the location of these smoke tests. (What specific repos)
[ ] Tests should be generated for OS. OSD is pending decision from OSD core team.

dblock commented 2 months ago

You can use https://github.com/opensearch-project/opensearch-api-specification!

zelinh commented 1 month ago

We plan to create a new test workflow framework for smoke tests.

The main uncertainty we have now is how we define smoke tests.

Here are some of the scope and requirements we were thinking of:

Target Distribution:
- Smoke tests will first apply to the OpenSearch distribution bundle. OpenSearch Dashboards can be onboarded later with the same framework.
Test Environment:
- Smoke tests will require the OpenSearch distribution bundle to be installed with all plugins and the cluster to be spun up.
- The tests will assume a with-security configuration for all cluster operations if security plugin is built within the bundle.
Test Coverage:
- Basic smoke tests will cover essential functionalities such as index operations and API requests.
- Tests will be run using simple curl commands to make API requests and validate responses.
Cluster Configuration:
- The cluster setup will use default configurations, and no cleanup will be performed after tests.
Concurrency:
- All smoke tests should be able to run concurrently, with no dependency on a specific sequence of execution.
Limitations:
- Tests that require extensive data validation or cleanup are outside the scope of smoke testing and should be included in integration tests.

In this way, we would be focusing on testing the basic functionalities including API status verification for OpenSearch core and plugins with a distribution bundle spun up.
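As a rough illustration of the concurrency requirement, the sketch below runs a set of independent checks in parallel and collects a pass/fail result per check. It assumes each check is a zero-argument callable that returns a truthy value on success; the `run_concurrently` name and that shape are made up for this example and do not come from the build repo.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(checks, max_workers=4):
    """Run independent checks in parallel and return {name: passed}.

    `checks` maps a check name to a zero-argument callable (an assumed shape).
    A check that raises is recorded as failed instead of aborting the run.
    """
    def run_one(item):
        name, check = item
        try:
            return name, bool(check())
        except Exception:
            return name, False

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_one, checks.items()))
```

Because no cleanup is performed and no ordering is guaranteed, each check would need to own its data, for example by writing to an index name that no other check uses.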

Are there any specific smoke tests (any API requests) from the Core perspective that you think we could start with? @reta

reta commented 1 month ago

Thanks @zelinh , it makes perfect sense

Are there any specific smoke tests (any API requests) from the Core perspective that you think we could start with? @reta

I think as per comment:

basic smoke tests will cover essential functionalities such as index operations and API requests.

That would be a great start. I sadly don't know if we already have such tests (as part of plugin suites) in some form or another, but that was the idea: install all plugins, ingest a few documents, run a search. I think we don't need many tests. Does that answer your question?

peterzhuamazon commented 1 month ago

Thanks @zelinh , it makes perfect sense

Are there any specific smoke tests (any API requests) from the Core perspective that you think we could start with? @reta

I think as per comment:

basic smoke tests will cover essential functionalities such as index operations and API requests.

That would be a great start. I sadly don't know if we already have such tests (as part of plugin suites) in some form or another, but that was the idea: install all plugins, ingest a few documents, run a search. I think we don't need many tests. Does that answer your question?

In our validation workflow the most we did was to call 9200/5601 port, list plugin, check cluster status and that is it. I assume we need to call more APIs as a baseline to see if the cluster is good.
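For reference, the kind of check described here can be sketched as two small functions over already-parsed responses from `GET /_cluster/health` and `GET /_cat/plugins?format=json`. This is not the validation workflow's actual code. The function names are made up for the example, and treating `yellow` as healthy is an assumption that suits a single-node test cluster.

```python
def cluster_is_healthy(health):
    """Accept green or yellow status from a parsed _cluster/health response."""
    return health.get("status") in ("green", "yellow")


def missing_plugins(cat_plugins, expected):
    """Return the expected plugin components absent from a parsed _cat/plugins listing.

    Each row of the JSON listing carries the plugin name in its "component" field.
    """
    installed = {row.get("component") for row in cat_plugins}
    return sorted(set(expected) - installed)
```

For example, `missing_plugins(rows, ["opensearch-security"])` returns an empty list when the security plugin appears in the listing.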

reta commented 1 month ago

In our validation workflow the most we did was to call 9200/5601 port, list plugin, check cluster status and that is it. I assume we need to call more APIs as a baseline to see if the cluster is good.

Correct, ingest + search for ingested docs

zelinh commented 1 month ago

In our validation workflow the most we did was to call 9200/5601 port, list plugin, check cluster status and that is it. I assume we need to call more APIs as a baseline to see if the cluster is good.

Correct, ingest + search for ingested docs

Makes sense to me. We could start with ingesting a few docs and running some basic searches on indices. The framework we design could also be elaborated over time.

zelinh commented 1 month ago

The proposed smoke test workflow provides a foundational framework to run quick checks on the OpenSearch bundle. These tests run separately from the more complex integration tests but follow a similar deployment process.

Introduce a smoke test framework to the CI/CD process. The workflow will pass parameters like the distribution path, architecture, and components.

Tasks:

- https://github.com/opensearch-project/opensearch-build/issues/5126: A new "smoke-test" section will be added to the manifest file to specify which components are onboarded for smoke testing.
- Onboard component test spec YAML and spec framework: Each plugin will have a YAML file defining the smoke tests and associated API requests based on the OpenSearch API specification. The test specs will account for version differences in APIs.
- https://github.com/opensearch-project/opensearch-build/issues/5164: The newly introduced smoke test runner automates OpenSearch cluster deployment, loads test manifests, sends API requests, and validates responses against the OpenSearch API specification with the third-party OpenAPI-core Python client. It ensures all key components function immediately after a build.
- Integrate the smoke test workflow with the CI/CD pipeline: A new Jenkins pipeline will be created to integrate smoke tests with the existing build workflow, so that major issues are caught at an earlier stage.
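To picture the spec-driven part of this plan, here is a deliberately simplified sketch. The real specs are YAML files and the real runner validates responses against the OpenSearch API specification. Neither format is shown in this thread, so the example stands in a plain Python dict for the YAML and an expected-status comparison for the schema validation. Every field name (`tests`, `expect_status`, and so on) and the `send` callable are hypothetical.

```python
# Hypothetical spec for one component; the real YAML layout is not shown in this thread.
EXAMPLE_SPEC = {
    "component": "opensearch",
    "tests": [
        {"name": "cluster health", "method": "GET", "path": "/_cluster/health"},
        {
            "name": "index a doc",
            "method": "PUT",
            "path": "/smoke-test/_doc/1",
            "body": {"title": "smoke"},
            "expect_status": 201,
        },
    ],
}


def run_spec(spec, send):
    """Run every case in a spec and return the names of the cases that failed.

    `send(method, path, body)` is an assumed callable returning the HTTP status code.
    A case without "expect_status" is expected to return 200.
    """
    failures = []
    for case in spec["tests"]:
        status = send(case["method"], case["path"], case.get("body"))
        if status != case.get("expect_status", 200):
            failures.append(case["name"])
    return failures
```

An empty list from `run_spec` means every case returned its expected status. A real runner would replace the status comparison with validation of the full response against the API specification.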

reta commented 3 weeks ago

Thanks for the update @zelinh

  Each plugin will have a YAML file defining the smoke tests and associated API requests based on the OpenSearch API specification. The test specs will account for version differences in APIs.

I think in general it makes sense but we very likely don't need that at first: the initial smoke test to run, as we discussed, has a goal of making sure basic ingestion + search work with all plugins installed. This flow (again, in basic form) does not directly depend on any plugin.
