stephen-crawford opened 1 year ago
I am going to go ahead and get this ball rolling.
Some things I would be interested in seeing are:
New tests auto-run: Keep the existing gradle check, but add a workflow that automatically scans for new test files (gradle + GitHub should be able to do this) and then runs those tests separately. This would let you know whether there was an issue with the test you just wrote or something related to how the tests interact with the older test suite. @cwperks wrote a file that does something along these lines over in the Identity project, and it has worked well for us. I think taking Craig's idea and going one step further could be helpful for a lot of repos.
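As a rough illustration of the new-tests idea (not a copy of Craig's file; the workflow name, test-file patterns, and Gradle layout below are all assumptions), such a workflow might diff the PR against its base branch for added test files and run only those:

```yaml
# .github/workflows/new-tests.yml (hypothetical name and file patterns)
name: Run newly added tests in isolation

on: [pull_request]

jobs:
  new-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0   # full history so we can diff against the base branch

      - name: Find test files added by this PR
        id: new-tests
        run: |
          # --diff-filter=A restricts the diff to files added in this PR
          NEW_TESTS=$(git diff --diff-filter=A --name-only \
            origin/${{ github.base_ref }}...HEAD -- '*Tests.java' '*IT.java')
          echo "files=$(echo "$NEW_TESTS" | xargs)" >> "$GITHUB_OUTPUT"

      - name: Run only the new tests
        if: steps.new-tests.outputs.files != ''
        run: |
          # Turn each added test file into a Gradle --tests class filter
          for f in ${{ steps.new-tests.outputs.files }}; do
            cls=$(basename "$f" .java)
            ./gradlew test --tests "*${cls}" || exit 1
          done
```

Keeping this separate from the main gradle check run would mean a failure here points at the newly written tests rather than at interactions with the existing suite.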
Dashboards auto-push to functional repo: I have heard from some dashboards-minded folks that the functional tests repo does not always get individual repos' tests. I am not much of a dashboards expert myself, so I do not know what is required here, but it sounds like adding a check to make sure that all tests are added to the functional testing repo would be helpful.
Documentation check: Something else that I heard was that there are some challenges keeping documentation relevant and up to date. I think it could be helpful to add an auto-check, like the one for the CHANGELOG, that checks for documentation updates or requires you to override it by stating that your change only modified code irrelevant to feature execution, i.e. you updated a dependency or added an extra test.
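To make the documentation-check idea concrete, here is a minimal sketch modeled on the changelog-verifier pattern; the workflow name, the `docs/` path, and the `skip-doc-check` label are hypothetical placeholders, not existing conventions:

```yaml
# .github/workflows/doc-check.yml (hypothetical name, path, and label)
name: Documentation update check

on:
  pull_request:
    types: [opened, synchronize, labeled, unlabeled]

jobs:
  doc-check:
    runs-on: ubuntu-latest
    # The override label lets an author state that the change does not affect
    # feature behavior (e.g. a dependency bump or an added test).
    if: ${{ !contains(github.event.pull_request.labels.*.name, 'skip-doc-check') }}
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - name: Fail if no documentation was touched
        run: |
          if git diff --name-only origin/${{ github.base_ref }}...HEAD | grep -q '^docs/'; then
            echo "Documentation updated."
          else
            echo "No documentation changes found. Update docs/ or apply the 'skip-doc-check' label."
            exit 1
          fi
```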
To help establish guidelines, I am going to start going through each repository and reviewing their testing framework and confirming their current requirements. I will then keep a running document with each of the repos' requirements listed. From there, I will synthesize a set of requirements that should be established.
Repository | Unit Tests | Integration Tests | Backwards Compatibility Tests | Additional Tests | Link |
---|---|---|---|---|---|
Alerting | | | | Security Test Workflow, Multi-node Test Workflow, Certificate of Origin | https://github.com/opensearch-project/alerting/issues/803 |
Alerting Dashboards | | | | Create Documentation Issue, Certificate of Origin | https://github.com/opensearch-project/alerting-dashboards-plugin/issues/491 |
Anomaly Detection | | | | Link checker, AD benchmark, Certificate of Origin | https://github.com/opensearch-project/anomaly-detection/issues/814 |
Anomaly Detection Dashboards | | | | Link checker, AD benchmark, PR Labeler, Certificate of Origin | https://github.com/opensearch-project/anomaly-detection-dashboards-plugin/issues/428 |
Ansible Playbook | | | | | https://github.com/opensearch-project/ansible-playbook/issues/117 |
Asynchronous Search | | | | Certificate of Origin, Multi-node test | https://github.com/opensearch-project/asynchronous-search/issues/243 |
Common Utils | | | | Certificate of Origin | https://github.com/opensearch-project/common-utils/issues/368 |
Cross Cluster Replication | | | | Certificate of Origin, Security Tests | https://github.com/opensearch-project/cross-cluster-replication/issues/721 |
Dashboards Anywhere | | | | Health Check, Deployment Tests, Functional Tests | https://github.com/opensearch-project/dashboards-anywhere/issues/153 |
~Dashboards Desktop~ | | | | | https://github.com/opensearch-project/dashboards-desktop/issues/37 |
Dashboards Maps | | | | | https://github.com/opensearch-project/dashboards-maps/issues/262 |
Dashboards Notifications | | | | Lint Checker, Documentation Auto-cut | https://github.com/opensearch-project/dashboards-notifications/issues/17 |
Dashboards Observability | | | | Link Checker, Certificate of Origin | https://github.com/opensearch-project/dashboards-observability/issues/289 |
Dashboards Query Workbench | | | | Code QL, Lint Checker, Certificate of Origin, SQL Query Workbench | https://github.com/opensearch-project/dashboards-query-workbench/issues/45 |
Dashboards Reporting | | | | Certificate of Origin, Reports Checker | https://github.com/opensearch-project/dashboards-reporting/issues/65 |
Dashboards Search Relevance | | | | Certificate of Origin, Changelog Verifier, Lint Checker | https://github.com/opensearch-project/dashboards-search-relevance/issues/152 |
Dashboards Visualizations | | | | Certificate of Origin, Link Checker | https://github.com/opensearch-project/dashboards-visualizations/issues/167 |
Data Prepper | | | | Certificate of Origin, Create Document Issue, Performance Tests Compile, App Check, Trace Analytics Tests | https://github.com/opensearch-project/data-prepper/issues/2302 |
Geospatial | | | | Certificate of Origin, Link Checker | https://github.com/opensearch-project/geospatial/issues/227 |
Helm Charts | | | | Lint Checker | https://github.com/opensearch-project/helm-charts/issues/3858 |
Index Management | | | | Certificate of Origin, Link Checker, Lint Checker, Security Test, Docker Security Test, Multi-node Test | https://github.com/opensearch-project/index-management/issues/698 |
Index Management Dashboards Plugin | | | | Link Checker, Certificate of Origin | https://github.com/opensearch-project/index-management-dashboards-plugin/issues/629 |
Job Scheduler | | | | Certificate of Origin, Link Checker | https://github.com/opensearch-project/job-scheduler/issues/327 |
k-NN | | | | Certificate of Origin, Link Checker, Documentation Request, Benchmarking Tool, Custom Benchmarking Tool | https://github.com/opensearch-project/k-NN/issues/775 |
ML-Commons | | | | Certificate of Origin, Documentation Request | https://github.com/opensearch-project/ml-commons/issues/752 |
ML-Commons Dashboards | | | | Lint Checker | https://github.com/opensearch-project/ml-commons-dashboards/issues/127 |
Neural Search | | | | Certificate of Origin, Link Checker, Benchmarking Tool (in progress) | https://github.com/opensearch-project/neural-search/issues/124 |
Notifications | | | | Link checker | https://github.com/opensearch-project/notifications/issues/622 |
Observability | | | | Link checker, Certificate of Origin, Enforce PR Labels | https://github.com/opensearch-project/observability/issues/1416 |
OpenSearch | | | | Code Hygiene, Changelog Verifier, Certificate of Origin, Label Checker, Link Checker, Create Documentation Issue, ShellCheck, Validate Gradle Wrapper | https://github.com/opensearch-project/OpenSearch/issues/6389 |
OpenSearch Benchmark | | | | | https://github.com/opensearch-project/opensearch-benchmark/issues/222 |
OpenSearch Build | | | | Certificate of Origin, Link Checker, License Header, Groovy Tests, Yaml Lint, Python Tests | https://github.com/opensearch-project/opensearch-build/issues/3241 |
OpenSearch CI | | | | | https://github.com/opensearch-project/opensearch-ci/issues/254 |
OpenSearch CLI | | | | | https://github.com/opensearch-project/opensearch-cli/issues/75 |
OpenSearch Dashboards | | | | Code Hygiene, Changelog Verifier, Certificate of Origin, Label Checker, Link Checker, Document Link Checker | https://github.com/opensearch-project/OpenSearch-Dashboards/issues/3466 |
OpenSearch Dashboards Functional Test | | | | Link Checker, Lint, Bundled Tests, tests added from other repositories | https://github.com/opensearch-project/opensearch-dashboards-functional-test/issues/546 |
~OpenSearch DSL-PY~ | | | | Changelog Verifier, Link Checker | https://github.com/opensearch-project/opensearch-dsl-py/issues/101 |
OpenSearch Go | | | | License Headers, Changelog Verifier, Link Checker, Lint Checker | https://github.com/opensearch-project/opensearch-go/issues/238 |
OpenSearch Hadoop | | | | Certificate of Origin, Link Checker, Changelog Verifier | https://github.com/opensearch-project/opensearch-hadoop/issues/123 |
OpenSearch Java | | | | Certificate of Origin, Link Checker, Lint Checker, Checkstyle | https://github.com/opensearch-project/opensearch-java/issues/378 |
OpenSearch-JS | | | | Link Checker, License, License Header, Changelog Verifier | https://github.com/opensearch-project/opensearch-js/issues/390 |
OpenSearch Net | | | | Certificate of Origin, Link Checker, Deploy Documentation, License Headers, Changelog Verifier | https://github.com/opensearch-project/opensearch-net/issues/157 |
OpenSearch PHP | | | | Link Checker, Changelog Verifier, Update Documentation | https://github.com/opensearch-project/opensearch-php/issues/127 |
OpenSearch Py | | | | Certificate of Origin, Link Checker, License Headers, Changelog, Deploy Doc | https://github.com/opensearch-project/opensearch-py/issues/304 |
OpenSearch Py-ML | | | | Certificate of Origin, Deploy Doc | https://github.com/opensearch-project/opensearch-py-ml/issues/88 |
OpenSearch-rs | | | | Changelog verifier, Link Checker, Clippy Check, Certificate of Origin | https://github.com/opensearch-project/opensearch-rs/issues/127 |
OpenSearch Ruby | | | | Link Checker, License Headers, Publish Docs | https://github.com/opensearch-project/opensearch-ruby/issues/148 |
Opensearch-SDK-Java | | | | Validate Gradle Wrapper | https://github.com/opensearch-project/opensearch-sdk-java/issues/470 |
OUI | | | | Certificate Checker, Lint checker | https://github.com/opensearch-project/oui/issues/331 |
Performance Analyzer | | | | Task List Checker, Link Checker, Certificate of Origin, Create Doc Issue | https://github.com/opensearch-project/performance-analyzer/issues/391 |
Performance Analyzer RCA | | | | Link Checker, Gauntlets Test, Certificate of Origin, Create Doc Issue | https://github.com/opensearch-project/performance-analyzer-rca/issues/298 |
Reporting | | | | Certificate of Origin, Link Checker | https://github.com/opensearch-project/reporting/issues/661 |
Search Processor | | | | Certificate of Origin, Link Checker, Create Documentation Issue | https://github.com/opensearch-project/search-processor/issues/104 |
Security | | | | Code Hygiene, Certificate of Origin, Plugin install | https://github.com/opensearch-project/security/issues/2449 |
Security Analytics | | | | Certificate of Origin, Link Checker, Security Test, Documentation Request, Benchmarking Tool | https://github.com/opensearch-project/security-analytics/issues/365 |
Security Analytics Dashboards | | | | Certificate of Origin | https://github.com/opensearch-project/security-analytics-dashboards-plugin/issues/462 |
Security Dashboards | | | | Code Hygiene, Certificate of Origin, Plugin install | https://github.com/opensearch-project/security-dashboards-plugin/issues/1337 |
Simple Schema | | | | Link checker, Certificate of Origin | https://github.com/opensearch-project/simple-schema/issues/71 |
SQL | | | | Checkstyle, Jacoco (100% coverage required), Comparison Tests, Link checker, Certificate of Origin | https://github.com/opensearch-project/sql/issues/1360 |
Follow-up from: SDK, SQL, Search Relevance, Dashboards Anywhere, OpenSearch-js, OpenSearch, OpenSearch Net, k-NN, Neural Search, OpenSearch Dashboards, Anomaly Detection, Benchmark, Build, helm-charts, opensearch-dashboards-functional-tests, OUI, ansible, Search Processor
Repositories Filed: 60
What does the outcome of this proposal look like? Is it a process, a tool, or a product feature? How would we distinguish passable testing requirements from great ones? If we had those requirements and executed on them, how would the project be better for it?
Hi @peternied, thank you for following up. The purpose of this discussion is to finalize a set of testing requirements that all development repositories will be required to maintain. This will take the form of guidelines in the Developer Guide stating that repositories must require "x, y, and z" to have their code merged. Likewise, changes to the repositories will be expected to pass those tests before they are merged. The hope is that this post will garner feedback (either here or directly) on what requirements people would like to see, and then we can go from there.
As things stand, there is a very wide range of testing requirements across the organization. If Repository A requires 85% code coverage and runs unit, integration, and backwards compatibility tests, it may expect Repository B to do the same. However, because there is no standardized bar, Repository B instead has 98% code coverage but only unit and integration tests. This non-standard quality bar creates a poor release experience and has resulted in features being released that don't meet user expectations.
You bring up a really good point about what the requirement could look like. The plan is to include it in documentation and require that the development repositories implement the requirements. That being said, a tool or GitHub workflow that checks each repository's state could be helpful in keeping everyone accountable. The idea is not to audit repositories but to help ensure we are all on the same page. I spoke with quite a few people about what they would like to see change about releases, and the quality of changes was a recurring topic.
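To sketch what such an accountability workflow could look like (the repository list, schedule, and the `changelog_verifier.yml` filename are placeholders for whatever the final requirements end up being), a scheduled job could use the GitHub CLI to check each repo for the agreed-upon workflow files:

```yaml
# Hypothetical audit workflow; the repo list and required workflow name are placeholders.
name: Quality bar audit

on:
  schedule:
    - cron: '0 6 * * 1'   # weekly, Monday 06:00 UTC

jobs:
  audit:
    runs-on: ubuntu-latest
    env:
      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    steps:
      - name: Check which repos define the required workflow
        run: |
          for repo in opensearch-project/alerting opensearch-project/security; do
            # List the workflow files committed on the repo's default branch
            if gh api "repos/${repo}/contents/.github/workflows" --jq '.[].name' \
               | grep -q '^changelog_verifier.yml$'; then
              echo "OK:      ${repo}"
            else
              echo "MISSING: ${repo} has no changelog_verifier.yml"
            fi
          done
```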
Thanks @scrawfor99 for driving this. To summarize, but also to make sure I understand:
- The problem is an uneven quality bar across the repositories in our organization.
- The goal is to come up with a set of global, cross-org recommendations for what that bar looks like.
- The process is to first analyze the current state of the world, use that to inform the discussion of what the bar should be, and then write it up into a global `CONTRIBUTING.md` doc that projects can reference, potentially adding their own tweaks on top. After that bar is set, we can also think about providing automations/QoL improvements.
Does this capture what you're thinking too?
Hi @davidlago, you are exactly correct. I am a bit busy this week but am hoping to file the same type of documenting issue for all repositories and then put together a small, easy-to-digest analysis. After seeing where things stand, I will take further steps along the lines of my response to Peter and what you mentioned in your comment.
Hi @scrawfor99, I see BWC confirmation being added to dashboards plugin repo issues. Do we have BWC for dashboards plugins? I checked https://github.com/opensearch-project/opensearch-plugins/blob/main/TESTING.md#backwards-compatibility-testing but it seems to cover only OpenSearch plugins.
Hi @joshuali925, thank you for following up on this discussion. You are correct: to my knowledge there are no BWC tests for dashboards plugins. The BWC section of the table can effectively be ignored for dashboards plugins, since it seems unlikely that that would ever be an expectation; it is not clear what backwards compatibility would entail for the frontend-focused repos. The column is in the table for consistency and readability.
This table provides the presence metrics for each of the tests documented in this process.
Test | Count (max is 59 overall/32 for backend only) | Percentage | Not Present In |
---|---|---|---|
Unit | 57 | 97% | Hadoop?, Ansible Playbook |
Integration | 56 | 95% | OUI, Hadoop?, Ansible Playbook |
BWC | 22 | 69% | Simple Schema, Performance Analyzer RCA, OpenSearch-rs, OpenSearch Py-ML, OpenSearch-CLI, Hadoop, Helm Charts, Geospatial, Data Prepper, Common Utils, Ansible |
Certificate of Origin | 41 | 68% | Opensearch-SDK-Java, OpenSearch Ruby, OpenSearch PHP, OpenSearch JS, OpenSearch GO, OpenSearch Dashboards Functional Test, OpenSearch CLI, OpenSearch CI, OpenSearch Benchmark, Notifications, ML-Commons Dashboards, Helm Charts, Dashboards Notifications, Dashboards Maps, Dashboards Anywhere, Ansible |
Checkstyle, Lint Check, or Code Hygiene | 16 | 27% | Too many to list |
Link Checker | 33 | 56% | Too many to list |
Create Documentation Issue | 12 | 20% | Too many to list |
Deploy Documentation | 3 | 1% | Too many to list |
TL;DR: What do you want the organization to use as testing requirements? What certifies something is "done?"
What are you proposing?
A major topic discussed at the milestone meetings hosted by @CEHENKLE has been establishing a universal quality bar. An actionable mechanism for this is using GitHub Actions and checks to assert that changes meet expectations.
This issue should be used as a discussion board for what these expectations are. Everyone across the @opensearch-project is encouraged to contribute, as this is a change that will affect us all. The final result of this discussion will be a "universal" set of expectations which all repositories will employ. This means that maintainers will have an established quality bar for merging changes, and that contributors can know all requirements before they start working.
What users have asked for this feature?
The milestone group worked with numerous repositories to learn their pain points--the things that were preventing them from releasing on time and making a great product. One of the consistent patterns was a lack of established expectations. One group wanted changes to have "a, b, and c," while another wanted "x, y, and z." This proposal seeks to establish a set of accepted expectations across all repositories.
What is the contributor experience going to be?
Under the new system, a contributor will be able to reference the `CONTRIBUTING.md` document of any @opensearch-project repository and see that repository's expectations. Most of these expectations will be standardized across the organization, but there may be additional requirements depending on each repository's use case. For example, the Security repository may require that the plugin install workflow passes, but this would not be required of the whole organization. This discussion establishes a base quality bar across the organization. If any repository wants to add further requirements specific to it, that is encouraged; it will need to list the additional requirements alongside the standardized ones in its `CONTRIBUTING.md`.
How does this impact flaky tests?
Currently, we are trying to reduce the number of flaky tests in the project. Unfortunately, this is easier said than done and the process is ongoing. Part of this discussion should be establishing expectations for handling existing flaky tests, and it is likely that the new quality bar will require that no new flaky tests are introduced. One way to do this is to run all new tests in a separate workflow that is isolated from the older tests; this prevents any currently flaky tests from impacting newly added changes.
What will it take to implement?
In order for repositories to adopt the new quality bar, they should update their `CONTRIBUTING.md` document and their testing frameworks. Since the expectations are being standardized, it should be relatively easy to port them between repositories; if tests are written as GitHub Actions, they can likely be copied directly with minimal changes. Examples of some of the types of tests that may be helpful can be found in the identity project, which has workflows for testing a wide range of functions on commit. There may be many more tests that should be mentioned, however, and expectations can also include things like updated developer guides.
Any remaining open questions?
We are still trying to determine what these expectations should be, so it is important that all opinions are heard. If you have any comments, concerns, or suggestions, please share them below.