opensearch-project / .github

Provides templates and resources for other OpenSearch project repositories.
Apache License 2.0

[PROPOSAL & DISCUSSION] Organization-wide Testing Requirements #135

Open stephen-crawford opened 1 year ago

stephen-crawford commented 1 year ago

TL;DR: What do you want the organization to use as testing requirements? What certifies that something is "done"?

What are you proposing?

A major topic discussed at the milestone meetings hosted by @CEHENKLE has been establishing a universal quality bar. An actionable mechanism for this is using GitHub Actions and checks to assert that changes meet expectations.

This issue should be used as a discussion board for what these expectations are. Everyone across @opensearch-project is encouraged to contribute, as this is a change that will affect us all. The final result of this discussion will be a "universal" set of expectations which all repositories will employ. This means that maintainers will have an established quality bar for merging changes, and that contributors can know all requirements before they start working.

What users have asked for this feature?

The milestone group worked with numerous repositories to learn their pain points: the things that were preventing them from releasing on time and making a great product. One of the consistent patterns was a lack of established expectations. One group wanted changes to have "a, b, and c," while another wanted "x, y, and z." This proposal seeks to establish a set of accepted expectations across all repositories.

What is the contributor experience going to be?

Under the new system, a contributor will be able to reference the CONTRIBUTING.md document of any @opensearch-project repository and see the repository's expectations. Most of these expectations will be standardized across the organization, but there may be additional requirements depending on each repository's use case. For example, the Security repository may require that the plugin install workflow passes, but this would not be required for the whole organization.

This discussion establishes a base quality bar across the organization. If any repository wants to add further requirements specific to it, that is encouraged. It will need to list the additional requirements alongside the standardized ones in its CONTRIBUTING.md.

How does this impact flaky tests?

Currently, we are trying to reduce the number of flaky tests in the project. Unfortunately, this is easier said than done, and the process is ongoing. Part of this discussion should be establishing expectations for handling existing flaky tests, and it is likely that the new quality bar will expect that no new flaky tests are introduced. One method of doing this is running all new tests in a separate workflow that is isolated from the older tests. This prevents any currently flaky tests from impacting newly added changes.
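
One way to isolate newly added tests is to diff the PR branch against its base and collect only the test files the change introduces. The following is a minimal Python sketch, not an existing project tool; the `origin/main` base ref and the test-file naming pattern are assumptions that a real repo would adapt to its own conventions:

```python
import re
import subprocess

# Hypothetical naming convention for test files; adjust per repository.
TEST_FILE_PATTERN = re.compile(r"(^|/)(test_.*|.*Tests?)\.(py|java|kt|ts|js)$")

def new_test_files(diff_lines):
    """Given `git diff --name-status` output lines, return the paths of
    newly added ("A" status) files that look like tests."""
    added = []
    for line in diff_lines:
        parts = line.split("\t")
        if len(parts) == 2 and parts[0] == "A" and TEST_FILE_PATTERN.search(parts[1]):
            added.append(parts[1])
    return added

def new_tests_against(base="origin/main"):
    """Ask git which test files the current branch adds relative to `base`."""
    out = subprocess.run(
        ["git", "diff", "--name-status", base, "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return new_test_files(out.splitlines())
```

A workflow could then feed the resulting list to the test runner as a separate job, so failures in new tests are reported independently of the existing suite.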

What will it take to implement?

For repositories to adopt the new quality bar, they should update their CONTRIBUTING.md document and their testing frameworks. Since the expectations are being standardized, it should be relatively easy to port them between repositories. If tests are written as GitHub Actions, they can likely be copied directly with minimal changes.

Examples of some of the types of tests that may be helpful can be found in the identity project, which has workflows for testing a wide range of functions on commit. There may be many more tests worth mentioning, however, and expectations can also include things like updated developer guides.

Any remaining open questions?

We are still trying to determine what these expectations should be, so it is important that all opinions are heard. If you have any comments, concerns, or suggestions, please share them below.

stephen-crawford commented 1 year ago

I am going to go ahead and get this ball rolling.

Some things I would be interested in seeing are:

  1. New tests auto-run: Keep the existing gradle check, but add a workflow that automatically scans for new test files (gradle + GitHub should be able to do this) and then runs those tests separately. This would let you know whether there was an issue with the test you just wrote or something related to how the tests interact with the older test suite. @cwperks wrote a file that does something along these lines over in the Identity project, and it has worked well for us. I think taking Craig's idea and going one step further could be helpful for a lot of repos.

  2. Dashboards auto-push to functional repo: I have heard from some dashboards-minded folks that the functional tests repo does not always get individual repos' tests. I am not much of a dashboards expert myself, so I do not know what is required here, but it sounds like adding a check to make sure that all tests are added to the functional testing repo would be helpful.

  3. Documentation check: Something else I heard is that there are challenges keeping documentation relevant and up-to-date. It could be helpful to add an auto-check, like the one for the CHANGELOG, that checks for documentation updates or requires you to override it by stating that your change only modified code irrelevant to feature execution, e.g. you updated a dependency or added an extra test.
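
As a rough illustration of the documentation check in item 3, the decision logic could be as simple as the Python sketch below. The `docs/` prefix, the code-file suffixes, and the `skip-doc-update` override label are all hypothetical conventions, not anything the project has defined; a real check would read the changed files and labels from the GitHub API:

```python
# Assumed conventions (hypothetical; each repo would set its own):
DOC_PREFIXES = ("docs/",)
CODE_SUFFIXES = (".py", ".java", ".kt", ".ts", ".js")
OVERRIDE_LABEL = "skip-doc-update"

def docs_check(changed_files, labels):
    """Return (passed, reason). A PR that touches code must either
    also touch docs or carry the override label."""
    touches_code = any(f.endswith(CODE_SUFFIXES) for f in changed_files)
    touches_docs = any(f.startswith(DOC_PREFIXES) for f in changed_files)
    if not touches_code:
        return True, "no doc update needed"
    if touches_docs:
        return True, "docs updated"
    if OVERRIDE_LABEL in labels:
        return True, "override label present"
    return False, "code changed without docs; update docs or add the override label"
```

For example, a PR that changes `src/a.py` without touching `docs/` would fail unless the override label is applied.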

stephen-crawford commented 1 year ago

To help establish guidelines, I am going to start going through each repository and reviewing their testing framework and confirming their current requirements. I will then keep a running document with each of the repos' requirements listed. From there, I will synthesize a set of requirements that should be established.

stephen-crawford commented 1 year ago
| Repository | Unit Tests | Integration Tests | Backwards Compatibility Tests | Additional Tests | Link |
|---|---|---|---|---|---|
| Alerting | [x] | [x] | [x] | Security Test Workflow, Multi-node Test Workflow, Certificate of Origin | https://github.com/opensearch-project/alerting/issues/803 |
| Alerting Dashboards | [x] | [x] | [ ] | Create Documentation Issue, Certificate of Origin | https://github.com/opensearch-project/alerting-dashboards-plugin/issues/491 |
| Anomaly Detection | [x] | [x] | [x] | Link checker, AD benchmark, Certificate of Origin | https://github.com/opensearch-project/anomaly-detection/issues/814 |
| Anomaly Detection Dashboards | [x] | [x] | [ ] | Link checker, AD benchmark, PR Labeler, Certificate of Origin | https://github.com/opensearch-project/anomaly-detection-dashboards-plugin/issues/428 |
| Ansible Playbook | [ ] | [ ] | [ ] | | https://github.com/opensearch-project/ansible-playbook/issues/117 |
| Asynchronous Search | [x] | [x] | [x] | Certificate of Origin, Multi-node test | https://github.com/opensearch-project/asynchronous-search/issues/243 |
| Common Utils | [x] | [x] | [ ] | Certificate of Origin | https://github.com/opensearch-project/common-utils/issues/368 |
| Cross Cluster Replication | [x] | [x] | [x] | Certificate of Origin, Security Tests | https://github.com/opensearch-project/cross-cluster-replication/issues/721 |
| Dashboards Anywhere | [x] | [x] | [ ] | Health Check, Deployment Tests, Functional Tests | https://github.com/opensearch-project/dashboards-anywhere/issues/153 |
| ~~Dashboards Desktop~~ | [ ] | [ ] | [ ] | | https://github.com/opensearch-project/dashboards-desktop/issues/37 |
| Dashboards Maps | [x] | [x] | [ ] | | https://github.com/opensearch-project/dashboards-maps/issues/262 |
| Dashboards Notifications | [ ] | [x] | [ ] | Lint Checker, Documentation Auto-cut | https://github.com/opensearch-project/dashboards-notifications/issues/17 |
| Dashboards Observability | [x] | [x] | [ ] | Link Checker, Certificate of Origin | https://github.com/opensearch-project/dashboards-observability/issues/289 |
| Dashboards Query Workbench | [x] | [x] | [ ] | Code QL, Lint Checker, Certificate of Origin, SQL Query Workbench | https://github.com/opensearch-project/dashboards-query-workbench/issues/45 |
| Dashboards Reporting | [x] | [x] | [ ] | Certificate of Origin, Reports Checker | https://github.com/opensearch-project/dashboards-reporting/issues/65 |
| Dashboards Search Relevance | [x] | [x] | [ ] | Certificate of Origin, Changelog Verifier, Lint Checker | https://github.com/opensearch-project/dashboards-search-relevance/issues/152 |
| Dashboards Visualizations | [x] | [x] | [ ] | Certificate of Origin, Link Checker | https://github.com/opensearch-project/dashboards-visualizations/issues/167 |
| Data Prepper | [x] | [x] | [ ] | Certificate of Origin, Create Document Issue, Performance Tests Compile, App Check, Trace Analytics Tests | https://github.com/opensearch-project/data-prepper/issues/2302 |
| Geospatial | [x] | [x] | [ ] | Certificate of Origin, Link Checker | https://github.com/opensearch-project/geospatial/issues/227 |
| Helm Charts | [x] | [x] | [ ] | Lint Checker | https://github.com/opensearch-project/helm-charts/issues/3858 |
| Index Management | [x] | [x] | [x] | Certificate of Origin, Link Checker, Lint Checker, Security Test, Docker Security Test, Multi-node Test | https://github.com/opensearch-project/index-management/issues/698 |
| Index Management Dashboards Plugin | [x] | [x] | [ ] | Link Checker, Certificate of Origin | https://github.com/opensearch-project/index-management-dashboards-plugin/issues/629 |
| Job Scheduler | [x] | [x] | [x] | Certificate of Origin, Link Checker | https://github.com/opensearch-project/job-scheduler/issues/327 |
| k-NN | [x] | [x] | [x] | Certificate of Origin, Link Checker, Documentation Request, Benchmarking Tool, Custom Benchmarking Tool | https://github.com/opensearch-project/k-NN/issues/775 |
| ML-Commons | [x] | [x] | [x] | Certificate of Origin, Documentation Request | https://github.com/opensearch-project/ml-commons/issues/752 |
| ML-Commons Dashboards | [x] | [ ] | [ ] | Lint Checker | https://github.com/opensearch-project/ml-commons-dashboards/issues/127 |
| Neural Search | [x] | [x] | [ ] | Certificate of Origin, Link Checker, Benchmarking Tool (in progress) | https://github.com/opensearch-project/neural-search/issues/124 |
| Notifications | [ ] | [x] | [ ] | Link checker | https://github.com/opensearch-project/notifications/issues/622 |
| Observability | [x] | [x] | [x] | Link checker, Certificate of Origin, Enforce PR Labels | https://github.com/opensearch-project/observability/issues/1416 |
| OpenSearch | [x] | [x] | [ ] | Code Hygiene, Changelog Verifier, Certificate of Origin, Label Checker, Link Checker, Create Documentation Issue, ShellCheck, Validate Gradle Wrapper | https://github.com/opensearch-project/OpenSearch/issues/6389 |
| OpenSearch Benchmark | [x] | [x] | [ ] | | https://github.com/opensearch-project/opensearch-benchmark/issues/222 |
| OpenSearch Build | [x] | [x] | [ ] | Certificate of Origin, Link Checker, License Header, Groovy Tests, Yaml Lint, Python Tests | https://github.com/opensearch-project/opensearch-build/issues/3241 |
| OpenSearch CI | [x] | [x] | [ ] | | https://github.com/opensearch-project/opensearch-ci/issues/254 |
| OpenSearch CLI | [x] | [x] | [ ] | | https://github.com/opensearch-project/opensearch-cli/issues/75 |
| OpenSearch Dashboards | [x] | [x] | [ ] | Code Hygiene, Changelog Verifier, Certificate of Origin, Label Checker, Link Checker, Document Link Checker | https://github.com/opensearch-project/OpenSearch-Dashboards/issues/3466 |
| OpenSearch Dashboards Functional Test | [x] | [x] | [ ] | Link Checker, Lint, Bundled Tests, tests added from other repositories | https://github.com/opensearch-project/opensearch-dashboards-functional-test/issues/546 |
| ~~OpenSearch DSL-PY~~ | [x] | [x] | [ ] | Changelog Verifier, Link Checker | https://github.com/opensearch-project/opensearch-dsl-py/issues/101 |
| OpenSearch Go | [x] | [x] | [ ] | License Headers, Changelog Verifier, Link Checker, Lint Checker | https://github.com/opensearch-project/opensearch-go/issues/238 |
| OpenSearch Hadoop | [ ] | [x] | [ ] | Certificate of Origin, Link Checker, Changelog Verifier | https://github.com/opensearch-project/opensearch-hadoop/issues/123 |
| OpenSearch Java | [x] | [x] | [ ] | Certificate of Origin, Link Checker, Lint Checker, Checkstyle | https://github.com/opensearch-project/opensearch-java/issues/378 |
| OpenSearch-JS | [x] | [x] | [ ] | Link Checker, License, License Header, Changelog Verifier | https://github.com/opensearch-project/opensearch-js/issues/390 |
| OpenSearch Net | [x] | [x] | [ ] | Certificate of Origin, Link Checker, Deploy Documentation, License Headers, Changelog Verifier | https://github.com/opensearch-project/opensearch-net/issues/157 |
| OpenSearch PHP | [x] | [x] | [ ] | Link Checker, Changelog Verifier, Update Documentation | https://github.com/opensearch-project/opensearch-php/issues/127 |
| OpenSearch Py | [x] | [x] | [ ] | Certificate of Origin, Link Checker, License Headers, Changelog, Deploy Doc | https://github.com/opensearch-project/opensearch-py/issues/304 |
| OpenSearch Py-ML | [x] | [x] | [ ] | Certificate of Origin, Deploy Doc | https://github.com/opensearch-project/opensearch-py-ml/issues/88 |
| OpenSearch-rs | [x] | [x] | [ ] | Changelog verifier, Link Checker, Clippy Check, Certificate of Origin | https://github.com/opensearch-project/opensearch-rs/issues/127 |
| OpenSearch Ruby | [x] | [x] | [ ] | Link Checker, License Headers, Publish Docs | https://github.com/opensearch-project/opensearch-ruby/issues/148 |
| Opensearch-SDK-Java | [ ] | [ ] | [x] | Validate Gradle Wrapper | https://github.com/opensearch-project/opensearch-sdk-java/issues/470 |
| OUI | [x] | [ ] | [ ] | Certificate Checker, Lint checker | https://github.com/opensearch-project/oui/issues/331 |
| Performance Analyzer | [x] | [x] | [x] | Task List Checker, Link Checker, Certificate of Origin, Create Doc Issue | https://github.com/opensearch-project/performance-analyzer/issues/391 |
| Performance Analyzer RCA | [x] | [x] | [ ] | Link Checker, Gauntlets Test, Certificate of Origin, Create Doc Issue | https://github.com/opensearch-project/performance-analyzer-rca/issues/298 |
| Reporting | [x] | [x] | [ ] | Certificate of Origin, Link Checker | https://github.com/opensearch-project/reporting/issues/661 |
| Search Processor | [x] | [x] | [x] | Certificate of Origin, Link Checker, Create Documentation Issue | https://github.com/opensearch-project/search-processor/issues/104 |
| Security | [x] | [x] | [x] | Code Hygiene, Certificate of Origin, Plugin install | https://github.com/opensearch-project/security/issues/2449 |
| Security Analytics | [x] | [x] | [x] | Certificate of Origin, Link Checker, Security Test, Documentation Request, Benchmarking Tool | https://github.com/opensearch-project/security-analytics/issues/365 |
| Security Analytics Dashboards | [x] | [x] | [ ] | Certificate of Origin | https://github.com/opensearch-project/security-analytics-dashboards-plugin/issues/462 |
| Security Dashboards | [x] | [x] | [ ] | Code Hygiene, Certificate of Origin, Plugin install | https://github.com/opensearch-project/security-dashboards-plugin/issues/1337 |
| Simple Schema | [x] | [x] | [ ] | Link checker, Certificate of Origin | https://github.com/opensearch-project/simple-schema/issues/71 |
| SQL | [x] | [x] | [x] | Checkstyle, Jacoco (100% coverage required), Comparison Tests, Link checker, Certificate of Origin | https://github.com/opensearch-project/sql/issues/1360 |
stephen-crawford commented 1 year ago

Follow-up from: SDK, SQL, Search Relevance, Dashboards Anywhere, OpenSearch-js, OpenSearch, OpenSearch Net, k-NN, Neural Search, OpenSearch Dashboards, Anomaly Detection, Benchmark, Build, helm-charts, opensearch-dashboards-functional-tests, OUI, ansible, Search Processor

Repositories Filed: 60

peternied commented 1 year ago

What does the outcome of this proposal look like: is it a process, a tool, or a product feature? How would we distinguish passable testing requirements from great ones? If we had those requirements and executed on them, how would the project be better for it?

stephen-crawford commented 1 year ago

Hi @peternied, thank you for following up. The purpose of this discussion is to finalize a set of testing requirements that all development repositories will be required to maintain. This will take the form of guidelines in the Developer Guide stating that repositories must require "x, y, and z" to have their code merged. Likewise, changes to those repositories will be expected to pass those tests before they are merged. The hope is that this post will garner feedback (either here or directly) on what requirements people would like to see, and then we can go from there.

As things stand, there is a very wide range of testing requirements across the organization. If Repository A requires code coverage of 85% and runs unit, integration, and backwards compatibility tests, it may expect Repository B to do the same. However, because there is no standardized bar, Repository B instead has 98% code coverage but only runs unit and integration tests. This non-standard quality bar creates a poor release experience and has resulted in features being released that don't meet user expectations.

You bring up a really good point about what the requirements could look like. The plan is to include them in documentation and require that the development repositories implement them. That being said, a tool or GitHub workflow that checked each repository's state could be helpful in keeping everyone accountable. The idea is not to be an auditor for repositories but to help ensure we are all on the same page. I spoke with quite a few people about what they would like to see change about releases, and the quality of the changes was a recurring topic.

davidlago commented 1 year ago

Thanks @scrawfor99 for driving this. To summarize, but also to make sure I understand:

  - The problem: an uneven quality bar across the repositories in our organization.
  - The goal: a set of global, cross-org recommendations for what that bar looks like.
  - The process: first analyze the current state of the world, use that to inform the discussion of what the bar should be, then write it up into a global CONTRIBUTING.md doc that projects can reference, potentially adding their own tweaks on top. After that bar is set, we can also think about providing automations/QoL improvements.

Does this capture what you're thinking too?

stephen-crawford commented 1 year ago

Hi @davidlago, you are exactly correct. I am a bit busy this week but am hoping to file the same type of documenting issue for all repositories and then produce a small, easy-to-digest analysis. After seeing where things are, I will take further steps along the lines of my response to Peter and what you mentioned in your comment.

joshuali925 commented 1 year ago

Hi @scrawfor99, I see BWC confirmation being added to dashboards plugin repo issues. Do we have BWC for dashboards plugins? I checked https://github.com/opensearch-project/opensearch-plugins/blob/main/TESTING.md#backwards-compatibility-testing but it seems to cover only OpenSearch plugins.

stephen-crawford commented 1 year ago

Hi @joshuali925, thank you for following up on this discussion. You are correct: to my knowledge there are no BWC tests for dashboards plugins. The BWC column in the table can effectively be ignored for dashboards plugins, since it seems unlikely that this would ever be an expectation; it is not clear what backwards compatibility would entail for the frontend-focused repos. It is in the table for consistency and readability.

stephen-crawford commented 1 year ago

Statistics based on findings

This table provides the presence metrics for each of the tests documented in this process.

| Test | Count (max is 59 overall / 32 for backend only) | Percentage | Not Present In |
|---|---|---|---|
| Unit | 57 | 97% | Hadoop?, Ansible Playbook |
| Integration | 56 | 95% | OUI, Hadoop?, Ansible Playbook |
| BWC | 22 | 69% | Simple Schema, Performance Analyzer RCA, OpenSearch-rs, OpenSearch Py-ML, OpenSearch-CLI, Hadoop, Helm Charts, Geospatial, Data Prepper, Common Utils, Ansible |
| Certificate of Origin | 41 | 68% | Opensearch-SDK-Java, OpenSearch Ruby, OpenSearch PHP, OpenSearch JS, OpenSearch GO, OpenSearch Dashboards Functional Test, OpenSearch CLI, OpenSearch CI, OpenSearch Benchmark, Notifications, ML-Commons Dashboards, Helm Charts, Dashboards Notifications, Dashboards Maps, Dashboards Anywhere, Ansible |
| Checkstyle, Lint Check, or Code Hygiene | 16 | 27% | Too many to list |
| Link Checker | 33 | 56% | Too many to list |
| Create Documentation Issue | 12 | 20% | Too many to list |
| Deploy Documentation | 3 | 5% | Too many to list |