zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

Publishing on-target results from vendors #64087

Open PerMac opened 11 months ago

PerMac commented 11 months ago

Introduction

Executing (and evaluating) tests on hardware targets is a key factor in verifying the quality of Zephyr. However, it is not feasible for the project itself to maintain a setup with dozens of different platforms from multiple vendors. Many vendors are already performing on-target tests at their own sites. Therefore, we need a process defining how to synchronize our efforts, making results from on-target testing generally available.

Problem description

The process for publishing results was already defined before. However, over time, fewer and fewer results were published, until publishing halted altogether. I collected observations from several contributors on what did and did not work well. Based on those, I am proposing a plan for updating the process to make it successful (i.e., results being regularly published by vendors).

General requirements can be defined from the contributor's POV:

Maintainer's POV:

The proposal is nothing groundbreaking. It is a collection of already existing ideas and motions. Many tools and workflows are already available but need more polish and upgrading.

The process can be divided into 3 stages (with corresponding tooling):

Detailed RFC

Below are the slides I presented during the Testing WG meeting. They contain a more detailed description of how the process can look. On-target test results.pdf

Proposed change (Detailed)

I listed here the actions which IMO need to be addressed first. Let's use this checklist to track our progress. Independent issues can/will be opened and linked for the more complicated tasks.

### Tasks
- [ ] Evaluating a helper script for getting the daily/weekly sha
- [ ] Rules for quarantine usage
- [x] Expand report with more info about an environment and flags used
- [ ] CONTRIBUTORS.yaml with vendor/contributor/platforms relations
- [ ] Mechanism how to replace/withdraw results
- [ ] Evaluate and improve verification of results
- [ ] Workflow for sending results to elastic
- [ ] Clear information on how useful the data will be
- [ ] config.yaml, identifying missing entries, and scope
- [ ] Script for automerging
- [ ] Script for automation of results publishing
- [ ] Workflow reminding contributors about missing data

Dependencies

The idea is to have results publishing connected with "tier 1" type platforms. Such results can also be taken into account in the release process, providing more direct observables than the number of opened issues.

Concerns and Unresolved Questions (not already included in todo list)

Frequency commitment: at least weekly? Mandatory before a release? Should skips be included in the reports?

Alternatives

Leaving the process as it is.

hakehuang commented 11 months ago

for 'config.yaml, identifying missing entries, and scope': here is the draft, can we start from this first? https://github.com/zephyrproject-rtos/zephyr/blob/main/tests/test_config.yaml

for 'Evaluating a helper script for getting the daily/weekly sha' I think it is there already? https://github.com/zephyrproject-rtos/test_results/blob/master/scripts/version_mgr.py

for 'Rules for quarantine usage' every quarantine must map to an issue

for 'Expand report with more info about an environment and flags used' this can be evolved over time

for 'CONTRIBUTORS.yaml with vendor/contributor/platforms relations' now each platform has a vendor tag, is it enough? https://github.com/zephyrproject-rtos/zephyr/blob/main/boards/arm/frdm_k22f/frdm_k22f.yaml#L22

for 'Mechanism how to replace/withdraw results' @stephanosio do you have any idea on how elasticsearch maintains this?

for 'Evaluate and improve verification of results' for now we have a basic check, maybe we can reuse it for now, https://github.com/zephyrproject-rtos/test_results/blob/master/scripts/results_verification.py
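Reusing and extending the basic check could look roughly like this minimal sketch. The `testsuites`/`name`/`platform`/`status` field names are assumed from twister's JSON report layout and may differ from what the actual script checks:

```python
import json

# Assumed minimal field set the dashboard relies on per test suite entry.
REQUIRED_SUITE_FIELDS = {"name", "platform", "status"}

def verify_report(report_text: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    try:
        report = json.loads(report_text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    for i, suite in enumerate(report.get("testsuites", [])):
        missing = REQUIRED_SUITE_FIELDS - suite.keys()
        if missing:
            problems.append(f"testsuites[{i}] missing fields: {sorted(missing)}")
    return problems
```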

for 'Workflow for sending results to elastic' @nashif @stephanosio, what are the entry criteria for board results to be accepted by Elasticsearch?

for 'Clear information on how useful the data will be' I think we need to have a quality analysis virtual team for this. Agree?

for 'Script for automerging' I suppose this can be in a second phase, as we may change the flow; a totally automatic way may be too idealistic to achieve

for 'Script for automation of results publishing' is a push not enough?

for 'Workflow reminding contributors about missing data' if the report is generated by twister, all the data will be in the standard report. Is anything else missing?

Besides, I want to add a 'report issues automatically' task: after every test result is pushed to the test repo (https://github.com/zephyrproject-rtos/test_results), we can trigger a scan task which will create issues in the github results repo. We then need to look at them and forward them to the zephyr repo (https://github.com/zephyrproject-rtos/zephyr); maybe we shall tag them from the CI board report. Any comments? This would be very useful for tracking issues.
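The scan task described above could start from something like this hypothetical helper, which turns failed suites from a pushed report into GitHub issue drafts. The `name`/`platform`/`status`/`reason` field names are assumptions based on twister's report layout, and the one-issue-per-failing-suite policy is illustrative, not decided:

```python
def draft_issues(testsuites: list[dict]) -> list[dict]:
    """Turn failed suites from a pushed twister report into GitHub
    issue drafts (title/body), one per failing suite."""
    drafts = []
    for suite in testsuites:
        if suite.get("status") == "failed":
            drafts.append({
                "title": f"[{suite['platform']}] {suite['name']} failed on-target",
                "body": (f"Suite `{suite['name']}` failed on `{suite['platform']}`.\n"
                         f"Reason: {suite.get('reason', 'unknown')}"),
            })
    return drafts
```

Actually opening the issues would then be a separate step (e.g. via the GitHub API), so the scan itself stays easy to test and re-run.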

PerMac commented 11 months ago

> here is the draft, can we start from this first? https://github.com/zephyrproject-rtos/zephyr/blob/main/tests/test_config.yaml

Yes, that's the idea. I will add usage of (a version of) this file to our internal on-target CI to check how it works and what extra configs will be useful (e.g., adding a default platforms section there with tier 1 platforms).
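To illustrate the idea, such an extension of `test_config.yaml` might look like the sketch below. The `levels`/`adds` keys mirror the existing file's test-level section (as I understand it), while `default_platforms` is purely a hypothetical addition, not an actual schema:

```yaml
# Sketch only -- `default_platforms` is a hypothetical extension.
levels:
  - name: smoke
    adds:
      - kernel.threads.*
default_platforms:
  - frdm_k22f
  - nrf52840dk
```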

> for 'Evaluating a helper script for getting the daily/weekly sha': I think it is there already? https://github.com/zephyrproject-rtos/test_results/blob/master/scripts/version_mgr.py

The script is there, yes, but I meant that it has to be evaluated whether it will work as needed on local machines, since it was designed for upstream's CI. E.g., it tries to read the version from a local file first, which shouldn't be the case.
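As an illustration of what a local-friendly helper could do instead, here is a hypothetical sketch that picks the weekly reference sha purely from a date-to-sha map, with no local version file involved. The Monday cut-off policy is an assumption for the example, not the project's rule:

```python
from datetime import date, timedelta

def pick_weekly_sha(daily_shas: dict[date, str], today: date) -> str:
    """Pick the sha to test against for the current week: the most
    recent daily sha at or before the last Monday (hypothetical policy)."""
    monday = today - timedelta(days=today.weekday())
    candidates = [d for d in daily_shas if d <= monday]
    if not candidates:
        raise ValueError("no daily sha available at or before the weekly cut-off")
    return daily_shas[max(candidates)]
```

The date-to-sha map itself could be fetched from the upstream repo (e.g. via `git ls-remote` or the GitHub API) rather than from local state.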

> for 'Rules for quarantine usage': every quarantine must map to an issue

I agree with this condition. For now, a local quarantine should be enough; once we start seeing people using quarantines, we will consider also adding a global one (maintained in a repo).
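For reference, a local quarantine file in twister's quarantine-list YAML format might look like the entry below (the scenario, platform, and issue number are illustrative; the comment reflects the "every quarantine must map to an issue" rule):

```yaml
# quarantine.yaml -- passed to twister via --quarantine-list (illustrative entry)
- scenarios:
    - kernel.timer.tickless
  platforms:
    - frdm_k22f
  comment: "Flaky on this board, tracked in zephyrproject-rtos/zephyr#NNNNN"
```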

> for 'Expand report with more info about an environment and flags used': this can be evolved with time.

Indeed. However, some will have to be added before we enter the "production" stage of the process (especially those needed for traceability).
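As an example of what such traceability fields could look like, a report's environment block might carry entries like the fragment below. Some field names mirror twister's existing environment section as I understand it (`zephyr_version`, `toolchain`, `os`), while `twister_flags` is a hypothetical addition:

```json
{
  "environment": {
    "zephyr_version": "v3.5.0-rc1-123-gabcdef123456",
    "toolchain": "zephyr",
    "os": "linux",
    "twister_flags": "--device-testing --hardware-map map.yaml"
  }
}
```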

> for 'CONTRIBUTORS.yaml with vendor/contributor/platforms relations': now each platform has a vendor tag, is it enough? https://github.com/zephyrproject-rtos/zephyr/blob/main/boards/arm/frdm_k22f/frdm_k22f.yaml#L22

I was thinking about a file which would be helpful during the verification of uploaded data: a file with direct relations (vendor -> their tier 1 platforms and the GitHub IDs of the trusted contributors assigned to the process). So, a single place with the data needed to validate who is responsible for which platforms, e.g. whether I am trusted to publish results from nrf52840dk, or who to notify if those results are missing.
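A hypothetical layout for such a file (all vendor names and GitHub IDs below are made up for illustration; the schema is a sketch, not a decided format):

```yaml
# CONTRIBUTORS.yaml (hypothetical schema): single source of truth for who
# publishes on-target results for which tier 1 platforms.
- vendor: example-vendor
  trusted_contributors:      # GitHub IDs allowed to publish/withdraw results
    - example-user-1
    - example-user-2
  tier1_platforms:
    - nrf52840dk
```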

PerMac commented 4 months ago

https://github.com/zephyrproject-rtos/zephyr/pull/72399 resolves "Expand report with more info about an environment and flags used"