buildbot / buildbot

Python-based continuous integration testing framework; your pull requests are more than welcome!
https://www.buildbot.net
GNU General Public License v2.0

Feature discussion: Store and view the results of various tests #5164

Open p12tic opened 4 years ago

p12tic commented 4 years ago

Having test results shown somewhere easily accessible from a build page would be a great productivity boost. Currently people need to dig into the logs. If we tracked which tests fail on which builds, we would have a great deal of useful information that would save people a lot of time.

The feature should implement the following use cases:

p12tic commented 4 years ago

Proposed database schema

(will be edited as the discussion goes on)

The use cases that need to be supported by the feature have a lot in common, but at the same time there are significant differences in what the users of Buildbot could reasonably expect. This requires the design to be generic enough.

The following is a proposed pseudo-schema of the new tables. The schema is slightly denormalized so that queries do not need too many table joins. In the schema below, pk is primary key, fk is foreign key.

TestResultSet table:
 - (maybe) project_id (int, fk)
 - builder_id (int, fk)
 - build_id (int, fk)
 - step_id (int, fk)
 - testresultset_id (int, pk)
 - testresultset_type (str)
 - testresultset_value_unit (str)

A TestResultSet is an entity that represents all interesting information of a particular type that is produced by a step. For example, this could be a set of code warnings, or a set of performance results. The TestResultSet table stores information related to a TestResultSet. Additionally, the table also includes project_id, builder_id and build_id fields so that it's possible to easily query for all TestResultSets for a particular project, builder or build. Finally, we will be able to create a clustered index over project_id, builder_id, build_id, step_id which will move related test data together in the table and thus allow very large table sizes without affecting performance too much.
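As a rough illustration of the table and index described above, here is a minimal sketch using Python's built-in sqlite3 module. SQLite does not offer clustered indexes in the sense meant here, so an ordinary composite index stands in for one. The table name, index name, in-memory database, sample rows, and the `result_sets_for_build` helper are all assumptions made for illustration and are not part of the proposal. Foreign key constraints are left out because the referenced tables do not exist in this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test_result_sets (
    testresultset_id INTEGER PRIMARY KEY,
    project_id INTEGER,              -- marked as maybe in the proposal, so nullable
    builder_id INTEGER NOT NULL,
    build_id INTEGER NOT NULL,
    step_id INTEGER NOT NULL,
    testresultset_type TEXT NOT NULL,
    testresultset_value_unit TEXT NOT NULL
);
-- Plain composite index standing in for the proposed clustered index.
CREATE INDEX test_result_sets_lookup
    ON test_result_sets (project_id, builder_id, build_id, step_id);
""")

# Hypothetical sample rows: (project, builder, build, step, type, unit).
rows = [
    (1, 10, 100, 1000, "code_issue", "message"),
    (1, 10, 100, 1001, "performance", "ms"),
    (1, 11, 101, 1002, "pass_fail", "boolean"),
]
conn.executemany(
    "INSERT INTO test_result_sets "
    "(project_id, builder_id, build_id, step_id, "
    "testresultset_type, testresultset_value_unit) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)


def result_sets_for_build(conn, build_id):
    """Return (type, unit) pairs for every result set of one build."""
    cur = conn.execute(
        "SELECT testresultset_type, testresultset_value_unit "
        "FROM test_result_sets WHERE build_id = ? ORDER BY step_id",
        (build_id,),
    )
    return cur.fetchall()


build_100_sets = result_sets_for_build(conn, 100)
```

Because build_id is stored directly on the row, the lookup needs no join through the steps table, which is the point of the denormalization.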

TestResultSetData table:
 - testresultset_id (int, fk)
 - data_type (str)
 - data (blob)

This table stores the unparsed data that produces a complete TestResultSet. TestResultSetData forms a 0..many relationship with TestResultSet.

TestCodePath table:
 - builder_id (int, fk)
 - filepath_id (int, pk)
 - filepath (str)

This table stores the file paths for TestResult. builder_id is included to be able to apply a clustered index on it and move related data together in the table. It is expected that this table will be queried for all test paths related to a builder.

TestName table:
 - builder_id (int, fk)
 - testname_id (int, pk)
 - testname (str)

This table stores the test names for TestResult. builder_id is included to be able to apply a clustered index on it and move related data together in the table. It is expected that this table will be queried for all test names related to a builder.

TestResult table:
 - builder_id (int, fk)
 - testresultset_id (int, fk)
 - testresult_id (int, pk)
 - testname_id (int, fk, nullable)
 - filepath_id (int, fk, nullable)
 - line (int)
 - col (int)
 - value (???)

This table stores the actual test results. builder_id is included to be able to apply a clustered index on builder_id and testresultset_id and move related data together in the table. The table includes all information that could possibly be useful to a test, even potentially unneeded data. For example, "code issues" tests will probably not use the value field, whereas pass/fail or performance tests will probably not use the file path information.

The interpretation of the value field depends on the testresultset_type and testresultset_value_unit of the particular TestResultSet.
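To make the dependency on testresultset_type and testresultset_value_unit concrete, here is a small sketch of how a consumer might render a result. Everything specific in it is an assumption: the type of value is still open in the schema (it is stored as a string here), the type names "pass_fail", "performance" and "code_issue" are invented for the example, and `describe` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TestResultSet:
    testresultset_id: int
    testresultset_type: str        # hypothetical values, see describe()
    testresultset_value_unit: str


@dataclass
class TestResult:
    testresultset_id: int
    value: Optional[str]           # assumption: value kept as a string
    testname: Optional[str] = None
    filepath: Optional[str] = None
    line: Optional[int] = None


def describe(result, result_set):
    """Render one result according to the type and unit of its result set."""
    kind = result_set.testresultset_type
    if kind == "pass_fail":
        outcome = "passed" if result.value == "1" else "failed"
        return f"{result.testname}: {outcome}"
    if kind == "performance":
        unit = result_set.testresultset_value_unit
        return f"{result.testname}: {float(result.value)} {unit}"
    if kind == "code_issue":
        # Code issues are located by file and line; value is not used.
        return f"{result.filepath}:{result.line}"
    raise ValueError(f"unknown testresultset_type: {kind}")


perf_set = TestResultSet(1, "performance", "ms")
perf_line = describe(TestResult(1, "12.5", testname="test_startup"), perf_set)
```

The nullable testname_id and filepath_id columns map naturally onto the optional fields: each result type fills in only what it needs.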

tardyp commented 4 years ago

Hi, this looks great. I am not sure the TestResultSet and TestResultSetData tables are necessary. They sound a bit redundant, as that data should already be in the logs. Why store it in unparsed format?

Actually, commenting in the issue is a bit awkward. Maybe you can send a WIP PR with model.py updated and with 4 new RAML files describing the data model from the REST API point of view. Having both data models reasoned about at the same time sounds useful to me.

Being able to put inline comments also looks very useful.

Also, having some example test data would help me understand what you mean.

bdbaddog commented 4 years ago

Allowing the user to upload a JUnit or similar result file would be very useful. Many test infrastructures already generate such files.
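As a sketch of what ingesting such a file could involve, the snippet below reads a small JUnit-style XML report with Python's standard xml.etree.ElementTree and reduces each test case to a name and a status. The sample report, the `parse_junit` helper, and the "pass"/"fail"/"skip" labels are assumptions for illustration; nothing here is an existing Buildbot API.

```python
import xml.etree.ElementTree as ET

# Hypothetical report in the common JUnit XML layout.
SAMPLE = """<testsuite name="example" tests="3">
  <testcase classname="pkg.LoginTest" name="test_ok" time="0.01"/>
  <testcase classname="pkg.LoginTest" name="test_bad" time="0.02">
    <failure message="expected 200, got 500"/>
  </testcase>
  <testcase classname="pkg.LoginTest" name="test_skipped">
    <skipped/>
  </testcase>
</testsuite>"""


def parse_junit(xml_text):
    """Return a list of (qualified test name, status) tuples."""
    root = ET.fromstring(xml_text)
    results = []
    # iter() also finds test cases nested under a <testsuites> root.
    for case in root.iter("testcase"):
        name = f"{case.get('classname')}.{case.get('name')}"
        if case.find("failure") is not None or case.find("error") is not None:
            status = "fail"
        elif case.find("skipped") is not None:
            status = "skip"
        else:
            status = "pass"
        results.append((name, status))
    return results


parsed = parse_junit(SAMPLE)
```

Each tuple could then become one TestResult row, with the qualified name stored through the TestName table.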