p12tic opened 4 years ago
(will be edited as the discussion goes on)
The use cases that need to be supported by this feature have a lot in common, but at the same time there are significant differences in what users of BuildBot could reasonably expect. This requires the design to be generic enough.
The following is the proposed pseudo-schema of the new tables. The schema is slightly denormalized so that performing queries does not introduce too many table joins. In the schema below, `pk` is primary key and `fk` is foreign key.
TestResultSet table:
- (maybe) project_id (int, fk)
- builder_id (int, fk)
- build_id (int, fk)
- step_id (int, fk)
- testresultset_id (int, pk)
- testresultset_type (str)
- testresultset_value_unit (str)
A `TestResultSet` is an entity that represents all interesting information of a particular type that is produced by a step. For example, this could be a set of code warnings, or a set of performance results. The `TestResultSet` table stores information related to a `TestResultSet`. Additionally, the table also includes `project_id`, `builder_id` and `build_id` fields so that it's possible to easily query for all `TestResultSet`s for a particular project, builder or build. Finally, we will be able to create a clustered index over `project_id`, `builder_id`, `build_id`, `step_id`, which will move related test data together in the table and thus allow very large table sizes without affecting performance too much.
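To make the discussion more concrete, here is a rough SQLAlchemy Core sketch of this table (Buildbot's model.py is SQLAlchemy-based). The table name, string lengths, index name and foreign-key targets here are my assumptions for illustration only:

```python
import sqlalchemy as sa

metadata = sa.MetaData()

# TestResultSet: one row per set of results produced by a step.
test_result_sets = sa.Table(
    'test_result_sets', metadata,
    sa.Column('testresultset_id', sa.Integer, primary_key=True),
    sa.Column('project_id', sa.Integer, sa.ForeignKey('projects.id')),
    sa.Column('builder_id', sa.Integer, sa.ForeignKey('builders.id')),
    sa.Column('build_id', sa.Integer, sa.ForeignKey('builds.id')),
    sa.Column('step_id', sa.Integer, sa.ForeignKey('steps.id')),
    sa.Column('testresultset_type', sa.String(255)),
    sa.Column('testresultset_value_unit', sa.String(255)),
)

# Composite index approximating the clustering described above; true
# clustering is engine-specific (e.g. InnoDB clusters on the primary key).
sa.Index('test_result_sets_locality',
         test_result_sets.c.project_id, test_result_sets.c.builder_id,
         test_result_sets.c.build_id, test_result_sets.c.step_id)
```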
TestResultSetData table:
- testresultset_id (int, fk)
- data_type (str)
- data (blob)
This table stores the unparsed data from which a complete `TestResultSet` is produced. `TestResultSetData` has a 0..many relationship to `TestResultSet`.
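Continuing the sketch above, the raw-data table might look as follows; the 0..many side carries the foreign key, and `sa.LargeBinary` maps to a blob:

```python
# TestResultSetData: 0..many raw-data rows per TestResultSet.
test_result_set_data = sa.Table(
    'test_result_set_data', metadata,
    sa.Column('testresultset_id', sa.Integer,
              sa.ForeignKey('test_result_sets.testresultset_id')),
    sa.Column('data_type', sa.String(255)),
    sa.Column('data', sa.LargeBinary),
)
```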
TestCodePath table:
- builder_id (int, fk)
- filepath_id (int, pk)
- filepath (str)
This table stores the file paths for `TestResult`. `builder_id` is included so that a clustered index can be applied to it, moving related data together in the table. It is expected that this table will be queried for all test paths related to a builder.
TestName table:
- builder_id (int, fk)
- testname_id (int, pk)
- testname (str)
This table stores the test names for `TestResult`. `builder_id` is included so that a clustered index can be applied to it, moving related data together in the table. It is expected that this table will be queried for all test names related to a builder.
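Since `TestCodePath` and `TestName` are structurally identical lookup tables, one sketch covers both; `builder_id` gets a plain secondary index so that per-builder scans stay cheap (again, all names are illustrative):

```python
# TestCodePath: interned file paths, queried per builder.
test_code_paths = sa.Table(
    'test_code_paths', metadata,
    sa.Column('filepath_id', sa.Integer, primary_key=True),
    sa.Column('builder_id', sa.Integer, sa.ForeignKey('builders.id')),
    sa.Column('filepath', sa.Text),
)
sa.Index('test_code_paths_builders', test_code_paths.c.builder_id)

# TestName: interned test names, same pattern.
test_names = sa.Table(
    'test_names', metadata,
    sa.Column('testname_id', sa.Integer, primary_key=True),
    sa.Column('builder_id', sa.Integer, sa.ForeignKey('builders.id')),
    sa.Column('testname', sa.Text),
)
sa.Index('test_names_builders', test_names.c.builder_id)
```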
TestResult table:
- builder_id (int, fk)
- testresultset_id (int, fk)
- testresult_id (int, pk)
- testname_id (int, fk, nullable)
- filepath_id (int, fk, nullable)
- line (int)
- col (int)
- value (???)
This table stores the actual test results. `builder_id` is included so that a clustered index can be applied over `builder_id` and `testresultset_id`, moving related data together in the table. The table includes all information that could possibly be useful to a test, even potentially unneeded data. For example, "code issues" tests will probably not use the `value` field, whereas pass/fail or performance tests will probably not use the file path information. The interpretation of the `value` field depends on the `testresultset_type` and `testresultset_value_unit` of the particular `TestResultSet`.
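And the results table itself in the same sketch. Since the type of `value` is still open (the `???` above), `sa.Text` is used here as a placeholder that each `testresultset_type` could encode into, at the cost of casting in numeric queries:

```python
# TestResult: one row per individual result (warning, pass/fail, metric).
test_results = sa.Table(
    'test_results', metadata,
    sa.Column('testresult_id', sa.Integer, primary_key=True),
    sa.Column('builder_id', sa.Integer, sa.ForeignKey('builders.id')),
    sa.Column('testresultset_id', sa.Integer,
              sa.ForeignKey('test_result_sets.testresultset_id')),
    sa.Column('testname_id', sa.Integer,
              sa.ForeignKey('test_names.testname_id'), nullable=True),
    sa.Column('filepath_id', sa.Integer,
              sa.ForeignKey('test_code_paths.filepath_id'), nullable=True),
    sa.Column('line', sa.Integer, nullable=True),
    sa.Column('col', sa.Integer, nullable=True),
    sa.Column('value', sa.Text, nullable=True),  # placeholder type, see above
)

# Index backing the locality argument made above.
sa.Index('test_results_locality',
         test_results.c.builder_id, test_results.c.testresultset_id)
```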
Hi, this looks great. I am not sure of the necessity of the `TestResultSet` and `TestResultSetData` tables. This sounds a bit redundant, as that data should already be in the logs; why store it in unparsed format?
Actually, commenting in the issue is a bit awkward. Maybe you can send a WIP PR with model.py updated, and with 4 new raml files describing the data model from the REST API point of view. Having both data models reasoned about at the same time sounds useful to me.
Being able to put inline comments also looks very useful.
Also, having some example test data would help me understand what you mean.
Allowing the user to upload a JUnit or similar result file would be very useful. Many test infrastructures already generate such files.
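For example, a minimal sketch of pulling pass/fail results out of a JUnit-style XML report; the `testsuite`/`testcase` layout with `classname`/`name` attributes and `failure`/`error` children is the common convention, though producers vary:

```python
import xml.etree.ElementTree as ET

def parse_junit_results(xml_text):
    """Yield (test name, passed) pairs from a JUnit-style XML report."""
    root = ET.fromstring(xml_text)
    for case in root.iter('testcase'):
        name = '{}.{}'.format(case.get('classname', ''),
                              case.get('name', ''))
        # A testcase with no <failure> or <error> child is taken as a
        # pass; <skipped> handling is omitted for brevity.
        passed = (case.find('failure') is None and
                  case.find('error') is None)
        yield name, passed
```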
Having test results shown somewhere easily accessible from a build page would be a great productivity boost. Currently people need to dig into the logs. If we tracked which tests fail on which builds, we would have a great deal of useful information that would save people a lot of time.
The feature should implement the following use cases:
- Code issues:
  - Analyze logs produced by e.g. pylint or flake8.
  - Submit review comments to a third-party code browser such as GitHub, GitLab, Bitbucket, etc.
  - Store the test results in the database for analytics purposes.
- Test pass/fail results:
  - Analyze test pass/fail logs produced by various testing frameworks.
  - Track which tests have been failing since which commit. That would allow people to immediately know which commits are to blame. Further tooling on the BuildBot side could automatically start tests to bisect the failure.
  - Store the test results in the database for analytics purposes, e.g. to detect the most unstable areas of the test suite and give information about the root cause.
- Test quantitative results. This use case covers any interesting numeric information coming from a test or some analysis tool: for example, performance metrics, or binary size and memory usage metrics categorized by file or code area.
  - Analyze logs produced by various testing frameworks or other tooling.
  - Track the results of tests across time, which allows detection of performance regressions (a query sketch follows this list).
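As an illustration of the kind of query the proposed schema enables, here is a hedged sketch (SQLAlchemy 1.x style, reusing the table sketches earlier in the thread; all names are assumptions) that pulls the history of one quantitative result set type for one builder:

```python
# History of a quantitative metric for one builder, ordered by build;
# plotting value against build_id would expose performance regressions.
history = (
    sa.select([test_result_sets.c.build_id, test_results.c.value])
    .select_from(test_results.join(
        test_result_sets,
        test_results.c.testresultset_id ==
            test_result_sets.c.testresultset_id))
    .where(test_result_sets.c.builder_id == 42)  # hypothetical builder id
    .where(test_result_sets.c.testresultset_type == 'performance')
    .order_by(test_result_sets.c.build_id)
)
```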