cpan-testers / cpantesters-backend

Backend data processing for CPAN Testers

Migrate release data script to new backend module #12

Open preaction opened 6 years ago

preaction commented 6 years ago

We need to migrate the release data script to the new backend module system. The old code is located in old/release.

This code does the following:

  1. Update the release_data table to contain a single row for every test summary row in the cpanstats table. This row contains a 1 in one of the test grade columns (pass, fail, na, unknown).
  2. Update the release_summary table to contain a single entry for every dist+version (a release). This row should contain a count of all the pass, fail, na, unknown reports for the release.
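
As a rough illustration of the two steps above (not the code in old/release, and in Python rather than Perl only to keep the sketch short and runnable), each report becomes a row with a 1 in exactly one grade column, and the summary is the column-wise sum of those rows per release. The function names and the `(dist, version, state)` tuple shape are assumptions made up for this sketch.

```python
GRADES = ("pass", "fail", "na", "unknown")


def grade_columns(state):
    """Return the four grade columns for one report: 1 for its grade, 0 elsewhere."""
    if state not in GRADES:
        raise ValueError(f"unexpected grade: {state!r}")
    return {grade: int(grade == state) for grade in GRADES}


def summarize(reports):
    """Sum the per-report grade columns for every (dist, version) release.

    `reports` is an iterable of (dist, version, state) tuples, a
    hypothetical stand-in for rows read from cpanstats.
    """
    totals = {}
    for dist, version, state in reports:
        row = totals.setdefault((dist, version), dict.fromkeys(GRADES, 0))
        for grade, flag in grade_columns(state).items():
            row[grade] += flag
    return totals
```

Calling `summarize` on two `unknown` reports for FCGI 0.48 gives that release an `unknown` count of 2 and zeros in the other three columns.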

For example, here's the difference between the data in the two tables:

mysql> select * from release_data limit 5;
+------------------+---------+----+--------------------------------------+--------+---------+---------+---------+------+------+------+---------+----------+
| dist             | version | id | guid                                 | oncpan | distmat | perlmat | patched | pass | fail | na   | unknown | uploadid |
+------------------+---------+----+--------------------------------------+--------+---------+---------+---------+------+------+------+---------+----------+
| FCGI             | 0.48    |  6 | 00000006-b19f-3f77-b713-d32bba55d77f |      2 |       1 |       1 |       1 |    0 |    0 |    0 |       1 |    35945 |
| FCGI             | 0.48    |  7 | 00000007-b19f-3f77-b713-d32bba55d77f |      2 |       1 |       1 |       1 |    0 |    0 |    0 |       1 |    35945 |
| HTML-EP          | 0.1133  |  9 | 00000009-b19f-3f77-b713-d32bba55d77f |      2 |       1 |       1 |       1 |    1 |    0 |    0 |       0 |    55778 |
| HTML-EP          | 0.1133  | 12 | 00000012-b19f-3f77-b713-d32bba55d77f |      2 |       1 |       1 |       1 |    1 |    0 |    0 |       0 |    55778 |
| HTML-EP-Explorer | 0.1004  | 13 | 00000013-b19f-3f77-b713-d32bba55d77f |      2 |       1 |       1 |       1 |    1 |    0 |    0 |       0 |    55847 |
+------------------+---------+----+--------------------------------------+--------+---------+---------+---------+------+------+------+---------+----------+
5 rows in set (0.12 sec)

mysql> select * from release_summary limit 5;
+--------------------------+---------+----------+--------------------------------------+--------+---------+---------+---------+------+------+------+---------+----------+
| dist                     | version | id       | guid                                 | oncpan | distmat | perlmat | patched | pass | fail | na   | unknown | uploadid |
+--------------------------+---------+----------+--------------------------------------+--------+---------+---------+---------+------+------+------+---------+----------+
| Gtk2-Ex-VolumeButton     | 0.07    | 88850420 | 7b4ccec6-c15c-11e7-b84c-caea5916b984 |      1 |       1 |       1 |       1 |  154 |    2 |    0 |       1 |    98343 |
| Image-VisualConfirmation | 0.01    |   318400 | 00318400-b19f-3f77-b713-d32bba55d77f |      2 |       1 |       1 |       1 |    2 |    0 |    0 |       0 |    68717 |
| Image-VisualConfirmation | 0.02    |   338094 | 00338094-b19f-3f77-b713-d32bba55d77f |      2 |       1 |       1 |       1 |    4 |    0 |    0 |       0 |    68716 |
| Image-VisualConfirmation | 0.03    | 11270376 | 8d91db78-8ac4-1014-b692-942c2553c65e |      2 |       1 |       1 |       1 |    5 |  113 |    2 |       4 |    68713 |
| Image-VisualConfirmation | 0.03    |  6976542 | 06976542-b19f-3f77-b713-d32bba55d77f |      2 |       1 |       2 |       1 |    1 |    7 |    0 |       0 |    68713 |
+--------------------------+---------+----------+--------------------------------------+--------+---------+---------+---------+------+------+------+---------+----------+
5 rows in set (0.10 sec)

All of the tables required by this script are already managed by CPAN::Testers::Schema.

The only useful end result is the data in the release_summary table. If we can build this table without leaving intermediary data around, that would be great.
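
One way to skip the intermediate rows might be a single grouped query that counts grades straight from the report table. The sketch below uses Python's sqlite3 with a cut-down, made-up schema: the `state` column name and the reduced column set are assumptions, and the real tables live in MySQL under CPAN::Testers::Schema. It groups by dist and version only. The sample output above shows two release_summary rows for Image-VisualConfirmation 0.03 that differ in perlmat, so the real grouping key probably includes more columns.

```python
import sqlite3


def build_release_summary(conn):
    """Rebuild release_summary directly from cpanstats, with no release_data step."""
    conn.execute("DELETE FROM release_summary")
    # A comparison such as state = 'pass' evaluates to 0 or 1,
    # so SUM() over it counts the matching reports.
    conn.execute("""
        INSERT INTO release_summary (dist, version, "pass", "fail", "na", "unknown")
        SELECT dist, version,
               SUM(state = 'pass'), SUM(state = 'fail'),
               SUM(state = 'na'), SUM(state = 'unknown')
        FROM cpanstats
        GROUP BY dist, version
    """)
    conn.commit()


def demo():
    """Build the summary for a few hand-written reports in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cpanstats (
            id INTEGER PRIMARY KEY, dist TEXT, version TEXT, state TEXT
        );
        CREATE TABLE release_summary (
            dist TEXT, version TEXT,
            "pass" INTEGER, "fail" INTEGER, "na" INTEGER, "unknown" INTEGER
        );
    """)
    conn.executemany(
        "INSERT INTO cpanstats (dist, version, state) VALUES (?, ?, ?)",
        [
            ("FCGI", "0.48", "unknown"),
            ("FCGI", "0.48", "unknown"),
            ("HTML-EP", "0.1133", "pass"),
            ("HTML-EP", "0.1133", "fail"),
        ],
    )
    build_release_summary(conn)
    return conn.execute(
        'SELECT dist, version, "pass", "fail", "na", "unknown" '
        "FROM release_summary ORDER BY dist, version"
    ).fetchall()
```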

This code should be in a new module, CPAN::Testers::Backend::Release. This module should use the Beam::Runnable role. The process should be configured in etc/container/report.yml. It should be configured to run every hour via cron in the Rexfile. Then the old process should be disabled to prevent erroneous data.

In addition to updating the newly-added data, there should be an option to rebuild all of this data from scratch, completely clearing all derived data in release_data and release_summary and building them back up.
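
A minimal sketch of how the incremental and rebuild modes could share one code path, again in Python with sqlite3 and a made-up reduced schema. The `since_id` high-water mark, the `rebuild` flag, and the function name are all hypothetical, and release_data is left out entirely. The idea is to recount only the releases that received reports newer than `since_id`, and to treat a rebuild as "clear everything, then start from id 0".

```python
import sqlite3

# Reduced, hypothetical schema for the sketch only.
SCHEMA = """
CREATE TABLE cpanstats (
    id INTEGER PRIMARY KEY, dist TEXT, version TEXT, state TEXT
);
CREATE TABLE release_summary (
    dist TEXT, version TEXT,
    "pass" INTEGER, "fail" INTEGER, "na" INTEGER, "unknown" INTEGER,
    PRIMARY KEY (dist, version)
);
"""

# Recount every report for one release and store the totals.
RECOUNT = """
INSERT INTO release_summary (dist, version, "pass", "fail", "na", "unknown")
SELECT dist, version,
       SUM(state = 'pass'), SUM(state = 'fail'),
       SUM(state = 'na'), SUM(state = 'unknown')
FROM cpanstats
WHERE dist = ? AND version = ?
GROUP BY dist, version
"""


def update_release_summary(conn, since_id=0, rebuild=False):
    """Refresh release_summary and return the highest report id seen.

    With rebuild=True the derived table is cleared and every release is
    recounted; otherwise only releases with reports newer than since_id are.
    """
    if rebuild:
        conn.execute("DELETE FROM release_summary")
        since_id = 0
    touched = conn.execute(
        "SELECT DISTINCT dist, version FROM cpanstats WHERE id > ?", (since_id,)
    ).fetchall()
    for dist, version in touched:
        conn.execute(
            "DELETE FROM release_summary WHERE dist = ? AND version = ?",
            (dist, version),
        )
        conn.execute(RECOUNT, (dist, version))
    conn.commit()
    return conn.execute("SELECT COALESCE(MAX(id), 0) FROM cpanstats").fetchone()[0]
```

An hourly run would pass the id returned by the previous run as `since_id`; where that value gets stored between runs is left open here.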

preaction commented 6 years ago

The existing code is subtly broken: it considers a distribution to be a "dev" release (distmat) only if the version contains a "_". This is incorrect, because there are also -TRIAL releases. CPAN::DistnameInfo handles this correctly, and we should use it.
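
To make the gap concrete, here is a small Python sketch contrasting the underscore-only check with one that also recognizes -TRIAL. Both functions are simplified illustrations with made-up names. Neither is a port of CPAN::DistnameInfo, which has more rules than this, and the real fix should call that module rather than reimplement it.

```python
def old_is_dev(version):
    """The check described above: dev only when the version contains "_"."""
    return "_" in version


def is_dev(version):
    """Simplified check that also treats -TRIAL versions as dev releases."""
    return "_" in version or "-TRIAL" in version


def misclassified(versions):
    """Return the versions the underscore-only check fails to flag as dev."""
    return [v for v in versions if is_dev(v) and not old_is_dev(v)]
```

For example, `misclassified(["1.00", "1.01_01", "1.02-TRIAL"])` returns only `"1.02-TRIAL"`, the kind of release the old check records as stable.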