cockpit-project / cockpit

Cockpit is a web-based graphical interface for servers.
http://www.cockpit-project.org/
GNU Lesser General Public License v2.1
11.33k stars 1.13k forks source link

TestHistoryMetrics.testEvents graph changes structure in updates-testing #21237

Open cockpituous opened 2 weeks ago

cockpituous commented 2 weeks ago

The job fedora-40/updates-testing failed on commit f474f0da1c6f723b9f55b4b599b5468047fc8a59.

Log: https://cockpit-logs.us-east-1.linodeobjects.com/pull-0-f474f0da-20241109-013334-fedora-40-updates-testing/log.html

martinpitt commented 2 weeks ago

Confirmed in #21241 this seems to be real.

jelly commented 1 week ago

Relevant package updates:

 pcp                   x86_64 6.3.2-1.fc40                updates-testing 1.5 M
 pcp-conf              x86_64 6.3.2-1.fc40                updates-testing  30 k
 pcp-libs              x86_64 6.3.2-1.fc40                updates-testing 651 k
 pcp-selinux           x86_64 6.3.2-1.fc40                updates-testing  31 k

Previous version:

pcp-selinux-6.3.1-1.fc40.x86_64
pcp-conf-6.3.1-1.fc40.x86_64
pcp-libs-6.3.1-1.fc40.x86_64
pcp-6.3.1-1.fc40.x86_64
python3-pcp-6.3.1-1.fc40.x86_64

Changelog: https://github.com/performancecopilot/pcp/blob/main/CHANGELOG

Test assertion failure:

  File "/home/jelle/projects/cockpit/main/./test/verify/check-metrics", line 306, in testEvents
    self.assertGreaterEqual(getCompressedMinuteValue(test=self, g_type="cpu", saturation=True, hour=1598950800000, minute=8), 0.3)
AssertionError: 0.12799999713897706 not greater than or equal to 0.3

The metrics are historical so from a known dump so the values should not really change.

martinpitt commented 5 days ago

21295 should fix the textual bits. Making the pixel tests accomodate both versions is just too awful. https://bodhi.fedoraproject.org/updates/FEDORA-2024-22b21606d8 should hit updates in one or two days, then let's refresh the image and accept the new pixel refs. We can suffer through two more nightly failures. (There's also the Python crash, though)

mvollmer commented 5 days ago

I could reproduce the differences with pmval:

With pcp-6.3.1-1.fc40.x86_64:

# pmval -t 60 -a /var/log/pcp/pmlogger/localhost.localdomain/20200901.04.59.meta kernel.all.load

metric:    kernel.all.load
archive:   /var/log/pcp/pmlogger/localhost.localdomain/20200901.04.59.meta
host:      localhost.localdomain
start:     Tue Sep  1 08:59:46 2020
end:       Tue Sep  1 09:08:16 2020
semantics: instantaneous value
units:     none
samples:   9
interval:  60.00 sec
08:59:46.594  No values available

                 1 minute      5 minute     15 minute 
09:00:46.594       0.6400        0.1300     4.000E-02 
09:01:46.594       0.4600        0.1700     6.000E-02 
09:02:46.594       0.2900        0.1800     7.000E-02 
09:03:46.594       0.3300        0.2100     9.000E-02 
09:04:46.594      32.61          7.640         2.540  
09:05:46.594      15.22          7.620         2.880  
09:06:46.594      50.38         17.47          6.470  
09:07:46.594      20.93         15.34          6.440  

With pcp-6.3.2-2.el10.x86_64:

# pmval -t 60 -a /var/log/pcp/pmlogger/localhost.localdomain/20200901.04.59.meta kernel.all.load

metric:    kernel.all.load
archive:   /var/log/pcp/pmlogger/localhost.localdomain/20200901.04.59.meta
host:      localhost.localdomain
start:     Tue Sep  1 04:59:46 2020
end:       Tue Sep  1 05:08:16 2020
semantics: instantaneous value
units:     none
samples:   9
interval:  60.00 sec
04:59:46.594  No values available

                 1 minute      5 minute     15 minute 
05:00:46.594       0.4600        0.1700     6.000E-02 
05:01:46.594       0.2900        0.1800     7.000E-02 
05:02:46.594       0.3300        0.2100     9.000E-02 
05:03:46.594      32.61          7.640         2.540  
05:04:46.594      15.22          7.620         2.880  
05:05:46.594      50.38         17.47          6.470  
05:06:46.594      20.93         15.34          6.440  
05:07:46.594       7.680        12.54          6.040  

Note how the values jump up one minute later in the newer version.

Both versions show the same raw data in the archives, of course:

# pmval -U /var/log/pcp/pmlogger/localhost.localdomain/20200901.04.59.meta kernel.all.load

metric:    kernel.all.load
archive:   /var/log/pcp/pmlogger/localhost.localdomain/20200901.04.59.meta
host:      localhost.localdomain
start:     Tue Sep  1 04:59:46 2020
end:       Tue Sep  1 05:08:16 2020
semantics: instantaneous value
units:     none
samples:   all

                 1 minute      5 minute     15 minute 
04:59:46.609       0.6400        0.1300     4.000E-02 
05:00:46.617       0.4600        0.1700     6.000E-02 
05:01:46.621       0.2900        0.1800     7.000E-02 
05:02:46.630       0.3300        0.2100     9.000E-02 
05:03:46.620      32.61          7.640         2.540  
05:04:46.618      15.22          7.620         2.880  
05:05:46.623      50.38         17.47          6.470  
05:06:46.632      20.93         15.34          6.440  
05:07:46.627       7.680        12.54          6.040  

In fact, there isn't any interpolation of values going on. The sample time is rounded differently. In the old version 05:03:46.594 gets the raw data for 05:02:46.630, while in the new version, it gets the raw data for 05:03:46.620. The new version seems "more correct", the old version seems to round down, while the new version might round to nearest.

Anyway, no bug to report. We have to survive with both.

mvollmer commented 5 days ago

What about not doing pixel tests for updates-testing? We don't have a mechanism to cope with pixel alternatives and any failure in updates-testing will just be pain.

If we write code like

self.assert_pixels("dialog" if not updates_testing else "dialog-testing", ...)

then the machinery will yet at us that one of "dialog" or "dialog-testing" is unused... but we could invent something if we want to.

martinpitt commented 5 days ago

What about not doing pixel tests for updates-testing?

That crossed my mind, but I feel that would diminish their value too much. They did help us to find this difference (which could have been a regression). I wouldn't mind skipping this specific pixel test on updates-testing if you prefer. But honestly I currently lean towards dragging our feet a bit and then forgetting about it.

martinpitt commented 1 day ago

https://github.com/cockpit-project/bots/pull/7126 refreshes the image, but that fails on a handful of other issues. We'll update the pixel tests as part of that.