Closed mvollmer closed 2 weeks ago
The two testFrameReload
tests are suspicious: I don't know what requirement or feature they are supposed to test. They still pass, but maybe only by accident.
What they do:
1) modify the DOM directly to remove the "data-ready" attribute of a iframe 2) modify the DOM to change the "src" attrr of the iframe 3) wait for the "data-ready" attribute to come back
Step 2) might simulate the code inside the iframe changing its own location, but that is not something Cockpit supports beyond changing the hash. Both the old and new Shell will just react to the "load" event on the iframe, run their hash tracking stuff, and then put the correct "src" attribute back. The old Shell would run its polling loop, but the new Shell just brings the DOM into sync with its own reality.
So, without knowing what testFrameReload is actually testing, I don't know whether the new Shell broke something.
FTR, I'm afraid I don't know what that data-ready
hackery in the test should achieve. It goes way back to 2015:
It seems to me that we should test this using the "Active pages" killer in the session menu (with Alt) and then loading it back from the menu -- otherwise it's just internal hackery and meaningless IMHO.
It seems to me that we should test this using the "Active pages" killer in the session menu (with Alt) and then loading it back from the menu -- otherwise it's just internal hackery and meaningless IMHO.
Hmm, there is also the "Reload this frame" action in the browser menu. I wonder if we survive that.
https://github.com/cockpit-project/cockpit/commit/37aedec68f7b50cd13435920334c3c4ba6cef59b is worried about the router, and that a page needs to be unregistered when it is reloaded. (so that the channels open for the previous incarnation of the frame are closed,) I think this happens now via the "unload" event that the router listens to. I'll experiment a bit with "Reload this frame" as well.
I'll experiment a bit with "Reload this frame" as well.
Alright, reloading the frame would correctly unregister it from the router, but when it was loaded, we didn't do the necessary setup, like attaching the event handler for the idle timeout.
So let's attach our own permanent "load" and "unload" handlers and do the setup on every load. There is now also a teardown that unsets the ready flag, which helps with avoiding flicker in dark mode.
The testFrameReload tests should not touch the data-ready attribute, I'd say, but should instead check that open channels are closed properly. It would be best to reload the frame explicitly instead of messing with the "src" attribute. I would guess bidi can do that, right?
@mvollmer: Yes, bidi has a reload() API, we already use it in testlib.py's Browser.reload()
-- except that this always does self.switch_to_top()
first, i.e. it always reloads the entire page. You could just add an option to it to not do that, and then just have it reload the current frame.
You could just add an option to it to not do that, and then just have it reload the current frame.
I tried, but it seems to reload everything always. Also, I got testlib.Error: unknown error: navigation canceled
. Kicking the "src" attribute is not wrong, according to StackOverflow, so...
Let's get this in as a followup to #21012.
Step 2) might simulate the code inside the iframe changing its own location, but that is not something Cockpit supports beyond changing the hash.
Just to clarify: step 2 is very likely only supposed to reload the frame without changing the url, not simulate loading a different url inside the frame.
TestPages.testBasic and TestPages.testHistory are flaking pretty consistently on Firefox now. I'll check this out more.
TestPages.testBasic and TestPages.testHistory are flaking pretty consistently on Firefox now. I'll check this out more.
Also testMenuSearch. I hope we have only uncovered existing races in the tests.
TestPages.testHistory likely runs into https://bugzilla.mozilla.org/show_bug.cgi?id=941146.
This is the beginning of the backtrace, our terminal is using a canvas to render.
_fillCharTrueColor http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:8
_renderBlockCursor http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:8
_render http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:8
value http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:8
_animationFrame http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:8
restartBlinkAnimation http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:8
handleGridChanged http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:8
renderRows http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:8
_renderRows http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:12
_renderDebouncer http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:12
_innerRefresh http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:12
_animationFrame http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:12
refresh http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:12
refreshRows http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:12
refresh http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:12
V http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:12
fire http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:14
parse http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:14
_writeBuffer http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:14
_innerWrite http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:15
write http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:15
setTimeout handler*write http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:15
write http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:14
write http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:15
onChannelMessage http://127.0.0.2:9091/cockpit/@localhost/system/terminal.js:16
Ie http://127.0.0.2:9091/cockpit/@localhost/base1/cockpit.js:1
TestPages.testHistory likely runs into https://bugzilla.mozilla.org/show_bug.cgi?id=941146.
-> we need a naughty for this.
The workaround for TestPages.testBasic is successfull: https://cockpit-logs.us-east-1.linodeobjects.com/pull-21061-c74ae9c4-20241016-071913-fedora-40-firefox-expensive/log.html
There was likely something racy about reloading the page (for the language change) and navigating at about the same time.
TestPages.testHistory likely runs into https://bugzilla.mozilla.org/show_bug.cgi?id=941146.
-> we need a naughty for this.
Perhaps just skip the test (or that part of the test) on Firefox?
Same fix as for testBasic also works for testMenuSearch: https://cockpit-logs.us-east-1.linodeobjects.com/pull-21061-0c1cea44-20241016-075719-fedora-40-firefox-expensive/log.html
Perhaps just skip the test (or that part of the test) on Firefox?
Nah, it's a good test for tricky business (getting browser history right) and we really should run it on Firefox. It does reach the end successfully. There is just the occasional oops. We could allow oopses, but that would be too broad.
We could invent machinery that ala allow_journal_messages, that allows specific oopses, but this here really is a bug in Firefox that might conceivably be fixed some day.
The naughty landed -- so a retry of the amplified run should go green now, with the naughty hitting?
- [ ] try to debug #18766 once more
Moved to #21125
The naughty landed -- so a retry of the amplified run should go green now, with the naughty hitting?
In principle yes, but the naughty pattern is too tight and doesn't trigger on a test named testHistory6...
Hmm, TestPages.testMenuSearch on ubuntu also flakes. Some service error show up later in the test although they have been cleared at the beginning. Let's fix that as well...
Hmm, TestPages.testMenuSearch on ubuntu also flakes. Some service error show up later in the test although they have been cleared at the beginning. Let's fix that as well...
It's fwupd-refresh.service that runs on a timer. It probly fails because it has no network connection.
Let's just reset the failure state immediately before the screenshot, that should make the time window impossible small to hit....
Thanks for tracking that down. That also seems fine to disable in test/vm.install.
That also seems fine to disable in test/vm.install.
Yeah, but then it's a different service that fails later on...
Let's just reset the failure state immediately
Ah, you meant a global systemctl reset-failed
. Yes, that seems right.
Oh no, the naughty pattern doesn't match, probably because the text it matches against doesn't actually contain the # testHistory
header... Without that header, the pattern is very unspecific, hmm.
Oh no, the naughty pattern doesn't match, probably because the text it matches against doesn't actually contain the
# testHistory
header... Without that header, the pattern is very unspecific, hmm.
Luckily @martinpitt figured it out. Extra whitespace, fixed.