beeware / toga

A Python native, OS native GUI toolkit.
https://toga.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Run testbed with Wayland #2670

Closed · rmartin16 closed this 1 week ago

rmartin16 commented 1 week ago


rmartin16 commented 1 week ago

This is the approach I landed on today. Feel free to let me know if this strategy makes sense to you...or doesn't. From here, the failing tests just need to be addressed one way or another. It may also be worth combining the duplicated matrix fields for the two Linux jobs.

freakboy3742 commented 1 week ago

I guess the thing that doesn't make sense is the use of xvfb-run... are you sure it's actually running as Wayland? There's a bunch of functionality that should be tested (like getting an image of the current screen), but that isn't showing up as test failure as I'd expect.

Other than that (and the other miscellaneous test failures), the approach looks fine to me.

rmartin16 commented 1 week ago

> I guess the thing that doesn't make sense is the use of xvfb-run... are you sure it's actually running as Wayland?

I'm relatively confident. If you run mutter without xvfb-run in a local Linux distro install, it spawns a window for the Wayland display and you can watch the testbed tests run. Furthermore, all the test failures align with the failures in a natively Wayland environment like Fedora 40.

> There's a bunch of functionality that should be tested (like getting an image of the current screen), but that isn't showing up as test failure as I'd expect.

Are you thinking of this one or a different one?

FAILED tests/widgets/test_canvas.py::test_multiline_text - Failed: Rendered image doesn't match reference (RMSE==0.11368523797456337)

In general, I was surprised this ended up requiring xvfb. All of the other approaches required running a Wayland compositor in its headless mode, and GTK currently just does not play well with that; running the compositor in X via xvfb avoids headless mode and all its problems.
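For illustration, the approach described above (a Wayland compositor running nested inside a virtual X server) might look roughly like this as a CI step. This is a hypothetical sketch only: the step name, the socket name, and the test command are placeholders, not the PR's actual workflow.

```yaml
# Hypothetical sketch; not the actual workflow from this PR.
- name: Run testbed under Wayland
  run: |
    # Run mutter as a nested Wayland compositor inside a virtual X display,
    # avoiding mutter's headless mode (which GTK currently has trouble with).
    xvfb-run --auto-servernum mutter --wayland --nested &
    sleep 2  # give the compositor time to create its Wayland socket
    # Point clients at the nested compositor's socket; the socket name and
    # the test invocation below are placeholders.
    WAYLAND_DISPLAY=wayland-0 python -m pytest testbed/tests
```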

freakboy3742 commented 1 week ago

> There's a bunch of functionality that should be tested (like getting an image of the current screen), but that isn't showing up as test failure as I'd expect.
>
> Are you thinking of this one or a different one?
>
> FAILED tests/widgets/test_canvas.py::test_multiline_text - Failed: Rendered image doesn't match reference (RMSE==0.11368523797456337)

That test failure is somewhat expected - the canvas rendering tests are highly platform dependent. I was thinking more of this test - based on the implementation, that shouldn't be able to pass.
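For context on that failure mode: the canvas tests compare a rendered image against a reference and fail when the root-mean-square error exceeds a tolerance. A minimal sketch of that kind of comparison (not the testbed's actual implementation; function names and the tolerance value are illustrative):

```python
import math


def rmse(rendered, reference):
    """Root-mean-square error between two equal-length pixel sequences,
    with pixel values normalized to [0, 1]."""
    assert len(rendered) == len(reference)
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    )


def images_match(rendered, reference, tolerance=0.1):
    # A small tolerance absorbs platform-dependent differences in font
    # rendering and antialiasing, rather than requiring exact pixels.
    return rmse(rendered, reference) <= tolerance
```

With a threshold like 0.1, an RMSE of 0.1137 (as in the failure above) is just over the line, which is why platform-dependent rendering differences can tip these tests either way.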

rmartin16 commented 1 week ago

ahh...in that case, pytest appears to be skipping the test.

tests/app/test_screens.py::test_as_image SKIPPED (Screen.as_image() is not implemented on wayland.)     [  4%]

freakboy3742 commented 1 week ago

> ahh...in that case, pytest appears to be [skipping]

Huh - that's some forward thinking that I don't remember :-)

rmartin16 commented 1 week ago

Along with allowing for testing Wayland in CI, this should also allow devs to more easily run the testbed tests locally.

However, multiple monitor setups where the primary monitor is not the far left monitor will likely still fail. This kinda makes me wonder if all the CI logic to set up the environment to run the testbed tests shouldn't be in tox instead...

rmartin16 commented 1 week ago

Couple questions I was thinking about:

1. Should we consider moving the setup logic in CI for running testbed into tox? That would definitely simplify running it locally.
2. Should we consider splitting the testing sources that contain a large if block for mobile vs desktop?

freakboy3742 commented 1 week ago

> Couple questions I was thinking about:
>
> 1. Should we consider moving the setup logic in CI for running testbed into tox? That would definitely simplify running it locally.

I guess it could be helpful. You don't need to run the actual CI configuration very often, but when you do, it would be nice to have an easy way to replicate it locally without needing to spelunk through the CI YAML.
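If that were done, a tox environment might look roughly like this. This is a hypothetical sketch only: the environment name, the external tools, and the command are assumptions, not an actual proposal from this PR.

```ini
# tox.ini -- hypothetical sketch; env name and command are placeholders
[testenv:testbed]
description = Replicate the CI testbed setup locally
allowlist_externals = xvfb-run
# Placeholder command; a real version would mirror the CI workflow steps.
commands = xvfb-run --auto-servernum briefcase run --test
```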

> 2. Should we consider splitting the testing sources that contain a large if block for mobile vs desktop?

I've had the same thought myself recently. I've been breaking the dialog tests into their own modules for similar reasons; given the number of mobile vs desktop tests, it makes sense to break them out as well.