rust-gamedev / wg

Coordination repository of the Game Development Working Group

GPU testing on CI #31

Open kvark opened 5 years ago

kvark commented 5 years ago

Imagine you are developing a game targeting many platforms and hardware vendors. You want to ensure that as features get added, the game still works on at least a few configurations. You probably create some sort of engine demo, or a reftest suite, to expose those features without running the full game. Or maybe you just have a game replay that you can load in a standard way. Next step: how do you run it continuously?

This is the problem many Rust gamedev projects face. The ecosystem would benefit from a standardized solution to do GPU testing on CI. In particular, gfx-rs, wgpu, and I imagine Amethyst would want to run tests on a variety of GPUs and platforms.

Some time ago I asked the Bors community about this - https://forum.bors.tech/t/gpu-build-test-farm/103 . Basically, we need some combination of hardware and software to achieve this. It's doable, but it's hard to invest in this if it's just a custom solution for one project. Perhaps, together, we can get this done?

Lokathor commented 5 years ago

I think that this is a really great idea.

I've started a post here three times now and abandoned it twice so far. Basically what I want to say is "I've never heard of this being easily available", but I also don't want to be discouraging.

I hope that you find something in this space, it'd be a big breakthrough.

hadronized commented 5 years ago

I’ve been thinking about and wanting such a thing for a while now, too. Something with a headless mode in Vulkan / OpenGL. I’m interested in this for luminance. I remember a discussion with @nical when he told me they have something like that at Mozilla, but I’m not sure it’s open-source.

AlexEne commented 5 years ago

I think there was a good talk on this at FOSDEM 2019: https://fosdem.org/2019/schedule/event/igt_ci/

nical commented 5 years ago

All of the Mozilla-specific parts of our CI infra are open-source as far as I know.

Running tests on real hardware (as opposed to things like headless llvmpipe) is important to make sure we test a configuration that is close to what users get. It's tricky to set up automation for, though, and things render slightly differently on each platform/GPU, so you can't simply test that each pixel exactly matches a reference image. In Firefox's case, each reference test specifies how many pixels can differ and by how much. If the hardware on the CI changes, reference tests often need to be updated (at least the fuzziness thresholds).
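To make the "how many pixels can differ and by how much" idea concrete, here is a minimal sketch in Rust. It is not Firefox's implementation; the names `FuzzyDiff`, `fuzzy_diff`, and `within_tolerance` are hypothetical, and it assumes both images are tightly packed RGBA8 buffers of the same size.

```rust
/// Summary of how a rendered frame differs from its reference image.
#[derive(Debug, PartialEq, Eq)]
pub struct FuzzyDiff {
    /// Number of pixels in which at least one channel differs.
    pub differing_pixels: usize,
    /// Largest absolute per-channel difference found in any pixel.
    pub max_channel_diff: u8,
}

/// Compare two RGBA8 buffers (assumed layout: 4 bytes per pixel).
/// Returns `None` if the buffers differ in length or are not a
/// whole number of pixels.
pub fn fuzzy_diff(actual: &[u8], reference: &[u8]) -> Option<FuzzyDiff> {
    if actual.len() != reference.len() || actual.len() % 4 != 0 {
        return None;
    }
    let mut differing_pixels = 0;
    let mut max_channel_diff = 0u8;
    for (a, r) in actual.chunks_exact(4).zip(reference.chunks_exact(4)) {
        // Largest channel difference within this one pixel.
        let pixel_max = a
            .iter()
            .zip(r)
            .map(|(x, y)| x.abs_diff(*y))
            .max()
            .unwrap_or(0);
        if pixel_max > 0 {
            differing_pixels += 1;
            max_channel_diff = max_channel_diff.max(pixel_max);
        }
    }
    Some(FuzzyDiff { differing_pixels, max_channel_diff })
}

/// A reftest passes if no more than `max_pixels` pixels differ and
/// none of them differs by more than `max_diff` in any channel.
pub fn within_tolerance(diff: &FuzzyDiff, max_pixels: usize, max_diff: u8) -> bool {
    diff.differing_pixels <= max_pixels && diff.max_channel_diff <= max_diff
}
```

Each test would carry its own `(max_pixels, max_diff)` pair, and those pairs are what would need re-tuning when the CI hardware changes.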

Marionette is the name of (a part of?) this automation system, but I don't know much about it. A big part of the issue, I suppose, is that a lot of cloud services where one would want to run these tests don't provide access to a GPU. A lot of Firefox's tests run on infrastructure maintained by Mozilla, and maintaining this infra is a lot of work.

hecrj commented 5 years ago

I am also interested in this for https://github.com/hecrj/coffee/issues/16.

My plan is to use my own desktop PC as a Drone runner for the time being, running it only before merging changes into master.

I understand this strategy doesn't scale very well and doesn't test on different platforms/hardware, but maybe it could be a starting point for some of us?

kvark commented 5 years ago

@hecrj it doesn't scale. But there is still a great deal of value in automating and carefully describing the steps a person needs to take to set this up similarly for their own projects (using their own hardware). If you happen to go through this process, please document the steps for us to try ;)

azriel91 commented 5 years ago

For Vulkan, I've managed to get CI going with my old laptop with a GPU. The short version:

  1. Get Vulkan drivers working; test with vulkaninfo.
  2. Log into X with the CI user (which means giving it a password).
  3. Export the DISPLAY variable in the build (e.g. DISPLAY=:0; some OSes use :1).
  4. Run CI tests as usual.

Most of my project's tests that use Vulkan work, but some do fail spuriously with "Failed to create glyph texture: AllocationError(OutOfMemory(OutOfDeviceMemory))". This may be an issue with an old version of rendy, but I haven't gotten around to upgrading yet.

edit: preventing the screen from going blank prevents that error from happening.

The long version (all the frustrating failure cases, mainly my notes when experimenting):

https://gist.github.com/azriel91/a236daafa4e5be3b122745239d03596c

I know the "log in to X" part is infeasible when you can't really touch the machine, but I spent an unhealthy number of hours trying to do it via scripts, to no avail.
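As a rough illustration of steps 1 and 3 of the short version, a CI job could run a small preflight check before the real tests. This is only a sketch: `display_or_default` and `vulkaninfo_ok` are hypothetical helper names, the `:0` fallback is an assumption taken from the example above, and it relies on `vulkaninfo` being on the `PATH` and exiting non-zero when no working driver is found.

```rust
use std::process::{Command, Stdio};

/// Pick the X display to use: the value already set in the
/// environment if it is non-empty, otherwise ":0" (assumed default;
/// some systems use ":1" instead).
pub fn display_or_default(current: Option<String>) -> String {
    current
        .filter(|d| !d.is_empty())
        .unwrap_or_else(|| ":0".to_string())
}

/// Run `vulkaninfo` against the given display and report whether it
/// exited successfully. Returns `false` if the tool is missing.
pub fn vulkaninfo_ok(display: &str) -> bool {
    Command::new("vulkaninfo")
        .env("DISPLAY", display)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}
```

A build script could call `display_or_default(std::env::var("DISPLAY").ok())`, pass the result to `vulkaninfo_ok`, and fail the job early with a clear message instead of letting every GPU test fail on its own.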


For OpenGL, a software renderer works fine; I got it running on GitLab's shared runners using Xvfb with the GLX extension enabled. I've got scripts for this that let you write:

# handles running XVFB and exporting `DISPLAY`
xvfb_start

cargo test

# handles stopping XVFB -- clean up after yourself!
xvfb_stop

If interested, I can pull those scripts out of my repo at some point.
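For anyone who wants the same start/stop behaviour from a Rust test harness rather than shell, here is a hedged sketch. It is not the script mentioned above: `xvfb_command` and `XvfbGuard` are hypothetical names, the display number and screen size are arbitrary, and it assumes an `Xvfb` binary is installed. It also does not wait for the server to become ready, which a real setup would need to handle.

```rust
use std::io;
use std::process::{Child, Command, Stdio};

/// Build the command that launches Xvfb on `:<display>` with one
/// 1280x1024, 24-bit screen and the GLX extension enabled.
pub fn xvfb_command(display: u32) -> Command {
    let mut cmd = Command::new("Xvfb");
    cmd.arg(format!(":{display}"))
        .args(["-screen", "0", "1280x1024x24"])
        .args(["+extension", "GLX"])
        .stdout(Stdio::null())
        .stderr(Stdio::null());
    cmd
}

/// Owns a running Xvfb process and stops it when dropped, so the
/// cleanup step cannot be forgotten.
pub struct XvfbGuard {
    child: Child,
    display: String,
}

impl XvfbGuard {
    /// Spawn Xvfb; fails if the binary cannot be started.
    pub fn start(display: u32) -> io::Result<Self> {
        let child = xvfb_command(display).spawn()?;
        Ok(Self { child, display: format!(":{display}") })
    }

    /// Value to export as `DISPLAY` for the tests.
    pub fn display(&self) -> &str {
        &self.display
    }
}

impl Drop for XvfbGuard {
    fn drop(&mut self) {
        // Best-effort shutdown; errors are ignored because the
        // process may already have exited.
        let _ = self.child.kill();
        let _ = self.child.wait();
    }
}
```

A harness would call `XvfbGuard::start(99)`, run `cargo test` with `DISPLAY` set to `guard.display()`, and let the guard go out of scope to stop the server.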

aktech commented 3 years ago

Hi, I am the creator of Cirun.io; "GPU" and "CI" caught my eye.

FWIW I'll share my two cents. I created a service for problems like these, which is basically running custom machines (including GPUs) in GitHub Actions: https://cirun.io/

It is used in multiple open source projects needing GPU support like the following:

https://github.com/pystatgen/sgkit/ https://github.com/qutip/qutip-cupy

It is fairly simple to set up: all you need is a cloud account (AWS or GCP) and a simple YAML file describing what kind of machines you need, and Cirun will spin up ephemeral machines on your cloud for GitHub Actions to run on. It's native to the GitHub ecosystem, which means you can see logs and trigger runs in GitHub's interface itself, just like any GitHub Actions run.

Also, note that Cirun is free for open source projects. (You only pay your cloud provider for machine usage.)