alecandido opened this issue 4 months ago
> Here I have two alternatives:
>
> - since not all tests require CUDA, just mark those for which it is necessary with `@pytest.mark.cuda`, and filter them in `conftest.py` according to the value of the environment variable, within `pytest_runtest_setup`
Yup, this `@pytest.mark.cuda` marker sounds good to me. We can use it to cover the CUDA-related pytest functions.
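For reference, a minimal sketch of that hook, assuming a `CUDA_AVAILABLE` environment variable (the thread does not fix a name for it):

```python
# conftest.py: a minimal sketch of the suggested filtering hook.
# The CUDA_AVAILABLE variable name is an assumption, not fixed in the thread.
import os

import pytest


def pytest_configure(config):
    # Register the marker, so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "cuda: test requires a CUDA device")


def pytest_runtest_setup(item):
    # Skip tests marked with @pytest.mark.cuda when no GPU is advertised.
    if item.get_closest_marker("cuda") and os.environ.get("CUDA_AVAILABLE") != "1":
        pytest.skip("test requires CUDA")
```

A test decorated with `@pytest.mark.cuda` would then be skipped automatically on CPU-only runners, while all other tests keep running everywhere.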
> If you want to run this very same workflow file on a self-hosted runner, then you would have to do something more complex: I don't believe you can access the underlying env vars by default, since for the GitHub runners there are the Variables for that, and it is written explicitly in the docs for the `env` context: "It does not contain variables inherited by the runner process."
>
> However, you could run a first preliminary job, read the environment variable within bash code, and set it in the `$GITHUB_ENV` file; this should make it available in the `env` context (I'm not completely sure myself, I should test it).
>
> Option 1. is much more reasonable to me, because it is more flexible (you could still run some tests everywhere), it's defined in Python, so you're already familiar with it (instead of looking around in GitHub docs), and you can test it offline on various systems, without having to fiddle with GitHub Actions.
Originally posted by @alecandido in https://github.com/qiboteam/qibotn/pull/24#pullrequestreview-1869833153
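For concreteness, a sketch of the quoted `$GITHUB_ENV` idea (the variable, job, and step names are hypothetical; note that values written to `$GITHUB_ENV` are visible to later steps of the same job, while sharing them across jobs would need job outputs instead):

```yaml
# Sketch only: read the runner's own environment in one step and persist it,
# so that later steps can consume it as a regular environment variable.
jobs:
  tests:
    runs-on: self-hosted
    steps:
      - name: Detect CUDA on the runner
        run: echo "CUDA_AVAILABLE=${CUDA_AVAILABLE:-0}" >> "$GITHUB_ENV"
      - name: Use the detected value
        run: echo "CUDA_AVAILABLE is $CUDA_AVAILABLE"
```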
As for option 2., the self-hosted runner is desired for covering the tests of the GPU backends. Actually, we are providing a GPU server (most likely an RTX 3090) as the self-hosted runner. Then, we can use the preceding `check` job to do the preliminary checking on CUDA availability. This part is already there in the code; now, the key is to set up the self-hosted runner in a proper way.
Concerning the `check` job, I believe it won't be needed any longer.

Every job should have a `runs-on:` key, specifying where to run. If you wish to self-host, the value should be `self-hosted` (or some other self-hosted label):
https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/using-self-hosted-runners-in-a-workflow#using-default-labels-to-route-jobs
https://github.com/qiboteam/qibolab/blob/d080141c9a4fd2160e5b159205eded3256333c75/.github/workflows/selfhosted.yml#L46
So, you can now add a `matrix`, and run the exact same job on both runners.
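For illustration, such a matrix could look roughly like this (the labels are illustrative, and the self-hosted one must match the runner's actual configuration):

```yaml
# Sketch only: the same test job runs once on the GitHub-hosted runner
# and once on the self-hosted one; the cuda-marked tests are then
# filtered at the Pytest level, as discussed above.
jobs:
  test:
    strategy:
      matrix:
        runner: [ubuntu-latest, self-hosted]
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v4
      - run: pytest
```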
However, using the `check` job, you can now determine the environment variable. But then you want to run on both anyhow, because the CPU tests will run on GitHub, and CPU + GPU (or GPU only, in case we wanted to save resources) on the self-hosted runner.

Thus, in this setup it is not very useful to have a `check` job, because there are no jobs to filter at the workflow level, if you can filter tests at the Pytest level (and you can, since in Pytest you'll be able to access the environment). Simply, if you want to share the build job in both (and it would make sense), the mark should not be added in the workflow, but it should be resolved in a Pytest hook.
If you're worried instead about the actual `build` job (that job is called like that by accident, propagating the content of `workflows@v1`, fixed in `v2`), that has a separate workflow, and it does not require a GPU. So, it will keep building on GitHub, as it is doing right now :)
> Thus, in this setup it is not very useful to have a `check` job, because there are no jobs to filter at the workflow level, if you can filter tests at the Pytest level (and you can, since in Pytest you'll be able to access the environment). Simply, if you want to share the build job in both (and it would make sense), the mark should not be added in the workflow, but it should be resolved in a Pytest hook.
I see. So now, the key is filtering tests at the Pytest level, instead of with a preceding `check` job. Fair enough.
> If you're worried instead about the actual `build` job (that job is called like that by accident, propagating the content of `workflows@v1`, fixed in `v2`), that has a separate workflow, and it does not require a GPU. So, it will keep building on GitHub, as it is doing right now :)
Indeed, previously I was a bit worried about the `build` job that involved GPU packages, because that's where it failed in CI. And the `check` job was mostly created to work around this. Thanks for the fix in `v2`.