Open sarahyurick opened 1 month ago
Note from Oliver about 2: "This file didn’t work because using secrets.GITHUB_TOKEN is not permitted to trigger other workflows. GH disallows that to prevent spam. You would need to use your personal access token (PAT). However, since we’re in a public environment, you would need to limit that to a protected branch (via environments). It’s still doable (from branch main, react to all issue change events), but a bit more tricky. As long as the PAT is not exposed to all branches, this is fine."
Additional tasks outside of GitHub:
NeMo files of interest:
Suggestion from @praateekmahajan: parametrizing the GPU tests with commonly used clients.
get_client(cluster_type="gpu", protocol="ucx")
get_client(cluster_type="gpu", protocol="tcp")
set_torch_to_use_rmm / enable_spilling
(We shouldn't try all the permutations, just the ones where we expect different behavior.)
Depending on how many GPUs the node has, we could even try a multi-GPU setup.
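A rough sketch of how that parametrization could be set up, in case it helps the discussion. The keyword names come from the list above. Which combinations actually behave differently is an assumption here, and `selected_configs` and `config_id` are hypothetical helper names, not existing NeMo Curator code.

```python
from itertools import product

# Option values taken from the get_client keyword arguments listed above.
PROTOCOLS = ("tcp", "ucx")
FLAGS = ("set_torch_to_use_rmm", "enable_spilling")


def all_permutations():
    """Every protocol x flag combination (the full grid we want to avoid)."""
    return [
        {
            "cluster_type": "gpu",
            "protocol": protocol,
            "set_torch_to_use_rmm": rmm,
            "enable_spilling": spill,
        }
        for protocol, rmm, spill in product(PROTOCOLS, (False, True), (False, True))
    ]


def selected_configs():
    """Hand-picked subset (an assumption): a baseline, then one change at a time."""
    base = {
        "cluster_type": "gpu",
        "protocol": "tcp",
        "set_torch_to_use_rmm": False,
        "enable_spilling": False,
    }
    return [
        base,
        {**base, "protocol": "ucx"},
        {**base, "set_torch_to_use_rmm": True},
        {**base, "enable_spilling": True},
    ]


def config_id(cfg):
    """Readable test id: the protocol followed by any enabled flags."""
    return "-".join([cfg["protocol"]] + [flag for flag in FLAGS if cfg[flag]])


# With pytest, the subset could then drive a GPU test, for example:
#   @pytest.mark.parametrize("client_kwargs", selected_configs(), ids=config_id)
#   def test_something(client_kwargs):
#       client = get_client(**client_kwargs)
```

This keeps the run at 4 client setups instead of the full 8, and the list is easy to extend if another combination turns out to matter.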
Regarding the gpuCI label:
/okay to test comments
There are a couple of GitHub Actions I want to add to NeMo Curator:
gpuci label if the PR is (1) created by a user with write access and (2) modifying at least 1 Python file
After the first one is merged, I can open separate PRs to add the others.
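For reference, the labeling condition described above boils down to something like the following. This is only a sketch of the decision logic, not the workflow itself. The function names are hypothetical, and the permission strings are an assumption modeled on GitHub's repository permission levels.

```python
from typing import Iterable

# Assumption: permission levels that count as "write access".
WRITE_LEVELS = {"admin", "maintain", "write"}


def touches_python(changed_files: Iterable[str]) -> bool:
    """True if at least one changed path is a Python file."""
    return any(path.endswith(".py") for path in changed_files)


def should_add_gpuci_label(author_permission: str, changed_files: Iterable[str]) -> bool:
    """Both conditions must hold: write access and at least one .py file changed."""
    return author_permission in WRITE_LEVELS and touches_python(changed_files)
```

For example, `should_add_gpuci_label("write", ["nemo_curator/foo.py"])` is true, while a docs-only PR or a PR from a user with read access would not get the label.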