This PR enables multi-GPU training and adds auto-initialization of a Model.
It also introduces singlegpu and multigpu pytest markers to split the GPU CI GitHub Actions workflow into two jobs: one for the 1GPU runner and one for the multi-GPU 2GPU runner.
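As a rough illustration of how such markers can be registered and selected, here is a minimal sketch of a conftest.py. The marker descriptions, the `MARKERS` dict, and the `marker_for_runner` helper are assumptions made for illustration and are not taken from this PR; only `config.addinivalue_line("markers", ...)` is the standard pytest hook call.

```python
# conftest.py (sketch). Marker descriptions and helper names are hypothetical.
MARKERS = {
    "singlegpu": "test that needs exactly one GPU (1GPU runner)",
    "multigpu": "test that needs two or more GPUs (2GPU runner)",
}


def pytest_configure(config):
    """Register the custom markers so `--strict-markers` accepts them."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")


def marker_for_runner(gpu_count: int) -> str:
    """Return the `-m` expression a CI job could pass to pytest (hypothetical helper)."""
    if gpu_count < 1:
        raise ValueError("a GPU runner needs at least one GPU")
    return "singlegpu" if gpu_count == 1 else "multigpu"
```

A test would then be decorated with `@pytest.mark.singlegpu` or `@pytest.mark.multigpu`, and each CI job would select its subset with `pytest -m singlegpu` or `pytest -m multigpu`.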
Follow-up: The test in tests/integration is incomplete. Lightning launches separate processes under the hood with the correct environment variables, such as LOCAL_RANK, but pytest stays in the main process and tests only the LOCAL_RANK=0 case. A follow-up should add a proper test that ensures the dataloader works correctly with, e.g., global_rank > 0.
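One possible direction for that follow-up, sketched under strong assumptions: simulate non-zero ranks inside the main pytest process by patching the environment. The `shard_indices` function below is a hypothetical stand-in for the project's dataloader sharding logic, and `local_rank_from_env` and `check_shards_for_all_ranks` are illustrative helpers; none of them come from this PR. Patching LOCAL_RANK only imitates what Lightning sets in its worker processes and does not exercise real multi-process behavior.

```python
import os
from typing import List
from unittest import mock


def local_rank_from_env(default: int = 0) -> int:
    """Read LOCAL_RANK the way a rank-aware component might (hypothetical)."""
    return int(os.environ.get("LOCAL_RANK", default))


def shard_indices(num_samples: int, rank: int, world_size: int) -> List[int]:
    """Hypothetical stand-in for dataloader sharding: strided split by rank."""
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    return list(range(rank, num_samples, world_size))


def check_shards_for_all_ranks(num_samples: int, world_size: int) -> None:
    """Simulate every rank via the environment and check full, disjoint coverage."""
    seen: List[int] = []
    for rank in range(world_size):
        # patch.dict restores the original environment when the block exits.
        with mock.patch.dict(os.environ, {"LOCAL_RANK": str(rank)}):
            seen.extend(shard_indices(num_samples, local_rank_from_env(), world_size))
    # Every sample index appears exactly once across all simulated ranks.
    assert sorted(seen) == list(range(num_samples))
```

Looping over ranks in one process keeps the test cheap enough for the 1GPU runner, at the cost of not covering anything that depends on real process launch.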