This pull request introduces a Docker-based environment for this CUDA application, using Docker Compose to streamline development and execution on GPU systems. It adds a Dockerfile based on the CUDA 12.1 image and a matching Docker Compose configuration, so the application can be built and run in a GPU-accelerated environment without manual setup. With this in place, executing the code in a predefined environment is as simple as running:
docker compose up
Add Dockerfile based on CUDA 12.1 image
Add compose config to build and run on GPU easily
Add pyproject.toml with poetry dependencies
Add poetry.lock with locked and verified dependency versions
Add ignore files with common setup for Python repos
Add debug logging and error catching to the torchrun main script
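The actual configuration files are not shown in this description, but a Compose service granting the container GPU access typically looks like the following minimal sketch (the service name `app` and build context are assumptions; the `deploy.resources.reservations.devices` block is the standard Compose syntax for requesting NVIDIA GPUs):

```yaml
services:
  app:
    # Build from the Dockerfile in the repository root (assumed location)
    build: .
    deploy:
      resources:
        reservations:
          devices:
            # Request all available NVIDIA GPUs for the container
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

With such a file in place, `docker compose up` builds the image and starts the service with GPU passthrough, provided the host has the NVIDIA Container Toolkit installed.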
@tomas-gajarsky Thanks for providing the Docker environment. Since we are still actively modifying the core training code, we are not planning to adopt a Docker wrapper for now.