apoorvkh / torchrunx

Automatically initialize distributed PyTorch environments
https://torchrunx.readthedocs.io
MIT License

torchrunx 🔥

Automatically launch functions and initialize distributed PyTorch environments on multiple machines

Installation

pip install torchrunx

Requirements:

Usage

# Simple example
import torchrunx as trx


def distributed_function():
    pass  # placeholder for the code each worker runs

trx.launch(
    func=distributed_function,
    func_kwargs={},
    hostnames=["node1", "node2"],  # or just: ["localhost"]
    workers_per_host=2
)
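
The function passed as func runs once per worker. As a rough sketch of what such a function might do, the example below splits a list of work items across workers by rank. It assumes each worker process can read RANK and WORLD_SIZE from its environment, following the torchrun convention; this README does not confirm that, and shard_for_rank is a hypothetical helper, not part of torchrunx.

```python
import os


def shard_for_rank(items, rank, world_size):
    """Return the slice of `items` owned by `rank` (round-robin split)."""
    return items[rank::world_size]


def distributed_function(items):
    # Assumption: the launcher exports RANK and WORLD_SIZE for each worker.
    # The defaults let the function also run as a single local process.
    rank = int(os.environ.get("RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))
    return shard_for_rank(items, rank, world_size)
```

With this shape, the work items would be passed through func_kwargs, e.g. func_kwargs={"items": list(range(10))}.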

In a SLURM allocation

trx.launch(
    # ...
    hostnames=trx.slurm_hosts(),
    workers_per_host=trx.slurm_workers()
)
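
For intuition about where these values can come from, here is a minimal sketch that derives a host list and a per-host worker count from the environment variables Slurm sets inside an allocation (SLURM_JOB_NODELIST and SLURM_NTASKS_PER_NODE). This is not how trx.slurm_hosts() or trx.slurm_workers() are implemented, which this README does not describe. expand_nodelist and hosts_and_workers_from_env are hypothetical helpers, and the parser only handles a single bracket group per entry.

```python
import os
import re


def expand_nodelist(nodelist):
    """Expand a compact Slurm node list such as 'node[1-3,7],login1'."""
    hosts = []
    # Split on top-level commas, keeping an optional bracket group attached.
    for entry in re.findall(r"[^,\[]+(?:\[[^\]]+\])?", nodelist):
        match = re.fullmatch(r"([^\[]+)\[([^\]]+)\]", entry)
        if match is None:
            hosts.append(entry)  # plain hostname, nothing to expand
            continue
        prefix, ranges = match.groups()
        for item in ranges.split(","):
            start, _, end = item.partition("-")
            end = end or start  # a single index like '7' is a range of one
            width = len(start)  # preserve zero padding, e.g. '08'
            for i in range(int(start), int(end) + 1):
                hosts.append(f"{prefix}{i:0{width}d}")
    return hosts


def hosts_and_workers_from_env(env=os.environ):
    # Fallbacks are illustrative defaults for running outside an allocation.
    hosts = expand_nodelist(env.get("SLURM_JOB_NODELIST", "localhost"))
    workers = int(env.get("SLURM_NTASKS_PER_NODE", "1"))
    return hosts, workers
```

In practice, prefer the library's own helpers shown above; a hand-written parser like this one will miss node list forms that Slurm itself handles.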

Compared to other tools

Contributing

We use the pixi package manager. Simply install pixi and run pixi shell in this repository. We use ruff for linting and formatting, pyright for static type checking, and pytest for testing. We build for PyPI and conda-forge. Our release pipeline is powered by GitHub Actions.