ParaGen is a PyTorch deep learning framework for parallel sequence generation. Beyond sequence generation, ParaGen also supports a range of NLP tasks, including sequence-level classification, extraction, and generation.
# Ubuntu
apt-get install libopenmpi-dev libssl-dev openssh-server
# CentOS
yum install openmpi openssl openssh-server
# Conda
conda install -c conda-forge mpi4py
cd ParaGen
pip install -e .
Run ParaGen with torch.distributed:
python -m torch.distributed.launch --nproc_per_node {GPU_NUM} paragen/entries/run.py --configs {config_file}
You can also use horovod for distributed training. Install horovod with:
# CMake is required to install horovod (https://cmake.org/install/).
HOROVOD_WITH_PYTORCH=1 HOROVOD_GPU_OPERATIONS=NCCL HOROVOD_NCCL_HOME=${NCCL_ROOT_DIR} pip install horovod
Then run ParaGen with horovod:
horovodrun -np {GPU_NUM} -H localhost:{GPU_NUM} paragen-run --config {config_file}
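The two launch lines above share the same placeholders, so they can be filled in from a script. The following is a minimal sketch, not part of ParaGen: the helper names `torch_launch_cmd` and `horovod_cmd` are hypothetical, and the flags are copied verbatim from the commands above (including `--configs` for the torch.distributed entry and `--config` for `paragen-run`).

```python
import shlex
from typing import List


def torch_launch_cmd(gpu_num: int, config_file: str) -> List[str]:
    """Build the torch.distributed launch line quoted above (hypothetical helper)."""
    return [
        "python", "-m", "torch.distributed.launch",
        "--nproc_per_node", str(gpu_num),
        "paragen/entries/run.py", "--configs", config_file,
    ]


def horovod_cmd(gpu_num: int, config_file: str) -> List[str]:
    """Build the horovodrun line quoted above (hypothetical helper)."""
    return [
        "horovodrun", "-np", str(gpu_num),
        "-H", f"localhost:{gpu_num}",
        "paragen-run", "--config", config_file,
    ]


# Print a copy-pasteable command; pass the list to subprocess.run to execute it.
print(shlex.join(horovod_cmd(4, "config.yaml")))
```

Returning an argument list rather than a single string keeps the command safe to hand to `subprocess.run` without shell quoting issues.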
pip install lightseq
Before using ParaGen, it helps to understand how it works.
ParaGen is designed as a task-oriented framework, where the task is regarded as the core of all the code.
A specific task selects all the components it needs, such as model architectures, training strategies, datasets, and data processing.
Any component within ParaGen can be customized, while the existing modules and methods are used as a plug-in library.
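To make the idea of a task selecting its components from a plug-in library more concrete, here is a minimal sketch of the general pattern. It is not ParaGen's actual API: `REGISTRY`, `register`, `ToyTransformer`, `ToyTextDataset`, `ToyTask`, and the configuration layout are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict

# Hypothetical plug-in library: component kind -> name -> class.
REGISTRY: Dict[str, Dict[str, Callable]] = {"model": {}, "dataset": {}}


def register(kind: str, name: str):
    """Class decorator that adds a component to the plug-in library."""
    def decorator(cls):
        REGISTRY[kind][name] = cls
        return cls
    return decorator


@register("model", "toy_transformer")
class ToyTransformer:
    def __init__(self, hidden_size: int = 8):
        self.hidden_size = hidden_size


@register("dataset", "toy_text")
class ToyTextDataset:
    def __init__(self, path: str = "train.txt"):
        self.path = path


class ToyTask:
    """A task looks up each component by name and builds it from the config."""

    def __init__(self, config: dict):
        self.components = {
            kind: REGISTRY[kind][spec["class"]](**spec.get("args", {}))
            for kind, spec in config.items()
        }


config = {
    "model": {"class": "toy_transformer", "args": {"hidden_size": 16}},
    "dataset": {"class": "toy_text"},
}
task = ToyTask(config)
print(type(task.components["model"]).__name__)
```

In this sketch, customizing a component means registering a new class under a new name and pointing the configuration at it; the task code itself stays unchanged.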
Because tasks are the core of ParaGen, it works in various modes, such as train, evaluate, preprocess and serve.
Tasks behave differently in each mode by reorganizing their components, with no code modification required.
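The mode-dependent behavior can be pictured as the same task wiring up a different subset of components per mode. The sketch below is an assumption-laden illustration, not ParaGen code: the class `ToyModeTask` and the particular mapping from modes to components are hypothetical; only the four mode names come from the text above.

```python
class ToyModeTask:
    # Hypothetical mapping of each mode to the components it wires together.
    MODE_COMPONENTS = {
        "train": ["dataset", "model", "trainer"],
        "evaluate": ["dataset", "model", "evaluator"],
        "preprocess": ["dataset"],
        "serve": ["model", "generator"],
    }

    def __init__(self, mode: str):
        if mode not in self.MODE_COMPONENTS:
            raise ValueError(f"unknown mode: {mode!r}")
        self.mode = mode

    def build(self):
        """Return the names of the components this mode would assemble."""
        return list(self.MODE_COMPONENTS[self.mode])


# The same task class is reused; only the mode argument changes.
for mode in ("train", "evaluate", "preprocess", "serve"):
    print(mode, ToyModeTask(mode).build())
```

Switching modes here is a matter of passing a different argument, which mirrors the claim that components are reorganized without modifying code.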
Please refer to examples for detailed instructions.
We welcome any experimental algorithms on ParaGen.
third_party;