Spico197 / DocEE

🕹️ A toolkit for document-level event extraction, containing some SOTA model implementations.
https://doc-ee.readthedocs.io/
MIT License

Distributed training #71

Closed by WindSearcher 11 months ago

WindSearcher commented 11 months ago

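Hello, I have a question about distributed training for PTPCG. As shown in the screenshot below, the job is launched on GPUs 0 and 7, but training actually seems to run only on GPU 0.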

image
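
One detail that may help when reading the GPU monitor in a setup like this: CUDA renumbers the devices listed in `CUDA_VISIBLE_DEVICES`, starting from 0 in the order given. The sketch below only prints that mapping. It assumes `GPUS` is a comma-separated list such as `0,7`, matching the cards mentioned in the question; the variable name `MAPPING` is made up for illustration.

```shell
# Assumption: GPUS holds the physical device indices passed to CUDA_VISIBLE_DEVICES.
GPUS="0,7"

# Inside the process, the listed devices appear as cuda:0, cuda:1, ...
# in the same order as they are written in the list.
MAPPING=$(echo "${GPUS}" | awk -F',' '{ for (i = 1; i <= NF; i++) printf "cuda:%d -> physical GPU %s\n", i - 1, $i }')

echo "${MAPPING}"
```

With this list, a process that only ever uses `cuda:0` would show activity on physical GPU 0 and leave GPU 7 idle.
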
WindSearcher commented 11 months ago

Found it: run_ptpcg.sh can follow the distributed training setup used in train_multi.sh.

WindSearcher commented 11 months ago

`# CUDA_VISIBLE_DEVICES=${GPUS} python -u run_dee_task.py \`

`CUDA_VISIBLE_DEVICES=${GPUS} python -m torch.distributed.launch --master_port=25662 --nproc_per_node ${NUM_GPUS} run_dee_task.py`

Just comment out the original line and add the new one below it.

Spico197 commented 11 months ago

Finally, you also need to add the `--parallel_decorate` flag.
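
Putting the replies in this thread together, the launch line could be assembled roughly as follows. This is a sketch, not the actual contents of run_ptpcg.sh: it assumes `GPUS` is a comma-separated device list and `NUM_GPUS` is its length, it leaves out the task arguments that the real script passes to run_dee_task.py, and it only prints the command (a dry run) instead of starting training. The `LAUNCH_CMD` variable is introduced here for illustration.

```shell
# Assumption: GPUS lists the device indices to use, e.g. cards 0 and 7.
GPUS="0,7"

# One worker process per listed GPU.
NUM_GPUS=$(echo "${GPUS}" | awk -F',' '{ print NF }')

# The script's usual task arguments are omitted; only the flag mentioned
# in this thread is shown after the script name.
LAUNCH_CMD="CUDA_VISIBLE_DEVICES=${GPUS} python -m torch.distributed.launch --master_port=25662 --nproc_per_node ${NUM_GPUS} run_dee_task.py --parallel_decorate"

# Dry run: print the command rather than executing it.
echo "${LAUNCH_CMD}"
```

Keeping `NUM_GPUS` derived from `GPUS` avoids the two values drifting apart when the device list is edited.
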