I first ran ./run_docker.sh and then ran cola.sh base in experiments/glue/, but got the following error:
/usr/bin/python: Error while finding module specification for 'DeBERTa.apps.run' (ModuleNotFoundError: No module named 'DeBERTa')
I then installed DeBERTa with pip install DeBERTa, but still got an error:
07/30/2022 04:35:25|ERROR|CoLA|00| Uncatched exception happened during execution.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/DeBERTa/apps/run.py", line 389, in <module>
main(args)
File "/usr/local/lib/python3.6/dist-packages/DeBERTa/apps/run.py", line 248, in main
device = initialize_distributed(args)
File "/usr/local/lib/python3.6/dist-packages/LASER/training/dist_launcher.py", line 110, in initialize_distributed
return _setup_distributed_group(args)
File "/usr/local/lib/python3.6/dist-packages/LASER/training/dist_launcher.py", line 64, in _setup_distributed_group
init_method=init_method)
File "/usr/local/lib/python3.6/dist-packages/torch/distributed/distributed_c10d.py", line 455, in init_process_group
barrier()
File "/usr/local/lib/python3.6/dist-packages/torch/distributed/distributed_c10d.py", line 1960, in barrier
work = _default_pg.barrier()
RuntimeError: NCCL error in: /pytorch/torch/lib/c10d/ProcessGroupNCCL.cpp:784, unhandled system error, NCCL version 2.7.8