When I ran Codellama-34b on two A100 cards, the first run worked, but when I ran it again later I got this error: ConnectionError: Tried to launch distributed communication on port 29500, but another process is utilizing it. Please specify a different port (such as using the --main_process_port flag or specifying a different main_process_port in your config file) and rerun your script. To automatically use the next open port (on a single node), you can set this to 0.
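For anyone hitting the same message, here is a minimal sketch of how one might check whether port 29500 is still held (for example by a leftover process from the earlier run) and pick another port to pass to --main_process_port. The helper names `port_is_free`, `find_free_port`, and `pick_port` are hypothetical and not part of any library; the sketch uses only the Python standard library and assumes a single node.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if we can bind to host:port, i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "address already in use": another process owns the port.
            return False
        return True


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for any currently unused port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]


def pick_port(preferred: int = 29500) -> int:
    """Use the preferred port when it is free, otherwise fall back to a free one."""
    return preferred if port_is_free(preferred) else find_free_port()


if __name__ == "__main__":
    # Print a port that can be passed to the launcher's --main_process_port flag.
    print(pick_port())
```

The printed value can then be passed along the lines of `accelerate launch --main_process_port <port> your_script.py` (the script name here is a placeholder). Note that the port could in principle be taken by another process between the check and the launch, so this is only a convenience, and setting the port to 0 as the error message suggests avoids the check entirely.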