Closed (Yan1026 closed this 2 years ago)
This doesn't need to be fixed. Wandb sometimes runs into network issues, but it reconnects automatically and training is not affected.
You can also run without syncing to wandb: simply set it to "offline".
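A minimal sketch of one way to force offline mode from inside the script, assuming wandb is imported and initialised somewhere later in "main.py". The helper name force_wandb_offline is hypothetical; WANDB_MODE is the environment variable wandb reads for its mode. Unlike the 'wandb offline' command, this does not depend on which directory the script is launched from.

```python
import os


def force_wandb_offline(env=None):
    """Set WANDB_MODE=offline in the given mapping (os.environ by default).

    Hypothetical helper: call it before wandb.init() runs so that wandb
    only writes run data locally and does not try to reach the network.
    """
    env = os.environ if env is None else env
    env["WANDB_MODE"] = "offline"
    return env


# Run this at the very top of the entry script, before wandb is initialised.
force_wandb_offline()
```

The same effect can usually be had from the shell by exporting WANDB_MODE=offline before launching the training script.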
Sorry to bother you, but I ran 'wandb offline' and got 'W&B offline, running your script from this directory will only write metadata locally.'
I set it to 'offline', but the output file still shows: 0%| | 0/289 [00:00<?, ?it/s]wandb: Network error (ConnectionError), entering retry loop.
That is so weird; I haven't run into this issue before.
Normally, wandb reconnects to the network automatically and training is not affected. May I ask whether your training is actually interrupted by this issue?
Also, please add
os.environ['WANDB_START_METHOD'] = 'fork'
on top of this line.
If it still doesn't work, please comment out all the wandb-related function calls in both "main.py" and "train.py".
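As an alternative to commenting calls out by hand, here is a rough sketch of a switchable logger. NullLogger and make_logger are hypothetical names, not part of this repository, and the sketch assumes the training code only uses the run object's log and finish methods.

```python
class NullLogger:
    """Stand-in that accepts the same calls as a wandb run and does nothing."""

    def log(self, *args, **kwargs):
        return None

    def finish(self):
        return None


def make_logger(use_wandb, **init_kwargs):
    """Return a real wandb run if requested, otherwise a no-op logger."""
    if not use_wandb:
        return NullLogger()
    import wandb  # imported lazily so the script runs without wandb installed
    return wandb.init(**init_kwargs)


# On a server without Internet access, disable wandb entirely.
logger = make_logger(use_wandb=False)
logger.log({"loss": 0.51})
logger.finish()
```

wandb.init also accepts mode="disabled", which turns its calls into no-ops, so that may be enough if the package itself is installed.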
Cheers, Yuyuan
Thank you for the reply. I can't use wandb because the server can't connect to the Internet while it's training.
Strangely, the output file stops updating but the process is still running (I checked with 'nvidia-smi').
Thank you for your advice!
Happy to help!
I ran ./scripts/train_voc_aug.sh -l 1323 -g 4 -b 101 but got this error:
ID 3 Warm (4) | Ls 0.51 |: 98%|█████████████████████████████████████████████████████████████████████▎ | 40/41 [01:04<00:00, 1.18it/s] ID 3 Warm (4) | Ls 0.51 |: 98%|█████████████████████████████████████████████████████████████████████▎ | 40/41 [01:14<00:00, 1.18it/s] ID 3 Warm (4) | Ls 0.51 |: 100%|███████████████████████████████████████████████████████████████████████| 41/41 [01:14<00:00, 3.65s/it] ID 3 Warm (4) | Ls 0.51 |: 100%|███████████████████████████████████████████████████████████████████████| 41/41 [01:14<00:00, 1.81s/it]
0%| | 0/289 [00:00<?, ?it/s]wandb: Network error (ConnectionError), entering retry loop.
How can I solve this problem? Can I run without wandb?