Closed Winsome-A closed 1 year ago
Hi @Winsome-A, I have never seen this error. Could you provide some extra information? Which pytorch-lightning version are you using?
Oh yeah. Thanks for your email. I upgraded my CUDA and cuDNN versions, and the problem eventually went away.
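For anyone hitting a similar hang, the version details asked for above (pytorch-lightning, plus the CUDA and cuDNN versions that turned out to matter here) can be gathered in one go. The following is only a minimal sketch, not part of COMET: the helper name `collect_versions` is hypothetical, and the package list assumes COMET was installed from PyPI as `unbabel-comet`. A package that is not installed is reported as `None`.

```python
import importlib.metadata as metadata


def collect_versions(packages=("pytorch-lightning", "torch", "unbabel-comet")):
    """Return a dict mapping each package name to its installed version.

    Hypothetical helper for bug reports; packages that are not installed
    map to None. The default package names are assumptions.
    """
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None

    # CUDA / cuDNN versions as seen by PyTorch, if PyTorch can be imported.
    try:
        import torch

        report["cuda (torch build)"] = torch.version.cuda  # None on CPU-only builds
        report["cudnn"] = torch.backends.cudnn.version()   # None if cuDNN is unavailable
        report["cuda available"] = torch.cuda.is_available()
    except (ImportError, OSError):
        report["cuda (torch build)"] = None
        report["cudnn"] = None
        report["cuda available"] = None
    return report


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue alongside the training command gives maintainers the environment details in one place.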
❓ Questions and Help
When I train my COMET model, it gets almost all the way through startup and then seems to get stuck.

Here is my training command:

CUDA_VISIBLE_DEVICES=0 comet-train --cfg /home/xusongcheng/COMET-master/configs/models/referenceless_model.yaml

This is the last part of the output in Xshell after running the command, and it is stuck here:

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
  | Name                | Type               | Params
0 | encoder             | XLMREncoder        | 558 M
1 | layerwise_attention | LayerwiseAttention | 26
2 | train_metrics       | RegressionMetrics  | 0
3 | val_metrics         | ModuleList         | 0
4 | estimator           | FeedForward        | 10.5 M
10.5 M    Trainable params
558 M     Non-trainable params
569 M     Total params
1,138.661 Total estimated model params size (MB)
Sanity Checking: 0it [00:00, ?it/s]
I tried adding “--num_workers 0” to the command, but it did not help. How should I solve it?

Best wishes,
Winsome