Closed innocence0206 closed 8 months ago
Hi, could you tell me whether this is completely unable to run on a 3090?
I have the same question. Also, if a 3090 won't work, will a 4090?
A 3090 can run it, but it needs the entire card to itself.
Thanks for letting me know. Roughly how long does training take on a 3090?
Running UMambaBot abdomen CT 3D, I get about 115 s per epoch.
Thanks for your reply.
Hello, sorry to bother you. When I run UMambaBot abdomen CT 3D I get the following error (the printed tensor contents are abridged here):

    Traceback (most recent call last):
      File "/root/miniconda3/envs/umamba/bin/nnUNetv2_train", line 33, in <module>
    ...
    Invoked with: tensor([...], device='cuda:0', dtype=torch.float16, requires_grad=True),
    tensor([...], device='cuda:0', requires_grad=True), Parameter containing:
    tensor([...], device='cuda:0', requires_grad=True), None, None, None, True
    Exception in thread Thread-4 (results_loop):
    Traceback (most recent call last):
      File "/root/miniconda3/envs/umamba/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
        self.run()
      File "/root/miniconda3/envs/umamba/lib/python3.10/threading.py", line 953, in run
        self._target(*self._args, **self._kwargs)
      File "/root/miniconda3/envs/umamba/lib/python3.10/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 125, in results_loop
        raise e
      File "/root/miniconda3/envs/umamba/lib/python3.10/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 103, in results_loop
        raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the "
    RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message

Did you hit this problem during training? If so, could I ask how you solved it?
It is probably a causal_conv1d version problem. Check that you have version 1.1.0 installed. If that still doesn't fix it, remove the second-to-last argument, None, from the causal_conv1d_cuda.causal_conv1d_fwd( ) call.
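As a rough sketch of that second workaround: the failing call is the C++ binding invoked from the mamba_ssm/causal_conv1d Python wrapper, and newer wrapper code passes more arguments than the 1.1.0 binding accepts (the "Invoked with: ..., None, None, None, True" in the traceback is this mismatch). The exact file, line, and argument names depend on your installed versions; the names below are illustrative placeholders, not the verbatim source.

```diff
- out = causal_conv1d_cuda.causal_conv1d_fwd(x, weight, bias, seq_idx, None, activation)
+ out = causal_conv1d_cuda.causal_conv1d_fwd(x, weight, bias, seq_idx, activation)
```

Locate the call site by following the file path printed in your own traceback rather than copying this line literally.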
Thanks for your reply. Training works fine now, but testing fails. Could I ask what command line you used for testing?
The same as the official one.
Right, the INPUT_FOLDER in that command line — could I ask what path you used?
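For reference, nnU-Net v2 inference is driven by `nnUNetv2_predict`, and a command along these lines is what the official instructions describe (the folder names and DATASET_ID are placeholders you must replace; `-tr nnUNetTrainerUMambaBot` is the trainer name used for the UMambaBot experiments, and whether extra flags are needed depends on your setup):

```
nnUNetv2_predict -i INPUT_FOLDER -o OUTPUT_FOLDER -d DATASET_ID -c 3d_fullres -tr nnUNetTrainerUMambaBot
```

INPUT_FOLDER here is a directory of test images following the nnU-Net naming convention (e.g. ending in `_0000.nii.gz` for the first channel), not the raw dataset root.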
Hi, were you able to run UMambaEnc abdomen CT 3D? What does it require from the GPU? I get a CUDA out of memory error on a 24 GB NVIDIA GeForce RTX 4090.
Enc CT 3D does run, but it needs close to 24 GB of VRAM.
Thanks for letting me know.
Hello! Your work is great! I want to know whether a 24 GB NVIDIA GeForce RTX 3090 GPU is enough to run all the experiments. I am encountering an OOM problem.