Open petergerten opened 3 years ago
Trying to train on 1 GPU I get stuck here:
File "/usr/local/lib/python3.6/dist-packages/numpy/lib/shape_base.py", line 577, in expand_dims
if axis > a.ndim or axis < -a.ndim - 1:
TypeError: '>' not supported between instances of 'list' and 'int'
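For context on the traceback above: the comparison `axis > a.ndim` fails because a list reached `np.expand_dims` as `axis`. As far as I know, NumPy releases before 1.18 only accept a single integer there, while 1.18 and later also accept a tuple or list of axes. The traceback does not show which call in the training code passes the list, so the sketch below only illustrates the mechanism and a version-independent workaround. The helper name `expand_dims_multi` is hypothetical and not part of the repository or of NumPy.

```python
import numpy as np


def expand_dims_multi(a, axes):
    """Insert several length-1 axes by calling np.expand_dims once per axis.

    Hypothetical helper (not from the repository): each call passes a single
    integer, so it also works on NumPy versions that reject a list or tuple.
    """
    a = np.asarray(a)
    # Accept a single integer as well as a sequence of integers.
    if isinstance(axes, (int, np.integer)):
        axes = [axes]
    out_ndim = a.ndim + len(axes)
    # Positions refer to the output array; map negatives to non-negative
    # indices and insert in ascending order so every insertion is in range.
    positions = sorted(int(ax) % out_ndim for ax in axes)
    for pos in positions:
        a = np.expand_dims(a, pos)
    return a


x = np.zeros((3, 4))
y = expand_dims_multi(x, [0, 1])
print(y.shape)
```

If upgrading NumPy is an option in your environment, that may be the simpler fix, since newer versions handle a sequence of axes directly.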
Hi, thank you for the interest in the work! I have a couple of deadlines over the next few days, so I will definitely try to get back to you by the end of the week!
How many GPUs are required, at minimum, to train?
Hi, most sincere apologies for not getting back to this earlier! The model can be trained on even a single GPU.
On which line of the code did you get the error? Did you make changes by any chance in the implementation? The error seems to potentially indicate some small bug so further information could be helpful.
A couple more points:
Try --batch-gpu with a lower value, e.g. 1, to fit the model training into the GPU.
I hope one of these might resolve the issue!
I always get out-of-memory errors, even when using all defaults and training at low resolution. Hardware: 8 × V100 16GB.