marrobHD closed this issue 3 years ago
I got the same problem. I fixed it by installing an older version of PyTorch:
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
You may have to adjust the CUDA version of the Python packages, but this fixed the issue for me!
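To pick the right `+cuXXX` wheel, it helps to check which CUDA build PyTorch was installed with and whether the GPU is actually visible. A minimal check (standard PyTorch API, no project-specific code assumed):

```python
import torch

# Installed PyTorch build, e.g. "1.7.1+cu110"
print(torch.__version__)
# CUDA toolkit version the wheel was compiled against (None for CPU-only builds)
print(torch.version.cuda)
# True only if a compatible GPU and driver are visible
print(torch.cuda.is_available())
```

If `torch.cuda.is_available()` prints `False` after installing a `+cu110` wheel, the driver/toolkit combination likely does not match the wheel.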
Thank you very much! The first part worked flawlessly, but after the first epoch the following errors occurred:
Epoch gpu_mem box obj cls total targets img_size
298/299 9.25G 0.04881 0.01046 0.003113 0.06239 14 640: 100% 2/2 [00:01<00:00, 1.79it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100% 1/1 [00:00<00:00, 4.11it/s]
all 16 0 0 0 0 0
Epoch gpu_mem box obj cls total targets img_size
299/299 9.25G 0.05258 0.01181 0.00305 0.06744 19 640: 100% 2/2 [00:00<00:00, 2.12it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100% 1/1 [00:00<00:00, 1.34it/s]
all 16 0 0 0 0 0
/usr/local/lib/python3.7/dist-packages/seaborn/matrix.py:194: RuntimeWarning: All-NaN slice encountered
vmin = np.nanmin(calc_data)
/usr/local/lib/python3.7/dist-packages/seaborn/matrix.py:199: RuntimeWarning: All-NaN slice encountered
vmax = np.nanmax(calc_data)
Exception in thread Thread-614:
Traceback (most recent call last):
File "/usr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/usr/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/content/deepstack-trainer/utils/plots.py", line 122, in plot_images
colors = color_list() # list of colors
File "/content/deepstack-trainer/utils/plots.py", line 32, in color_list
return [hex2rgb(h) for h in plt.rcParams['axes.prop_cycle'].by_key()['color']]
File "/content/deepstack-trainer/utils/plots.py", line 32, in <listcomp>
return [hex2rgb(h) for h in plt.rcParams['axes.prop_cycle'].by_key()['color']]
File "/content/deepstack-trainer/utils/plots.py", line 30, in hex2rgb
return tuple(int(h[1 + i:1 + i + 2], 16) for i in (0, 2, 4))
File "/content/deepstack-trainer/utils/plots.py", line 30, in <genexpr>
return tuple(int(h[1 + i:1 + i + 2], 16) for i in (0, 2, 4))
TypeError: int() can't convert non-string with explicit base
Optimizer stripped from train-runs/dataset/exp/weights/last.pt, 43.4MB
Optimizer stripped from train-runs/dataset/exp/weights/best.pt, 43.4MB
300 epochs completed in 0.711 hours.
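The `TypeError` in the traceback above occurs because `hex2rgb` in `utils/plots.py` assumes every entry of matplotlib's `axes.prop_cycle` is a `'#rrggbb'` string, which is not guaranteed across matplotlib versions and styles. A possible defensive rewrite of `color_list` (a sketch, not the project's official fix) uses `matplotlib.colors.to_rgb`, which accepts hex strings, named colors, and RGB tuples alike:

```python
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt

def color_list():
    # to_rgb() normalizes any matplotlib color spec to floats in [0, 1],
    # avoiding the int(h[...], 16) call that fails on non-string entries.
    return [tuple(int(255 * c) for c in mcolors.to_rgb(h))
            for h in plt.rcParams['axes.prop_cycle'].by_key()['color']]
```

This returns the same `(r, g, b)` integer tuples the plotting code expects, regardless of how the color cycle entries are specified.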
Glad to hear it worked! What start parameters did you use? With the default value for epochs you will train for 300 epochs, which you did according to the first two lines of your log snippet:
Epoch gpu_mem box obj cls total targets img_size 298/299 9.25G 0.04881 0.01046 0.003113 0.06239 14 640: 100% 2/2 [00:01<00:00, 1.79it/s]
Here is the full log: https://0bin.net/paste/42IzMhmE#FhlOBA6PMJhWlSeYx53Drp6IIQlN9fLU-wTkOf8/Gxs I've cut out the stuff before epoch 298 there. DeepStack still won't detect my trained logos and objects. I don't know if the exception in thread Thread-614 is at fault.
Reading lines 47 and 48 of your attached log file, I notice that DeepStack does not recognize any labels. You need a label file for each image, in the training set as well as in the validation set. There must also be a 'classes.txt' in each of the two folders.
To shorten the training time (until everything runs smoothly) you can call the deepstack-trainer with the following start parameters (by default it runs for 300 epochs):
python3 train.py --dataset-path "/content/dataset" --epochs 30
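Before launching a run, it can save time to verify the label files up front. A small sketch (the helper name and the assumption that images are `.jpg` with YOLO-style `.txt` labels next to them are mine, not part of the trainer):

```python
from pathlib import Path

def check_labels(split_dir):
    """Return (images missing a .txt label, whether classes.txt exists)."""
    split = Path(split_dir)
    missing = [p.name for p in split.glob("*.jpg")
               if not p.with_suffix(".txt").exists()]
    has_classes = (split / "classes.txt").exists()
    return missing, has_classes
```

Running it on both the train and validation folders before calling `train.py` flags any image that would otherwise be silently trained without labels.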
Thank you, everything works as intended now. I forgot to include the .txt files in the test directory. I'll close this now.
Is there any way to resolve this error and use CUDA 11.1+? PyTorch 1.7.1 is not compatible with Ampere GPUs.
I have the same issue; is there any way to use this with the latest CUDA and PyTorch?
I used your Google Colab. Command:
!python3 train.py --dataset-path "/content/test" --model "yolov5x"
Error: