Open jt551 opened 8 months ago
Hi, I got the same error as you when trying to run samples.ipynb and eval.py. Have you found a solution? Best regards
No solution. It works on Paperspace with an older P4000 GPU, because the Pascal architecture is supported by cuDNN 7.6.5 (CUDA 9): https://docs.nvidia.com/deeplearning/cudnn/archives/cudnn-824/support-matrix/#cudnn-versions-764-765
NVIDIA's Ada compatibility guide (https://docs.nvidia.com/cuda/ada-compatibility-guide/) suggests running with CUDA_FORCE_PTX_JIT=1, but this produced the same error.
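For anyone trying to work out whether a given build can run on their card at all, here is a rough sketch of the general CUDA compatibility rule as I understand it: native `sm_XY` code runs on devices with the same major version and an equal or newer minor version, while `compute_XY` PTX can be JIT-compiled for any equal or newer architecture. The function name `can_run` and the example architecture lists are made up for illustration and are not taken from this project's image. The check also says nothing about cuDNN, which has its own support matrix.

```python
import re


def can_run(device_cc, arch_list):
    """Return True if any entry in arch_list can execute on a GPU with
    compute capability device_cc, e.g. (8, 9) for an Ada card."""
    for arch in arch_list:
        match = re.fullmatch(r"(sm|compute)_(\d{2,})", arch)
        if match is None:
            continue  # skip entries this sketch does not understand
        kind, digits = match.groups()
        built_cc = (int(digits[:-1]), int(digits[-1]))
        if kind == "sm":
            # Native code: same major version, minor not newer than the device.
            if built_cc[0] == device_cc[0] and built_cc[1] <= device_cc[1]:
                return True
        elif built_cc <= device_cc:
            # PTX can be JIT-compiled for any equal or newer architecture.
            return True
    return False


# Illustrative (hypothetical) architecture lists, not this project's build.
print(can_run((8, 9), ["sm_60", "sm_70"]))  # old build without usable PTX
print(can_run((8, 9), ["sm_80", "sm_86"]))  # newer build with 8.x native code
```

In PyTorch the two inputs can be read with `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`.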
Hi, I got the sample notebook to work by running it on a newer version of CUDA (11.8) on my RTX 4070. I did this by first changing the Dockerfile to:
FROM anibali/pytorch:2.0.1-cuda11.8-ubuntu22.04
# RUN sudo apt-get update
# RUN sudo apt-get upgrade -y
# RUN sudo apt-get install -y \
# build-essential
RUN sudo apt-get update \
&& sudo apt-get install -y libgl1-mesa-glx libgtk2.0-0 libsm6 libxext6 \
&& sudo rm -rf /var/lib/apt/lists/*
COPY requirements.txt /app/.
RUN pip install -r requirements.txt
I then changed requirements.txt by removing the pinned versions on all packages, adding opencv-python, and removing mkl-fft and mkl-random.
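If you would rather not edit requirements.txt by hand, the change can be scripted. This is only a sketch: the helper name `relax_requirements` and the sample input lines are hypothetical, and it only handles plain `name==version` style lines, not `-r`, `-e`, or URL entries.

```python
import re

DROP = {"mkl-fft", "mkl-random"}  # packages removed in the comment above
ADD = ["opencv-python"]           # package added in the comment above


def relax_requirements(lines):
    """Strip version pins, drop the DROP packages, append the ADD packages."""
    result = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # ignore comments and blanks
        if not line:
            continue
        # Keep only the package name before any version specifier.
        name = re.split(r"[=<>!~;\s]", line, maxsplit=1)[0]
        # Treat '-' and '_' as equivalent when comparing package names.
        if name.lower().replace("_", "-") in DROP:
            continue
        result.append(name)
    for name in ADD:
        if name not in result:
            result.append(name)
    return result


# Hypothetical input; the real file's packages and versions will differ.
sample = ["numpy==1.16.4", "mkl-fft==1.0.6", "mkl_random==1.0.1", "Pillow>=6.0"]
print("\n".join(relax_requirements(sample)))
```

Unpinning everything lets pip pick versions that match the newer base image, at the cost of reproducibility, so it may be worth re-pinning once you have a working set.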
These changes led to the following error in /floortrans/plotting.py:
ValueError: A colormap named "rooms_furu" is already registered.
I fixed it by changing line 610 in plotting.py to:
cmap3 = colors.ListedColormap(cpool, 'rooms_furu2')
I can now run the entirety of samples.ipynb without any errors, but I now get a different error when running eval.py.
$ python eval.py --weights model_best_val_loss_var.pkl
Traceback (most recent call last):
File "/app/eval.py", line 109, in <module>
evaluate(args, log_dir, writer, logger)
File "/app/eval.py", line 67, in evaluate
things = get_evaluation_tensors(val, model, split, logger, rotate=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/floortrans/metrics.py", line 176, in get_evaluation_tensors
predicted_classes = polygons_to_tensor(
^^^^^^^^^^^^^^^^^^^
File "/app/floortrans/metrics.py", line 127, in polygons_to_tensor
ten[pol_type['class'] + d][jj, ii] = 1
~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^
IndexError: index 521 is out of bounds for axis 0 with size 521
Have you resolved this issue? Thank you!
Yes, I got it to work by changing 4 lines in floortrans/post_prosessing.py.
From:
polygon[:, 0] = np.clip(polygon[:, 0], 0, max_width)
polygon[:, 1] = np.clip(polygon[:, 1], 0, max_height)
To:
polygon[:, 0] = np.clip(polygon[:, 0], 0, max_width-1)
polygon[:, 1] = np.clip(polygon[:, 1], 0, max_height-1)
I made this change in two places: the first around line 925 and the second around line 981. Hope this helps!
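To show why the `-1` matters, here is a small plain-Python sketch of the off-by-one. The `clamp` helper stands in for `np.clip` on a single value, the size 521 comes from the IndexError above, and the coordinate 530 is a made-up example.

```python
def clamp(value, low, high):
    """Plain-Python stand-in for np.clip applied to a single value."""
    return max(low, min(value, high))


size = 521          # axis size reported in the IndexError
row = [0] * size    # valid indices are 0 .. size - 1

x = 530             # hypothetical coordinate past the edge

bad = clamp(x, 0, size)       # upper bound is one past the last valid index
good = clamp(x, 0, size - 1)  # upper bound is the last valid index

try:
    row[bad] = 1
    bad_ok = True
except IndexError:
    bad_ok = False  # same failure mode as in metrics.py

row[good] = 1       # works once the coordinate is clipped to size - 1
print(bad, good, bad_ok)
```

Clipping to the size itself still lets a coordinate equal to the size through, which is exactly the index the traceback complains about.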
Hello,
I'm trying to run the sample notebook on a new laptop with Ubuntu 20.04, RTX2000 GPU, and nvidia-driver-535.
When trying to execute the following section in samples.ipynb
Networks prediction for the segmentation
I get the following error in the notebook immediately with model():
The terminal running Docker shows:
Could I get help resolving this issue?
Thank you!