taerimyeon closed this issue 4 years ago
Hi @taerimyeon
How long is it stuck for? Usually it takes a little while to load the model weights to the GPU.
In addition, can you try running your image in the Colab notebook and see if it works over there?
Hello @royorel, thanks for your quick reply! I have tried the Colab notebook and it works well. It seems that I used the wrong PyTorch version: my GPU driver supports CUDA 11.0 or 11.1 (which are optimized for PyTorch 1.7.0), but I used PyTorch 1.4.0 (the same version as this repo) with CUDA 10.1. Your code executes fine but very slowly. I don't remember the exact time, but rebuilding the network takes more than 30 minutes, and inference on a single image takes even longer, so maybe a couple of hours? Later I will try to set up the project on my Ubuntu PC and I will let you know the result!
That shouldn't take too long. The entire process should take no longer than 1-2 minutes, including loading all the weights to the GPU. We have also tried running the project on Windows (with an RTX 2080 and CUDA 10.2) and it didn't take that long to run.
@ohadf do you have any idea what could go wrong on a Windows machine?
I have no leads.
@taerimyeon perhaps you can try to pinpoint the exact step that is taking so long. Also, is your setup working in general (unrelated to this project)? I'm asking because your GPU is new (1 month?), so I thought that maybe this is the first time you're trying to run PyTorch on your machine. In that case, maybe double-check using a different PyTorch-based project to rule out a broader issue.
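One way to pinpoint the slow step is to wrap each stage in a small timer. The sketch below is only an illustration: `timed` and the stage labels are hypothetical names, not part of this repo, and the two workloads are stand-ins for the real calls in test.py.

```python
import time
from contextlib import contextmanager

# Collected wall-clock durations, keyed by stage label.
timings = {}

@contextmanager
def timed(label, store=timings):
    """Record the seconds spent inside the `with` block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        store[label] = time.perf_counter() - start
        print(f"{label}: {store[label]:.2f} s")

# Stand-in workloads (assumptions). In test.py, wrap the real steps
# instead, e.g. model creation, weight loading, and inference.
with timed("build model"):
    time.sleep(0.05)
with timed("inference"):
    sum(range(100_000))

# The label with the largest recorded duration is the stage to look at.
slowest = max(timings, key=timings.get)
print("slowest stage:", slowest)
```

CUDA calls are asynchronous, so when timing GPU work it helps to call `torch.cuda.synchronize()` just before leaving each `with` block. Otherwise the time can be attributed to a later stage.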
Hi everyone @royorel @ohadf, sorry for the late reply! It took me a while to set up the Linux system. I have tried the test code using a different PyTorch version and it works well: single image inference (plus network weights loading) is blazing fast! The whole process took about 2 minutes in total. The following is my setup:
Ubuntu 18.04.5 x86/x64
CUDA 11.1.0 (system)
cuDNN 8.0.4 (system)
PyTorch 1.7.0 (installed with pip, CUDA ver 11.0)
Torchvision 0.8.1 (installed with pip, CUDA ver 11.0)
#All other dependencies followed requirements.txt
The reason I used CUDA 11.1.0 for the system is that installation always failed when I tried a lower CUDA version, even CUDA 11.0. Before I installed this PyTorch version, even moving a small, single tensor to GPU memory took more than 15 minutes. Maybe I can also try the Windows setup with a higher CUDA version. Thanks for all the help, now I will close this issue.
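For anyone hitting the same symptom, a quick check along these lines may help. It reports which CUDA version the installed PyTorch build targets and times the first small host-to-GPU copy. This is a minimal sketch: `check_gpu_setup` is a hypothetical helper, not part of this repo, and it simply returns empty values if PyTorch or a GPU is not available.

```python
import time

def check_gpu_setup():
    """Report PyTorch/CUDA versions and time a tiny host-to-GPU copy."""
    report = {
        "torch": None,
        "built_for_cuda": None,
        "cuda_available": False,
        "device": None,
        "first_copy_seconds": None,
    }
    try:
        import torch
    except ImportError:
        return report  # PyTorch is not installed in this environment

    report["torch"] = torch.__version__
    report["built_for_cuda"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()

    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
        start = time.perf_counter()
        torch.ones(8).to("cuda")   # first copy also initializes CUDA
        torch.cuda.synchronize()   # wait until the copy has really finished
        report["first_copy_seconds"] = time.perf_counter() - start

    return report

report = check_gpu_setup()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `first_copy_seconds` comes out in minutes rather than seconds, the installed PyTorch build and the GPU/driver combination are a likely place to look before debugging the project itself.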
Hello, thank you for your good work. I am trying to test your code using my own image (single image) but it always gets stuck at
dataset [MulticlassUnalignedDataset] was created
Could you please help me find the issue? I ran the test code using python test.py instead of through the batch file because the batch file has an issue accessing my library. My system configurations are: requirements.txt
Thank you very much.