Ahmedest61 / MultiRes-NetVLAD

error when train #2

Open xy-git opened 2 years ago

xy-git commented 2 years ago

Hi, thanks for your work! I followed the guide to train the model. This command runs fine:

    python main.py --mode=cluster --arch=vgg16-12 --pooling=netvlad --density_L=10 --num_clusters=64

But when I then run:

    python main.py --mode=train --arch=vgg16-12 --pooling=netvlad --density_L=10 --num_clusters=64

I get this error:

    Traceback (most recent call last):
      File "/home/Desktop/Link to projects/MultiRes-NetVLAD-main/main.py", line 594, in <module>
        raise ValueError('pca_layer can be used only during Testing')
    ValueError: pca_layer can be used only during Testing

Can you help me? Thanks a lot.

Ahmedest61 commented 2 years ago

Hi, thank you for pointing this out. I have fixed the bug. Can you rerun it?

xy-git commented 2 years ago

Yes, it works now. But it fails when I try to train on 4 GPUs. After running:

    python main.py --mode=cluster --arch=vgg16-12 --pooling=netvlad --density_L=10 --num_clusters=64

I run:

    python main.py --mode=train --arch=vgg16-12 --pooling=netvlad --density_L=10 --num_clusters=64 --nGPU 4

I tried changing cacheBatchSize from its default of 12 to 24, 48, and other values, but every run fails with:

    RuntimeError: CUDA error: the launch timed out and was terminated
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

GPU usage is only about 15% when this happens. Why? Could it be because 1451 (the number of images) is not divisible by 2? Can you help me? Thanks.
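
The divisibility guess in the comment above can be checked with plain arithmetic. This is a minimal sketch, assuming 1451 is the number of images handled in one caching pass and that batches are formed sequentially. `batch_sizes` is a hypothetical helper written for illustration; it is not part of the MultiRes-NetVLAD code.

```python
def batch_sizes(n_items, batch_size):
    """Return the size of each sequential batch, including a short final one."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    full, remainder = divmod(n_items, batch_size)
    sizes = [batch_size] * full
    if remainder:
        # The leftover items form one smaller batch at the end.
        sizes.append(remainder)
    return sizes


# 1451 is the image count quoted in the comment; 12, 24 and 48 are the
# cacheBatchSize values that were tried.
for bs in (12, 24, 48):
    sizes = batch_sizes(1451, bs)
    print(f"batch size {bs}: {len(sizes)} batches, last batch has {sizes[-1]} items")
```

Under these assumptions, every one of the tried batch sizes leaves a short final batch, so switching between them would not make the image count divide evenly. Whether a short final batch is related to the launch timeout is not established by this check; the error message's own suggestion of setting CUDA_LAUNCH_BLOCKING=1 is the more direct way to locate the failing call.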