rvorias / ind_knn_ad

Vanilla torch and timm industrial knn-based anomaly detection for images.
https://share.streamlit.io/rvorias/ind_knn_ad
MIT License
147 stars 50 forks

Working with GPU #29

Closed Emyyr closed 6 months ago

Emyyr commented 1 year ago

I tried to run the models on GPU, but torch is not working. The program gives a simple GPU error.

Like: (Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor).

I put my variables (input and model) on the GPU with ".to(device)" or ".cuda()", but it's still not working. If all variables and the model stay on the CPU, the code works. How can I solve this problem? The feature extractor model is on the GPU too, but the error is the same. Could you review my code for this problem, or could this error be a library bug? I am using Torch 1.13 with CUDA 11.7.
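The error in the report is torch's standard device-mismatch message: the model's weights live on CUDA but the input tensor is still on the CPU. A minimal sketch of the fix (the model, layer sizes, and tensor shapes below are illustrative, not the repo's actual code):

```python
import torch
import torch.nn as nn

# Pick the device once and use it for BOTH the model and every input tensor.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()).to(device)

x = torch.randn(1, 3, 32, 32)  # tensors are created on the CPU by default
# Forgetting this line reproduces the reported error on a CUDA machine:
# "Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) ..."
x = x.to(device)

with torch.no_grad():
    y = model(x)
print(y.shape, y.device)
```

Note that `.to(device)` on a tensor returns a new tensor rather than modifying it in place, so `x.to(device)` without reassigning to `x` silently leaves the input on the CPU.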

rvorias commented 1 year ago

can you paste 1) your training or inference code here, and 2) your detailed error output?

Emyyr commented 1 year ago

(Screenshots attached: 2023-02-06 013326, 2023-02-06 013215, 2023-02-06 013257)

The second error is: "[RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same]". But as you can see in the code, I send both the data and the model to the GPU! Yet the feature extractor returns None. Debugging the code, I can see this error comes from the feature extractor: the feature extractor was on the GPU but the input was not. Then I changed the input type and tried again, but got a None type error. Can you explain this error? I have searched a lot but couldn't solve it. Thanks.

rvorias commented 1 year ago

It seems that the model you are loading and the model definition have a mismatch. Either up/downgrade the timm lib such that the resnet definition corresponds, or load another model.

The model isn't None. Rather, the model doesn't have a self.drop_block.

Emyyr commented 1 year ago

> It seems that the model you are loading and the model definition have a mismatch. Either up/downgrade the timm lib such that the resnet definition corresponds, or load another model.

I can run this code with torch-cpu (in 2 different envs). If I had a mismatch, it wouldn't run at all. But I will check the timm version. Thanks for the reply.

rvorias commented 1 year ago

Instead of using torch.load, can you use the timm model creation? Then you can be more sure it's a version mismatch.

Emyyr commented 1 year ago

Thanks. I checked my library versions, downgraded timm (0.6.12 --> 0.4.12), and reinstalled faiss. This solved my problem. But I want to ask something about prediction time. I ran my model on GPU and the average prediction time was 40 ms. Then I ran it on CPU and the prediction time was also 40 ms? I don't understand this, and I have verified which device the variables are on. I have 64 GB RAM and a 12 GB 3060 GTX. Do you have an idea about this issue?

rvorias commented 1 year ago

I haven't tried measuring gpu<>cpu performance yet, but gpu should be quite a bit faster than cpu.
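One common reason GPU and CPU timings come out identical is that CUDA kernel launches are asynchronous, so a naive `time.time()` around the forward pass can measure launch overhead rather than compute; small batch sizes can also leave the run host-bound. A minimal benchmarking sketch (the model and input sizes are placeholders) that synchronizes before reading the clock:

```python
import time

import torch
import torch.nn as nn


def timed_forward(model, x, n=50):
    """Average forward time in ms; synchronizes so GPU timing is honest."""
    device = next(model.parameters()).device
    # Warmup: the first CUDA calls pay one-off allocation/launch overhead.
    for _ in range(5):
        model(x)
    if device.type == "cuda":
        torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(n):
        model(x)
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for all queued kernels to finish
    return (time.perf_counter() - t0) / n * 1000


model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU()).eval()
x = torch.randn(1, 3, 64, 64)
with torch.no_grad():
    print(f"cpu: {timed_forward(model, x):.2f} ms")
```

Running the same function with `model.cuda()` and `x.cuda()` should then give a fair GPU number; if the two are still equal, the bottleneck is likely outside the forward pass (e.g. preprocessing or the faiss search on the CPU).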