dusty-nv / jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
https://developer.nvidia.com/embedded/twodaystoademo
MIT License
7.87k stars, 2.98k forks

All images have been classified as DOG #1030

Closed: AntonP95 closed this issue 1 year ago

AntonP95 commented 3 years ago

Hi Dusty, I keep running into the same problem every time I follow the example:

imagenet --model=models/cat_dog/resnet18.onnx --labels=data/cat_dog/labels.txt --input_blob=input_0 --output_blob=output_0 data/cat_dog/test/cat data/cat_dog/test_cat_output

The example classifies the first cougar as a dog with 69% confidence, and every cat image is classified as a dog. I don't understand how I can get these results when I trained for the default 35 epochs. This is not the first time, either: the first time, I ran the example with the number of epochs that you used in the video. Now, with 35 epochs, I still get the same issue: all images of cats and dogs are classified as dogs. I don't know what to do, because after getting the same problem with 35 epochs I'm afraid I will get the same result with 100 epochs.

AntonP95 commented 3 years ago

Also, I noticed that almost every classification comes out at roughly 30% cat / 70% dog.

dusty-nv commented 3 years ago

Hmm - can you try deleting the .engine file in your models directory (under models/cat_dog) and re-running the imagenet program?

When you trained your model, what accuracy did the training script say that it achieved after 35 epochs?

Also can you test my cat_dog model from here? https://nvidia.box.com/s/zlvb4y43djygotpjn6azjhwu0r3j0yxc

AntonP95 commented 3 years ago

OK, I managed to delete the .engine file and re-run imagenet, and now every classification seems correct. Problem solved!!! At first I couldn't delete the .engine file because I didn't have the permissions, but once you said it had to be deleted I looked for a solution, and it works!! Thank you!

dusty-nv commented 3 years ago

OK gotcha, great. Yes, if it gives you permission denied, you can use sudo.

It was probably running off an old model, so after you deleted the .engine, TensorRT re-generated it based on your latest ONNX.
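For anyone who wants to check for this before re-running imagenet, here is a minimal Python sketch of the idea above: look for cached .engine files in the model directory that are older than the ONNX file and remove them, so the engine gets rebuilt from the latest ONNX. The helper names (`find_stale_engines`, `remove_stale_engines`) are made up for illustration and are not part of jetson-inference. The `*.engine` glob and the modification-time comparison are assumptions about how a stale cache can be spotted, not something the project documents.

```python
from pathlib import Path


def find_stale_engines(model_dir, onnx_name="resnet18.onnx"):
    """Return cached .engine files in model_dir that are older than the ONNX model.

    Assumption: an engine whose modification time predates the ONNX file
    was built from an earlier export and is therefore stale.
    """
    model_dir = Path(model_dir)
    onnx_path = model_dir / onnx_name
    if not onnx_path.is_file():
        raise FileNotFoundError(f"ONNX model not found: {onnx_path}")

    onnx_mtime = onnx_path.stat().st_mtime
    return sorted(
        p for p in model_dir.glob("*.engine") if p.stat().st_mtime < onnx_mtime
    )


def remove_stale_engines(model_dir, onnx_name="resnet18.onnx", dry_run=True):
    """Delete stale engines (or only list them when dry_run is True)."""
    stale = find_stale_engines(model_dir, onnx_name)
    for engine in stale:
        if dry_run:
            print(f"would delete {engine}")
        else:
            # Raises PermissionError if you are not allowed to delete the
            # file; in that case run with sudo, as suggested above.
            engine.unlink()
            print(f"deleted {engine}")
    return stale
```

Calling `remove_stale_engines("models/cat_dog")` only lists candidates. Pass `dry_run=False` to actually delete them, then re-run imagenet so the engine is regenerated.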

jamescodesthings commented 2 years ago

Hey guys,

I’m here because I had a similar issue. The cat test image was predicted at 59% dog.

Unfortunately, I lost the console output of the training.py command, so I’m not sure of the final accuracy at 35 epochs.

Deleting the .engine did not help in my case, but the 100 epoch model predicted 94% cat.

Will try training myself again overnight and see if I can get it going well.

jamescodesthings commented 2 years ago

To update on my situation:

Retrained overnight with the same instructions and now it sees the cat as a cat.

Must have tripped up somewhere in the original training and not noticed.

My advice for people who get here with similar issues: