dusty-nv / jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
https://developer.nvidia.com/embedded/twodaystoademo

Transfer learning on ssd-mobilenet #1705

Open lilhoser opened 1 year ago

lilhoser commented 1 year ago

I followed the tutorial to fine-tune ssd-mobilenet: python train_ssd.py --dataset-type=voc --data=data/delivery --model-dir=models/delivery

I used a dataset from Roboflow that has labeled images of delivery-truck logos: https://universe.roboflow.com/capstoneproject/logoimages

My goal is to classify delivery trucks that appear in my home camera streams.

I manually created the VOC XML directory structure and copied all the files into the right places (the layout I used is sketched below). I successfully re-trained the base model, and it correctly classifies images in the dataset's test folder. However, it does not work on any images captured from my home cameras. I tried different resolutions and file formats, and experimented with cropping to just the delivery truck, with no luck. What are the standard next steps for diagnosing this?
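For reference, the directory layout I created looks roughly like this (the split files follow the usual VOC convention, and labels.txt is the class list as in the tutorial):

data/delivery/
    Annotations/        # one Pascal VOC XML annotation file per image
    ImageSets/Main/     # trainval.txt and test.txt listing image IDs (no file extension)
    JPEGImages/         # the images themselves
    labels.txt          # one class name per line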

Thanks!

dusty-nv commented 1 year ago

Hi Aaron, it's hard to say exactly, but do the viewing angles or lighting conditions of your home cameras differ from the training data?

When you tested your model on your test dataset, did you do that with detectnet/detectnet.py?
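i.e. after running the tutorial's onnx_export.py step, something along these lines (the model dir is taken from your training command above, and the image glob / output pattern are just placeholders):

python3 onnx_export.py --model-dir=models/delivery
detectnet.py --model=models/delivery/ssd-mobilenet.onnx --labels=models/delivery/labels.txt \
             --input-blob=input_0 --output-cvg=scores --output-bbox=boxes \
             "path/to/test/images/*.jpg" output_%i.jpg

It can also be worth lowering --threshold (the default is 0.5) to see whether any weak detections show up on the home-camera images at all.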

I think the natural thing would be to augment your dataset with imagery from your actual cameras to make your model more robust.
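The camera-capture tool from the re-training tutorial labels images directly into that same VOC format, so one option is to point it at your camera feed and annotate a batch of real frames, then drop them into data/delivery and re-run your train_ssd.py command. The URI below is just a placeholder and assumes the stream is something jetson-inference's videoSource can open:

camera-capture rtsp://user:pass@camera-ip:554/stream    # placeholder - substitute your camera's stream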



lilhoser commented 1 year ago

The viewing angles and lighting conditions definitely differ between the training/validation/test dataset and the images captured from my home cameras.

I tested the model using detectnet, yes.

I can certainly retrain on my own images, but this makes me question the utility of open-source dataset repositories like Open Images, Kaggle, Stanford, etc., if slight variations in lighting and viewing angle can have such an impact. Is this just the current state of the technology, or could something else be in play?