Open · lilhoser opened 1 year ago
Hi Aaron, it's hard to say exactly, but do the viewing angles or lighting conditions in your home camera footage differ from those in the training dataset?
When you tested your model on your test dataset, did you do that with detectnet/detectnet.py?
I think the natural thing would be to augment your dataset with imagery from your actual cameras to make your model more robust.
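One way to get extra mileage out of the existing Roboflow images before collecting your own is photometric augmentation. This is a minimal NumPy-only sketch of the idea, not the augmentation pipeline built into train_ssd.py (which applies its own SSD transforms during training); treat it as an offline preprocessing illustration.

```python
# Illustrative lighting augmentation for HxWx3 uint8 images, to simulate
# the brightness/contrast variation of home-camera footage.
# NOTE: a horizontal flip (commented out below) would also require
# mirroring the bounding boxes in the matching VOC XML annotations.
import numpy as np

def augment_lighting(img, rng):
    """Randomly jitter brightness, contrast, and gamma of a uint8 image."""
    out = img.astype(np.float32) / 255.0
    out = out * rng.uniform(0.6, 1.4)                       # brightness scale
    mean = out.mean()
    out = (out - mean) * rng.uniform(0.7, 1.3) + mean       # contrast stretch
    out = np.clip(out, 0.0, 1.0) ** rng.uniform(0.7, 1.5)   # gamma shift
    # if rng.random() < 0.5: out = out[:, ::-1, :]          # flip (see NOTE)
    return (np.clip(out, 0.0, 1.0) * 255).astype(np.uint8)

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(300, 300, 3), dtype=np.uint8)  # dummy frame
aug = augment_lighting(frame, rng)
print(aug.shape, aug.dtype)  # shape and dtype are preserved
```

Generating a few jittered copies of each training image this way can make the fine-tuned detector less sensitive to lighting, though it won't fix a large viewpoint mismatch.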
Quoted from Aaron LeMasters (Monday, July 24, 2023), [dusty-nv/jetson-inference] Transfer learning on ssd-mobilenet (Issue #1705):
I followed the tutorial to fine-tune ssd-mobilenet: python train_ssd.py --dataset-type=voc --data=data/delivery --model-dir=models/delivery
I used a dataset from Roboflow that has labeled images for logos for delivery trucks: https://universe.roboflow.com/capstoneproject/logoimages
My goal is to classify delivery trucks that appear in my home camera streams.
I manually created the VOC XML directory structure and copied all the files into the right places. I successfully re-trained the foundation model, and it correctly classifies images in the dataset's test folder. However, it does not work on any images I captured from my home cameras. I tried different resolutions and file formats, and experimented with cropping to just the delivery truck. No luck. What are the standard next steps for diagnosing this?
Thanks!
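For reference, this is a sketch of the Pascal VOC layout that train_ssd.py's --dataset-type=voc loader expects, per the jetson-inference tutorial (verify the exact names against your copy of the repo; the class names in labels.txt are hypothetical placeholders):

```shell
# Expected VOC-style dataset layout for train_ssd.py (sketch).
mkdir -p data/delivery/Annotations     # one Pascal VOC .xml file per image
mkdir -p data/delivery/JPEGImages      # the images themselves
mkdir -p data/delivery/ImageSets/Main  # train.txt / val.txt / trainval.txt / test.txt
printf 'dhl\nfedex\nups\n' > data/delivery/labels.txt  # hypothetical class names
```

A mismatch here (e.g., image IDs in ImageSets/Main that don't match the XML/JPEG filenames) usually fails loudly at training time, so if training succeeded the layout is probably fine.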
The viewing angles and lighting conditions definitely differ between the training/validation/test dataset and the images captured from my home cameras.
I tested the model using detectnet, yes.
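One quick diagnostic when testing with detectnet is to lower the confidence threshold and see whether the model finds the trucks at all, just weakly (which points at domain shift) versus finding nothing (which points at a model-loading or preprocessing problem). A sketch of the tutorial-style invocation, where the model/label paths and home_capture.jpg are placeholders for your own files:

```shell
# Run the fine-tuned ONNX model on a home-camera frame with a low threshold.
detectnet.py --model=models/delivery/ssd-mobilenet.onnx \
             --labels=models/delivery/labels.txt \
             --input-blob=input_0 --output-cvg=scores --output-bbox=boxes \
             --threshold=0.2 \
             home_capture.jpg result.jpg
```

If low-confidence boxes do appear on the trucks, the model is working but under-generalizing, which supports augmenting the training set with frames from your own cameras.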
I can certainly retrain on my own images, but this makes me question the utility of open-source dataset repositories like Open Images, Kaggle, Stanford, etc., if slight variations in lighting and viewing angle can have such an impact. Is this the state of the technology, or could something else be in play?