Azure Computer Vision
Language: C#
Training required?: No
Input: Image as stream to API endpoint
Returns: JSON containing info about detected objects (with descriptions), bounding boxes and an overall image description
Good for live video feed?: No, since frame-by-frame detection would consume too many API endpoint calls.
Works only for basic everyday objects such as bottles and cups. Detecting a cactus, for example, doesn't work.
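To put the live-video concern into numbers, here is a rough sketch that counts how many endpoint calls frame-by-frame detection would need. The 30 FPS frame rate and the `sample_every` parameter (send only every n-th frame) are illustrative assumptions of mine, not values taken from the Azure documentation.

```python
import math


def api_calls_needed(fps: float, seconds: float, sample_every: int = 1) -> int:
    """Return how many endpoint calls are needed if every
    `sample_every`-th frame of the feed is sent for detection."""
    if sample_every < 1:
        raise ValueError("sample_every must be >= 1")
    total_frames = int(fps * seconds)
    # Frames 0, n, 2n, ... are sent, so round up.
    return math.ceil(total_frames / sample_every)


# Assumed: a 30 FPS camera recording for one hour.
every_frame = api_calls_needed(fps=30, seconds=3600)
one_per_second = api_calls_needed(fps=30, seconds=3600, sample_every=30)
print(every_frame, one_per_second)
```

Even at one frame per second, a single hour of video already needs thousands of calls, which is why I ruled out the API-based options for live feeds.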
Azure Custom Vision
Language: C#
Training required?: Yes
You have to upload images and label every object in every image. This can be done programmatically by providing a JSON with bounding box regions and a label, or with the online image tagger provided by Microsoft.
Training on cactus images worked well and the detection was passable. The training dataset wasn't distinct enough to provide satisfactory results, though.
Input: Image as stream to API endpoint
Returns: JSON containing info about detected objects (with descriptions), bounding boxes and an overall image description
Good for live video feed?: No, since frame-by-frame detection would consume too many API endpoint calls.
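As an illustration of the programmatic labeling route, the sketch below turns a pixel bounding box into a region entry and serializes it as JSON. The field names (`tagId`, `left`, `top`, `width`, `height`) and the 0-to-1 normalization reflect my understanding of the Custom Vision training API and should be checked against the current documentation. The tag ID, box and image size are made up.

```python
import json


def to_region(tag_id: str, box_px: tuple, image_w: int, image_h: int) -> dict:
    """Convert a pixel box (x, y, width, height) into a region dict
    with coordinates normalized to the 0..1 range (assumed format)."""
    x, y, w, h = box_px
    return {
        "tagId": tag_id,
        "left": x / image_w,
        "top": y / image_h,
        "width": w / image_w,
        "height": h / image_h,
    }


# Hypothetical tag ID and a 200x100 px cactus box in a 400x200 px image.
region = to_region("cactus-tag-id", (100, 50, 200, 100), 400, 200)
print(json.dumps({"regions": [region]}, indent=2))
```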
Offline Model based on YOLOv3 CPU (ImageAI)
https://github.com/OlafenwaMoses/ImageAI
Language: Python
Training required?: Optional
The pretrained YOLOv3 model only knows the 80 classes it was trained on, which cover most everyday objects (bottles, cups, cars etc.). For more specific objects like a cactus you have to re-train the model yourself.
Input: Image as stream to library
Returns: Object containing info about detected objects (with descriptions) and bounding boxes
Good for live video feed?: Yes, though very slow. With the fastest approach I was able to get 2 FPS.
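For comparing the offline models, a small timing helper like the one below is enough to get an FPS figure. It is only a sketch: `measure_fps` and `fake_detect` are hypothetical names, and the fake detector just sleeps for 10 ms where the real code would call the model on a frame.

```python
import time
from typing import Any, Callable, Iterable


def measure_fps(detect: Callable[[Any], Any], frames: Iterable[Any]) -> float:
    """Run `detect` on every frame and return processed frames per second."""
    count = 0
    start = time.perf_counter()
    for frame in frames:
        detect(frame)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def fake_detect(frame: Any) -> list:
    """Stand-in for a real model call; pretends detection takes 10 ms."""
    time.sleep(0.01)
    return []


fps = measure_fps(fake_detect, range(10))
print(f"{fps:.1f} FPS")
```

Swapping `fake_detect` for the actual detection call and `range(10)` for frames read from the camera gives a number comparable to the ones quoted in these notes.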
Offline Model based on MobileNet SSD CPU
Language: Python
Training required?: Optional
Input: Image as stream to library
Returns: Object containing info about detected objects (with descriptions) and bounding boxes
Good for live video feed?: Yes, very fast. Probably the best model for live video prediction.
The results are very unreliable, though, and only work for really basic things. Training this model on a custom dataset could prove worthwhile.
Video: https://streamable.com/3qh0mf
Offline Model based on YOLOv5 CPU (PyTorch)
https://github.com/OlafenwaMoses/ImageAI
Language: Python
Training required?: Optional
The pretrained YOLOv5 model, like YOLOv3, only knows the 80 classes it was trained on, which cover most everyday objects (bottles, cups, cars etc.). For more specific objects like a cactus you have to re-train the model yourself.
Input: Image as stream to library
Returns: Object containing info about detected objects (with descriptions) and bounding boxes
Good for live video feed?: Yes, much faster than YOLOv3. I get multiple FPS. It's slower than MobileNet SSD but more precise.
How to label images for offline models
To label images for offline models, Roboflow does a really good job: https://app.roboflow.com/
After labeling all images you can export them in the format required by your chosen model.
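For the YOLO models, the exported labels are typically one text file per image with one line per object: the class index followed by the box center and size, all normalized by the image dimensions. The sketch below only illustrates that line format and is not Roboflow's exporter. The class index 0 for "cactus" and the box values are assumptions.

```python
def to_yolo_line(class_id: int, box_px: tuple, image_w: int, image_h: int) -> str:
    """Convert a pixel box (x_min, y_min, x_max, y_max) into a YOLO label
    line: 'class x_center y_center width height', normalized to 0..1."""
    x_min, y_min, x_max, y_max = box_px
    x_center = (x_min + x_max) / 2 / image_w
    y_center = (y_min + y_max) / 2 / image_h
    width = (x_max - x_min) / image_w
    height = (y_max - y_min) / image_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Assumed: class 0 is "cactus", box in a 400x200 px image.
line = to_yolo_line(0, (100, 50, 300, 150), 400, 200)
print(line)
```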