Design of the custom classification model, second shot in the new pipeline (#179). This model will classify the shape, character, shape color, and character color in one go, given the shape crop.
I'd like help working through ideas for the model design.
Design
Considerations
The model should be fast at inference, ideally faster than YOLOv8n-det.
Something of a similar speed to YOLOv8n-det is acceptable, considering we currently process at ~2fps, and only ~1fps should be necessary.
Our datasets are neither large nor diverse. My hypothesis for our poor real-world generalization is that our training targets are too similar to each other, too perfect, and too high-contrast compared to real-world data.
Our target crops will vary somewhat in size, so we should think about resizing them in a "good" way.
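One possible reading of resizing in a "good" way is letterboxing: scale the crop so its longer side fits the model input, then pad the shorter side so the aspect ratio is preserved. The sketch below only computes the geometry. The function name `letterbox_geometry` and the 64-pixel input size are assumptions for illustration, not decisions made in this issue.

```python
def letterbox_geometry(width, height, target=64):
    """Compute an aspect-preserving resize plus padding for a square input.

    Returns (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom).
    `target` is a hypothetical model input size, not a project decision.
    """
    if width <= 0 or height <= 0:
        raise ValueError("crop dimensions must be positive")

    # Scale so the longer side exactly fills the target square.
    scale = target / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))

    # Split the leftover space as evenly as possible between both sides.
    pad_x = target - new_w
    pad_y = target - new_h
    pad_left = pad_x // 2
    pad_top = pad_y // 2
    return new_w, new_h, pad_left, pad_top, pad_x - pad_left, pad_y - pad_top


if __name__ == "__main__":
    # A 100x50 crop scaled into a 64x64 input.
    print(letterbox_geometry(100, 50, target=64))
```

The returned values could then be fed to whichever image library does the actual resize and padding. The padding color is left open here because it may interact with the shape-color and character-color classes.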
Ideas
YOLOv8n-det should be considered.
Use an autoencoder trained on diverse, unlabeled data (can be pre-trained or in-house). This should help with feature extraction and mitigate some of the problems with our smaller datasets.
If we train our own, we can choose to care only about the area inside the mask.
We need to augment our data manually since YOLO isn't helping. Since it's our own model, we should be able to do on-line augmentation with something like albumentations.
Maybe collect more IRL data that we can use in validation. We probably don't have enough to train on it.
Some traditional CV pre-processing.
Sharpening and contrast enhancement. In theory, the model should be able to learn these easily, but they are cheap steps that might make training easier.
Force the first shot to return square, slightly enlarged bounding boxes to maximize the information available to this model and decrease the chance that we accidentally crop out part of the target.
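The square, slightly enlarged bounding box idea could be sketched as a post-processing step on the first shot's output. This is a minimal illustration, assuming boxes arrive as pixel `(x1, y1, x2, y2)` corners. The function name `square_enlarged_box` and the default 10% margin are hypothetical.

```python
def square_enlarged_box(x1, y1, x2, y2, img_w, img_h, margin=0.1):
    """Expand a box to a square around its center, enlarged by `margin`.

    Coordinates are pixel corners (x1, y1, x2, y2). The result is clipped
    to the image, so boxes near the border may end up non-square.
    `margin` is a hypothetical default, not a tuned value.
    """
    cx = (x1 + x2) / 2
    cy = (y1 + y2) / 2

    # Use the longer side so nothing inside the original box is cut off.
    side = max(x2 - x1, y2 - y1) * (1 + margin)
    half = side / 2

    left = max(0, round(cx - half))
    top = max(0, round(cy - half))
    right = min(img_w, round(cx + half))
    bottom = min(img_h, round(cy + half))
    return left, top, right, bottom


if __name__ == "__main__":
    # A 20x40 detection in a 200x200 frame, enlarged by 50%.
    print(square_enlarged_box(40, 50, 60, 90, 200, 200, margin=0.5))
```

Clipping keeps the crop inside the frame at the cost of squareness near the edges. An alternative would be to shift the box inward instead, which keeps it square but off-center.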
Action Items
[x] Merge our datasets from different sources to make them more diverse (#177).
[ ] Maybe collect more IRL data. Even without labeling, we could use it in an unsupervised manner.