Open derneuere opened 3 years ago
Well I tried it, and it just crashed :/ Maybe somebody else will have more luck with it. We could also switch to a different object detection and classification framework.
Related to #36
Maybe we should try https://github.com/OlafenwaMoses/ImageAI
Or Yolov4: https://github.com/AlexeyAB/darknet
We should maybe use this. Looks like the most used framework: https://github.com/tensorflow/models/blob/master/research/object_detection/colab_tutorials/inference_tf2_colab.ipynb
Or Detectron2 (https://github.com/facebookresearch/detectron2) with the LVIS Instance Segmentation Baselines using the Mask R-CNN model: https://github.com/facebookresearch/detectron2/blob/master/GETTING_STARTED.md
@airfield20 Hey, I saw your post in the ownphotos thread. Are you still interested in implementing object detection for the project?
Yes, I'd like to contribute if I can. I did not know the project was active again.
Could you write a python script that implements YOLO Object Detection? Input would be a picture or image path and output would be the list of objects as strings and their confidence value.
Also, some install instructions for YOLO would be great so that I can implement in the dockerfile 👍
Sure. Should I submit a pull request to the dev branch with the file in the root directory, or just post it here?
I think it will be easiest to use the YOLO classifier that's built into opencv, to ensure maximum compatibility. If we test another better performing classifier later on that may depend on specific hardware, we could add a configuration option for the user to select which system they prefer.
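A rough sketch of what such a configuration option could look like. Everything named here (`OBJECT_DETECTOR`, `register`, `get_detector`, the backend name `opencv-yolo`) is hypothetical and not an existing LibrePhotos setting, and the backend body is only a stub:

```python
import os
from typing import Callable, Dict, List, Tuple

# A detection is (label, confidence); a backend maps an image path to detections.
Detection = Tuple[str, float]
Backend = Callable[[str], List[Detection]]

_BACKENDS: Dict[str, Backend] = {}


def register(name: str) -> Callable[[Backend], Backend]:
    """Decorator that registers a backend under a configuration name."""
    def wrap(fn: Backend) -> Backend:
        _BACKENDS[name] = fn
        return fn
    return wrap


@register("opencv-yolo")
def detect_opencv(image_path: str) -> List[Detection]:
    # Stub: the OpenCV-based YOLO detector would be called here.
    return []


def get_detector(name: str = "") -> Backend:
    """Pick a backend by name, falling back to the hypothetical
    OBJECT_DETECTOR environment variable, then to the OpenCV default."""
    name = name or os.environ.get("OBJECT_DETECTOR", "opencv-yolo")
    if name not in _BACKENDS:
        raise ValueError(f"unknown detector {name!r}; choose from {sorted(_BACKENDS)}")
    return _BACKENDS[name]
```

A hardware-specific classifier added later would only need its own `@register(...)` function; callers keep using `get_detector()(image_path)`.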
Sounds like a good idea!
Pull request, but put the file in api folder 👍
I don't know much about classifiers, but could you choose a model that is able to find a lot of different object classes?
@derneuere
Testing YOLOv4 with OpenCV. How's this?
Also these are the class names that can be detected: https://raw.githubusercontent.com/hhk7734/tensorflow-yolov4/master/test/dataset/coco.names
Just basing my implementation on this gist https://gist.github.com/YashasSamaga/e2b19a6807a13046e399f4bc3cca3a49
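For reference, a minimal sketch of the requested interface (image path in, list of labels with confidence out) using OpenCV's `cv2.dnn_DetectionModel`, along the lines of the gist. The function names, the 416x416 input size, and the thresholds are assumptions for illustration, not values taken from the actual implementation:

```python
from typing import List, Sequence, Tuple


def label_detections(class_ids: Sequence, scores: Sequence,
                     class_names: Sequence[str]) -> List[Tuple[str, float]]:
    """Pair each detected class id with its name and confidence."""
    return [(class_names[int(c)], float(s)) for c, s in zip(class_ids, scores)]


def detect_objects(image_path: str, cfg_path: str, weights_path: str,
                   names_path: str, conf_threshold: float = 0.4,
                   nms_threshold: float = 0.4) -> List[Tuple[str, float]]:
    """Run a Darknet YOLO model through OpenCV's dnn module."""
    import cv2  # imported lazily so the helper above works without OpenCV
    import numpy as np

    # One class name per line, e.g. the coco.names file linked above.
    with open(names_path) as f:
        class_names = [line.strip() for line in f if line.strip()]

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    model = cv2.dnn_DetectionModel(weights_path, cfg_path)
    # Assumed preprocessing: 416x416 input, pixel values scaled to [0, 1], BGR -> RGB.
    model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)
    class_ids, scores, _boxes = model.detect(image, conf_threshold, nms_threshold)
    # Some OpenCV builds return (N, 1) arrays, so flatten before pairing.
    return label_detections(np.asarray(class_ids).reshape(-1),
                            np.asarray(scores).reshape(-1), class_names)
```

The bounding boxes are dropped here because only labels and confidences were asked for; they could be returned as well if needed later.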
Looks good! 👍 But I would prefer more classes. This one seems to support up to 9000: https://github.com/philipperemy/yolo-9000 Could you check whether the cfg and the weight files are compatible?
cfg is here: https://github.com/pjreddie/darknet/tree/61c9d02ec461e30d55762ec7669d6a1d3c356fb2/cfg
You have to download this folder https://github.com/philipperemy/yolo-9000/tree/master/yolo9000-weights and do this:
cat yolo9000-weights/x* > yolo9000-weights/yolo9000.weights # it was generated from split -b 95m yolo9000.weights
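If the merge has to happen somewhere a shell is not convenient, the same concatenation can be sketched in Python. The function name `join_parts` and its defaults are made up for illustration; it assumes the chunks keep the `x*` names that `split` produced:

```python
import shutil
from pathlib import Path


def join_parts(parts_dir: str, output_name: str = "yolo9000.weights",
               pattern: str = "x*") -> Path:
    """Concatenate split chunks (xaa, xab, ...) in sorted order."""
    directory = Path(parts_dir)
    parts = sorted(directory.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no files matching {pattern!r} in {directory}")
    output = directory / output_name
    with output.open("wb") as dst:
        for part in parts:
            # Stream each chunk so the full weights file never sits in memory.
            with part.open("rb") as src:
                shutil.copyfileobj(src, dst)
    return output
```

Calling `join_parts("yolo9000-weights")` should produce the same file as the `cat` command, since sorting the chunk names matches the order the shell glob expands to.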
Just tried that, and the model will not initialize using the cv2.dnn_DetectionModel class.
Hmm, in https://github.com/AlexeyAB/darknet there are the following tips. Seems to be a fork:
186 MB Yolo9000 - image: darknet.exe detector test cfg/combine9k.data cfg/yolo9000.cfg yolo9000.weights
Remember to put data/9k.tree and data/coco9k.map under the same folder of your app if you use the cpp api to build an app
@derneuere I've managed to get yolo-9000 working using https://pypi.org/project/darknetpy/
code snippet:
from darknetpy.detector import Detector

detector = Detector('/home/aaron/Repos/yolo-9000/darknet/cfg/combine9k.data',
                    '/home/aaron/Repos/yolo-9000/darknet/cfg/yolo9000.cfg',
                    '/home/aaron/Repos/yolo-9000/yolo9000-weights/yolo9000.weights')
results = detector.detect('/home/aaron/Repos/librephotos/api/yolo/test_images/pets.png')
print(results)
the cfg file requires the data folder from https://github.com/pjreddie/darknet/tree/61c9d02ec461e30d55762ec7669d6a1d3c356fb2
I think this detector.detect function has the interface that you require.
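A possible wrapper that reduces the raw results to the requested list of labels with confidence values. This is a sketch under an assumption: it expects each result to be a dict with `class` and `prob` keys, which should be checked against the real output of `detector.detect`. The names `summarize_detections` and `detect_with_darknetpy` are hypothetical:

```python
from typing import Any, Dict, Iterable, List, Tuple


def summarize_detections(results: Iterable[Dict[str, Any]],
                         min_confidence: float = 0.0) -> List[Tuple[str, float]]:
    """Reduce raw detections to (label, confidence) pairs, best first.

    Assumes each result dict has 'class' and 'prob' keys.
    """
    pairs = [(str(r["class"]), float(r["prob"])) for r in results
             if float(r["prob"]) >= min_confidence]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


def detect_with_darknetpy(image_path: str, data: str, cfg: str,
                          weights: str) -> List[Tuple[str, float]]:
    """Load the model and return labelled detections for one image."""
    from darknetpy.detector import Detector  # lazy import: needs darknetpy installed
    detector = Detector(data, cfg, weights)
    return summarize_detections(detector.detect(image_path))
```

Constructing the `Detector` on every call is wasteful for a batch of photos; in practice it would probably be created once and reused.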
@derneuere do you want this wrapped in a function in the API folder or is this good enough?
Yes, that would be great 👍 I also need some install instructions. Do I only have to download the files and add darknetpy to the requirements or do I have to do more?
darknetpy also requires clang, which can be installed via apt. I will write more in-depth instructions and post them here after I create the PR.
I got it to work and opened up a pull request: https://github.com/LibrePhotos/librephotos/pull/142
But I have a memory issue; see here: https://github.com/danielgatis/darknetpy/issues/31
We should use MobileNetV3 to implement object detection: https://pytorch.org/vision/stable/models.html
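import cv2
import torch
from torchvision import models, transforms

# torchvision has no models.mobilenet_v3; the large variant is assumed here
# (mobilenet_v3_small is the other option).
mobilenet_v3 = models.mobilenet_v3_large(pretrained=True)
mobilenet_v3.eval()  # inference mode: fixes batch norm statistics and disables dropout
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
# The images should be resized to 224x224 because that's the size torchvision likes.
# cv2 loads images as BGR, while the pretrained weights expect RGB.
image = cv2.cvtColor(cv2.imread("individualImage.png"), cv2.COLOR_BGR2RGB)
image = torch.tensor(cv2.resize(image, (224, 224)) / 255.0).to(torch.float32).permute(2, 0, 1).unsqueeze(0)
with torch.no_grad():
    out = mobilenet_v3(normalize(image))
# Text file with the classes: https://gist.github.com/yrevar/942d3a0ac09ec9e5eb3a
# imagenetidx is the index-to-label mapping loaded from that file.
imagenetidx[int(out.argmax())]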
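`imagenetidx` is not defined in the snippet above. One way to build it, assuming the gist's text file is a Python dict literal mapping class index to label (worth verifying before relying on it); the function names are made up for illustration:

```python
import ast
from typing import Dict


def parse_class_index(text: str) -> Dict[int, str]:
    """Parse a dict literal such as "{0: 'tench', 1: 'goldfish'}"."""
    # literal_eval only accepts plain literals, so no code from the file is executed.
    mapping = ast.literal_eval(text)
    if not isinstance(mapping, dict):
        raise ValueError("expected a dict literal mapping index to label")
    return {int(k): str(v) for k, v in mapping.items()}


def load_class_index(path: str) -> Dict[int, str]:
    """Read the downloaded class file and return the index-to-label mapping."""
    with open(path, encoding="utf-8") as f:
        return parse_class_index(f.read())
```

With the file saved locally, `imagenetidx = load_class_index(path)` would then make the lookup above work.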
The project uses densecap, but the original author disabled it without explaining why. We should try to get it running and evaluate whether it is usable or whether we need a new machine learning model.
https://github.com/LibrePhotos/librephotos/blob/289f413c303bc06e04de2c8c8decb764ba86481c/api/models.py#L206-L221
We also need to change this function to add it to the AlbumThings:
https://github.com/LibrePhotos/librephotos/blob/289f413c303bc06e04de2c8c8decb764ba86481c/api/models.py#L496-L515