photoprism / photoprism

AI-Powered Photos App for the Decentralized Web 🌈💎✨
https://www.photoprism.app

Proof-of-concept for scene category classification #175

Open lastzero opened 4 years ago

lastzero commented 4 years ago

As a PhotoPrism user, I want my photos classified by scene category so that I can filter search results by scene and get better image titles.

While we already label certain scenes based on the objects we find, we don't yet have a specialized TensorFlow model for this, e.g. AlexNet-places365, GoogLeNet-places365, or ResNet152-places365. See also http://places2.csail.mit.edu/download.html.

Ideally we can reuse our existing Go TensorFlow code for this, but each model is different in how it must be used. Our NSFW detector for example needs different input values than our Nasnet model for object classification, so we ended up using different code and different packages. For scene detection, it might be good to create a new scene package unless merging it with Nasnet gives us much better performance.
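To illustrate why each model may need its own input code, here is a minimal sketch of per-model pixel normalization. The names InputSpec and ToTensor are hypothetical, and the mean and scale values used with them are placeholders, not the real parameters of our NSFW or Nasnet models. The BGR option reflects the channel order Caffe-trained models commonly expect, which is an assumption for the Places365 models until verified.

```go
package main

import "fmt"

// InputSpec describes how a model expects its pixel values (hypothetical type).
type InputSpec struct {
	Mean  [3]float32 // per-channel mean in R, G, B order, subtracted from 0..255 values
	Scale float32    // multiplier applied after mean subtraction
	BGR   bool       // emit channels as B, G, R instead of R, G, B
}

// ToTensor converts tightly packed RGB bytes of an already resized image
// into float32 values laid out pixel by pixel, as described by spec.
func ToTensor(rgb []uint8, spec InputSpec) ([]float32, error) {
	if len(rgb)%3 != 0 {
		return nil, fmt.Errorf("pixel buffer length %d is not a multiple of 3", len(rgb))
	}
	out := make([]float32, len(rgb))
	for i := 0; i < len(rgb); i += 3 {
		var px [3]float32
		for c := 0; c < 3; c++ {
			px[c] = (float32(rgb[i+c]) - spec.Mean[c]) * spec.Scale
		}
		if spec.BGR {
			// Swap the red and blue channels.
			px[0], px[2] = px[2], px[0]
		}
		copy(out[i:i+3], px[:])
	}
	return out, nil
}
```

With a design like this, a separate scene package would mostly differ in the InputSpec it passes and in the input and output operation names of its model.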

Acceptance Criteria:

lastzero commented 4 years ago

Moved our image classification code to the new classify package: https://github.com/photoprism/photoprism/commit/e9874d6e0cfb81d2fce4e5be34455848254685d7

Should be easier to test: simply go to the directory and run go test -v.

We'll see if that's a good name... it was the best I could come up with today.

lastzero commented 4 years ago

Updated our docs: https://github.com/photoprism/photoprism/wiki/Image-Classification

lastzero commented 4 years ago

FYI: https://github.com/nic25 is working on this 🚀

lastzero commented 4 years ago

Didn't find related models on TensorFlow Hub.

tam-wh commented 4 years ago

Seems like there's a way to convert the places365 caffemodel to TensorFlow? https://ndres.me/post/convert-caffe-to-tensorflow/

tam-wh commented 4 years ago

I successfully converted vgg16_hybrid1365 into a pb file (it took me half a day to get the converter working) and it works well in TensorFlow.NET. I'm now working on the non-hybrid model, as it is more suitable for scene classification.

ResNet152-places365 (does not convert)
VGG16-hybrid1365 (converted successfully)
VGG16-places365 (working on it)

tam-wh commented 4 years ago

The file is quite large, ~500 MB. It's now shared on my Nextcloud: VGG16-places365. The download is quite slow, as my VPS speed is limited to 150 KB/s. Label files are available in the places365 GitHub repository.
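A minimal sketch of loading such a label file in Go, assuming each line has the form "<label> <index>", like the "/b/beach 48" entries shown in the results in this comment. ParseLabels is a hypothetical helper, not existing PhotoPrism code, and the format should be checked against the actual file.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// ParseLabels reads label file content with one "<label> <index>" pair per
// line (assumed format) and returns a map from class index to label name.
func ParseLabels(text string) (map[int]string, error) {
	labels := make(map[int]string)
	sc := bufio.NewScanner(strings.NewReader(text))
	for n := 1; sc.Scan(); n++ {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue // skip blank lines
		}
		if len(fields) != 2 {
			return nil, fmt.Errorf("line %d: want \"<label> <index>\", got %q", n, sc.Text())
		}
		idx, err := strconv.Atoi(fields[1])
		if err != nil {
			return nil, fmt.Errorf("line %d: bad index: %w", n, err)
		}
		labels[idx] = fields[0]
	}
	return labels, sc.Err()
}
```

The file content can be read with os.ReadFile and passed in as a string; the returned map is then indexed with the position of a value in the "prob" output.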

Some information on the model

Input operation name = "data";
Output operation name = "prob"

Took a couple of images from the demo here and ran them through the converted pb file in TensorFlow.NET. Image width and height were set to 224 px.

IMG_7308: 48 /b/beach 48, 0.86291456

13: 66 /b/bridge 66, 0.9400765

Seems like the conversion works really well.
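For reference, picking the best classes from the "prob" output could look roughly like the sketch below. Prediction and TopK are hypothetical names, and the code assumes the output is a flat slice with one probability per class index.

```go
package main

import "sort"

// Prediction pairs a class index with its probability (hypothetical type).
type Prediction struct {
	Index int
	Prob  float32
}

// TopK returns the k most probable classes in descending order of
// probability. If k exceeds the number of classes, all classes are returned.
func TopK(probs []float32, k int) []Prediction {
	preds := make([]Prediction, len(probs))
	for i, p := range probs {
		preds[i] = Prediction{Index: i, Prob: p}
	}
	// A stable sort keeps the lower class index first when probabilities tie.
	sort.SliceStable(preds, func(a, b int) bool {
		return preds[a].Prob > preds[b].Prob
	})
	if k < 0 {
		k = 0
	}
	if k > len(preds) {
		k = len(preds)
	}
	return preds[:k]
}
```

Returning more than one class per image would also allow secondary labels to be shown alongside the top result.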

Extarys commented 4 years ago

Not sure if it's possible, but could the label "water" be added? Maybe "boat" for the second picture?

I don't really know how those things work though 😕