marteinn / wagtail-alt-generator

Insert image description and tags with the help of computer vision
MIT License

Support for AWS Rekognition (and other Cloud services) #1

Closed robmoorman closed 7 years ago

robmoorman commented 7 years ago

Please see https://aws.amazon.com/rekognition/

It would be nice to support this as well (I can help by making a PR). I will submit some requirements in a couple of days to give this issue/suggestion more detail.

tomdyson commented 7 years ago

Also Google Cloud Vision - https://cloud.google.com/vision/ - which supports tagging but not descriptions. Perhaps we could aim for pluggable backends.

marteinn commented 7 years ago

Hi @robmoorman and thanks for the suggestion!

When building this library I originally benchmarked the API against Google Vision, but as @tomdyson pointed out, it does not return a description, so I initially settled on Microsoft's Cognitive API.

But the Cognitive API has some downsides:

I have not yet looked thoroughly into the new AWS API, but it looks interesting and would fit my own needs better (I rely a lot on AWS).

Moving ahead, I think @tomdyson's suggestion of pluggable backends is the way to go, using an interface somewhat like how Wagtail handles the custom image model. Microsoft's Cognitive API would be built in (since it only needs requests as a dependency), and any other backend would live in an external library (to keep the number of requirements down).

I will start looking into refactoring the Cognitive API into a backend and defining an interface for it. Adding AWS Rekognition after that should be pretty easy, and I would love to get some help with that.

How does this sound?

marteinn commented 7 years ago

I took some time and refactored the existing computer vision api and created a pluggable provider. I've added it into another branch called "providers" (https://github.com/marteinn/wagtail-alt-generator/tree/feature/providers).

So far it feels pretty good: swapping the provider only requires updating ALT_GENERATOR_PROVIDER, as long as the new provider implements AbstractProvider and returns a DescriptionResult value object (VO).
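To make the provider idea concrete, here is a minimal sketch of what such an interface could look like. Only the names AbstractProvider, DescriptionResult and ALT_GENERATOR_PROVIDER come from this thread; the `describe` method, the field names, `StaticProvider` and `import_from_path` are assumptions for illustration and may not match the code in the branch.

```python
import importlib
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DescriptionResult:
    """Value object every provider returns (field names are assumed)."""
    description: Optional[str] = None
    tags: List[str] = field(default_factory=list)


class AbstractProvider(ABC):
    """Interface each backend implements; `describe` is a hypothetical name."""

    @abstractmethod
    def describe(self, image_url: str) -> DescriptionResult:
        ...


class StaticProvider(AbstractProvider):
    """Toy backend returning fixed data instead of calling a vision API."""

    def describe(self, image_url: str) -> DescriptionResult:
        return DescriptionResult(description="a dog on a bike", tags=["dog", "bike"])


def import_from_path(dotted_path: str):
    """Resolve a 'package.module.ClassName' string to the class object."""
    module_path, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)
```

With something like this, a setting such as ALT_GENERATOR_PROVIDER could hold a dotted path, and the library would call `import_from_path(path)()` to get the active backend without knowing which service sits behind it.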

robmoorman commented 7 years ago

Great @marteinn, the swappable provider is the way to go! I did notice that Rekognition also doesn't provide a title. However, we can come up with a solution for that: a) don't set the title, or b) assemble a title from the labels detected in the image (dog, people, face, bike...).
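A rough sketch of option (b), assuming the provider hands back `(label, confidence)` pairs with confidence on a 0-100 scale. The function name, the thresholds and the output format are made up for illustration; returning `None` falls back to option (a).

```python
from typing import Iterable, Optional, Tuple


def title_from_labels(
    labels: Iterable[Tuple[str, float]],
    max_labels: int = 3,
    min_confidence: float = 80.0,
) -> Optional[str]:
    """Build a short title from the most confident labels, or None if there are none."""
    confident = [(name, score) for name, score in labels if score >= min_confidence]
    if not confident:
        return None  # option (a): leave the title unset
    # Highest confidence first, then keep only the top few labels.
    confident.sort(key=lambda item: item[1], reverse=True)
    names = [name.lower() for name, _ in confident[:max_labels]]
    return ", ".join(names).capitalize()
```

For example, `title_from_labels([("Bike", 91.2), ("Dog", 98.5), ("Tree", 40.0)])` gives `"Dog, bike"`, since the low-confidence "Tree" label is dropped.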

For AWS specifically, I think it's best to advise users to use S3 for media storage, since traffic inside AWS is free, instead of hosting media elsewhere and letting Rekognition always download the image over the public network. Most users probably already host media on S3 when they are on AWS.
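As a sketch of the S3 approach: Rekognition's `detect_labels` call accepts an `S3Object` reference, so the image never has to be downloaded by the application. The helper below takes the client as an argument so it can be tried without credentials; `detect_labels_from_s3` and `FakeRekognitionClient` are hypothetical names, and the bucket and key in the usage note are placeholders.

```python
from typing import Any, List


def detect_labels_from_s3(
    client: Any,
    bucket: str,
    key: str,
    max_labels: int = 10,
    min_confidence: float = 80.0,
) -> List[str]:
    """Ask Rekognition to label an image that already lives in S3."""
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    return [label["Name"] for label in response.get("Labels", [])]


class FakeRekognitionClient:
    """Stand-in for a real client, for offline experiments only."""

    def detect_labels(self, Image, MaxLabels, MinConfidence):
        self.last_image = Image  # remember what was requested
        return {"Labels": [{"Name": "Dog", "Confidence": 98.5}]}
```

In real use the client would come from `boto3.client("rekognition")`, for example `detect_labels_from_s3(boto3.client("rekognition"), "my-media-bucket", "images/dog.jpg")`.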

I'll do some testing this coming weekend and check your providers feature branch. Thanks already for the replies on short notice! Maybe I can already do a PoC of AWS Rekognition if I have enough time :).

robmoorman commented 7 years ago

Hey @marteinn, please see the PR for AWS support as well.

I found that the abstraction is pretty simple to use. Maybe we need to settle some small design considerations, like exception handling when the image can't be downloaded, and write some good docs.

Do you have time/plans to make this module production ready? I would help with that, as I would really like to suggest it to a number of my clients.

marteinn commented 7 years ago

Thanks for this @robmoorman! Will look into your PR in the upcoming days :)

marteinn commented 7 years ago

Hi! I just merged the feature/providers branch back into develop.

Some things to notice:

tomdyson commented 7 years ago

Great work @marteinn and @robmoorman!

> Would love to have support for Google Vision, but I think it's better to get this one out before adding support?

Yes, I agree.

marteinn commented 7 years ago

Pluggable providers and AWS Rekognition support are out now. I will close this ticket and open new ones for:

Thanks for all the awesome input :)