PsychoinformaticsLab / pliers

Automated feature extraction in Python
https://pliers.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

AWS Rekognition Integration #245

Open mickeypash opened 6 years ago

mickeypash commented 6 years ago
tyarkoni commented 6 years ago

Hi @mickeypash, thanks for your interest in taking this on! This is a good impetus to start laying out some guides for developers, so I'll use this issue as an opportunity to provide a fairly detailed overview, and will then work this up into a full doc section at some point. Feel free to ask questions if anything's unclear.

In general, we've tried to ease the pain of writing new Transformer classes by abstracting away most of the I/O stuff and letting developers focus almost exclusively on the internal transformation logic. In practice, what this means is that a feature Extractor minimally just has to fill out the following skeleton:


class MyNewExtractor(ImageExtractor):

    def _extract(self, stim):
        # Must return an instance of class ExtractorResult!
        ...

There's a pre-defined hierarchy of Transformer classes that you can inherit from if you're doing something fairly conventional. E.g., if you want to add a new Extractor that you know is only going to deal with images as inputs, you should inherit from ImageExtractor as above.

For Converter and Filter classes (see the docs for a detailed explanation of the differences), the same logic applies. Note that the method you implement changes based on the Transformer subclass. In the case of a Converter, you implement _convert(); in the case of a Filter, you implement _filter().

Note that the transformation method (i.e., _extract, _convert, or _filter) must not take any arguments other than stim. This means that any configuration you want to do must be done in the initializer (which there are no constraints on, so you can do whatever you like there). What the method returns is also constrained, and depends on the Transformer type. Extractor classes return instances of ExtractorResult (see existing Extractor classes for examples of how to initialize these). Converter classes return a Stim of a different type than the input. Filter classes return a Stim of the same type as the input.
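To make those return-type contracts concrete, here's a self-contained sketch. The classes below are stand-ins that mimic the pliers contract (they are not imported from pliers, and the real ExtractorResult signature has more parameters); the point is just what each transformation method hands back:

```python
# Stand-ins mimicking the pliers contract; the real classes live in
# pliers.stimuli and pliers.extractors.
class ImageStim:
    def __init__(self, data):
        self.data = data

class TextStim:
    def __init__(self, text):
        self.text = text

class ExtractorResult:
    def __init__(self, data, stim, extractor):
        self.data, self.stim, self.extractor = data, stim, extractor

class MeanBrightnessExtractor:
    def _extract(self, stim):
        # Extractors return an ExtractorResult.
        return ExtractorResult(data=[[sum(stim.data) / len(stim.data)]],
                               stim=stim, extractor=self)

class ImageToTextConverter:
    def _convert(self, stim):
        # Converters return a Stim of a *different* type than the input.
        return TextStim(text='caption for image')

class ImageCropFilter:
    def _filter(self, stim):
        # Filters return a Stim of the *same* type as the input.
        return ImageStim(data=stim.data[:2])

img = ImageStim(data=[1, 2, 3, 4])
result = MeanBrightnessExtractor()._extract(img)   # ExtractorResult
text = ImageToTextConverter()._convert(img)        # TextStim
cropped = ImageCropFilter()._filter(img)           # ImageStim
```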

Beyond these minimal requirements, there are a bunch of other conventions and utilities you can take advantage of to minimize work. For example:


class MyNewAPIExtractor(ImageExtractor):

    _log_attributes = ('param1', 'param2')
    _env_keys = ('EXAMPLE_API_PARAM1', 'EXAMPLE_API_PARAM2')
    _version = '0.1'
    _batch_size = 100

    def __init__(self, param1, param2):
        self.param1 = param1
        self.param2 = param2
        super().__init__()

    def _extract(self, stim):
        # Must return an instance of class ExtractorResult!
        ...

The class attributes do some useful stuff for you (and also for the user): _log_attributes names the initialization parameters that get logged alongside results (so users can tell which settings produced which features); _env_keys lists the environment variables that API credentials can be read from when they aren't passed explicitly; _version tags results so they can be traced to a specific implementation of the Extractor; and _batch_size lets stims be sent to the API in batches of the indicated size rather than one at a time.

In addition to the transformation logic itself, you're also encouraged to implement a ._to_df() method in any Extractor classes you write. This method takes an ExtractorResult as input and returns a pandas DataFrame: it reads the .raw attribute of the ExtractorResult (which should contain the raw results retrieved from the feature extraction service) and processes it into a tidy DataFrame. There's more to be said about this, but I'm happy to provide more input once you get to that stage. It's not mandatory, since we only very recently changed the internal API to work this way, but it would probably make sense to implement new Extractor classes this way. For a relatively simple example, take a look at pliers.extractors.image.FaceRecognitionFeatureExtractor and its subclasses.
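For Rekognition specifically, the raw payload is a JSON-like dict, so the DataFrame-building step is mostly flattening. A rough sketch of that step (the response shape below mirrors Rekognition's DetectFaces output, but the helper name and column choices are illustrative, not existing pliers API):

```python
import pandas as pd

def faces_to_df(raw):
    """Flatten a Rekognition DetectFaces-style response into a DataFrame.

    `raw` is the dict returned by the service; each detected face becomes
    one row, with its bounding box unpacked into separate columns.
    """
    rows = []
    for face in raw.get('FaceDetails', []):
        box = face.get('BoundingBox', {})
        rows.append({
            'confidence': face.get('Confidence'),
            'box_left': box.get('Left'),
            'box_top': box.get('Top'),
            'box_width': box.get('Width'),
            'box_height': box.get('Height'),
        })
    return pd.DataFrame(rows)

# Example payload in the shape the service returns:
raw = {'FaceDetails': [
    {'Confidence': 99.9,
     'BoundingBox': {'Left': 0.1, 'Top': 0.2, 'Width': 0.3, 'Height': 0.4}},
]}
df = faces_to_df(raw)
```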

Aside from that, it's really up to you how you want to implement support for Rekognition (or for any other service). Some general tips/suggestions:

I'm probably forgetting things, but that's what comes to mind. Feel free to ask questions or bring up issues here as you run into them. Thanks!!

tyarkoni commented 5 years ago

On further inspection, I think this issue can be broken up into several steps. The Rekognition API supports a number of services that don't require S3. In boto3's Rekognition client, these include all of the detect_* methods (e.g., detect_faces()). These could probably be minimally implemented by putting most of the logic in a core AmazonRekognitionImageExtractor base class from which the face, label, text, etc. detectors inherit. Given that we can wrap boto3 for most requests (rather than querying the API directly), this will likely involve less work than the corresponding Google or Azure APIs.
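The base-class layout could look roughly like this. It's a sketch, not existing pliers code: the class and attribute names are mine, and a stub stands in for the real boto3 client so the example runs offline. The key idea is that subclasses only name the detect_* method to call:

```python
class StubRekognitionClient:
    """Stand-in for boto3.client('rekognition'), for illustration only."""
    def detect_faces(self, Image, **kwargs):
        return {'FaceDetails': [{'Confidence': 99.0}]}

    def detect_labels(self, Image, **kwargs):
        return {'Labels': [{'Name': 'Cat', 'Confidence': 97.0}]}

class AmazonRekognitionImageExtractor:
    """Base class: all request logic lives here; subclasses just pick
    which boto3 Rekognition method to invoke."""
    api_method = None  # e.g. 'detect_faces'

    def __init__(self, client=None):
        # Real code would default to boto3.client('rekognition').
        self.client = client or StubRekognitionClient()

    def _extract(self, image_bytes):
        method = getattr(self.client, self.api_method)
        return method(Image={'Bytes': image_bytes})

class RekognitionFaceExtractor(AmazonRekognitionImageExtractor):
    api_method = 'detect_faces'

class RekognitionLabelExtractor(AmazonRekognitionImageExtractor):
    api_method = 'detect_labels'

faces = RekognitionFaceExtractor()._extract(b'\x89PNG...')
labels = RekognitionLabelExtractor()._extract(b'\x89PNG...')
```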

Beyond those detection methods, we start to get into functionality that requires S3 support. The next step would probably be to extend the extractors created in the above step to accept S3 inputs. The easiest way to handle this would be to let the user just pass in their S3 credentials and bucket information either at Extractor initialization, or even in the transform call (since the bucket may change from image to image). But it would be nicer to create an abstraction that lets users set up a fixed bucket for multiple calls.
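At the request level, supporting S3 inputs is mostly a matter of building the right Image argument, since Rekognition accepts either raw bytes or an S3 object reference. A small helper (the function name is hypothetical) could normalize the two cases:

```python
def build_image_arg(data=None, bucket=None, key=None):
    """Build the `Image` parameter for a Rekognition detect_* call.

    Pass raw `data` bytes for local stims, or `bucket` + `key` for images
    already sitting in S3; Rekognition expects the S3 case in the shape
    {'S3Object': {'Bucket': ..., 'Name': ...}}.
    """
    if data is not None:
        return {'Bytes': data}
    if bucket and key:
        return {'S3Object': {'Bucket': bucket, 'Name': key}}
    raise ValueError('Provide either raw bytes or an S3 bucket/key pair.')

local = build_image_arg(data=b'...')
remote = build_image_arg(bucket='my-stims', key='face.jpg')
```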

Beyond that, we get into video-extraction territory. Here things get even more complex, because these extractors are asynchronous, so, as with the Google Cloud Video Intelligence services, we need to wait for the request to complete. Many of the video-based tools have the further complication that they work with collections stored on S3 (e.g., extracting faces from a video as they're encountered, so that they can be detected and tracked later on). So then we need to not only create collections, but pass them to the extractors as needed, and then, once completed, process the results into a form we can eventually return in to_df.