mickeypash opened this issue 6 years ago
Hi @mickeypash, thanks for your interest in taking this on! This is a good impetus to start laying out some guides for developers, so I'll use this issue as an opportunity to provide a fairly detailed overview, and will then work this up into a full doc section at some point. Feel free to ask questions if anything's unclear.
In general, we've tried to ease the pain of writing new `Transformer` classes by abstracting away most of the I/O stuff and letting developers focus almost exclusively on the internal transformation logic. In practice, what this means is that a feature `Extractor` minimally just has to fill out the following skeleton:
```python
class MyNewExtractor(ImageExtractor):

    def _extract(self, stim):
        # Must return an instance of class ExtractorResult!
        ...
```
There's a pre-defined hierarchy of `Transformer` classes that you can inherit from if you're doing something fairly conventional. E.g., if you want to add a new `Extractor` that you know is only going to deal with images as inputs, you should inherit from `ImageExtractor`, as above.
For `Converter` and `Filter` classes (see the docs for a detailed explanation of the differences), the same logic applies. Note that the method you implement changes based on the `Transformer` subclass: for a `Converter`, you implement `_convert()`; for a `Filter`, you implement `_filter()`.
Note that the transformation method (i.e., `_extract`, `_convert`, or `_filter`) must not take any arguments other than `stim`. This means that any configuration you want to do must be done in the initializer (on which there are no constraints, so you can do whatever you like there). What the method returns is also constrained, and depends on the `Transformer` type: `Extractor` classes return instances of `ExtractorResult` (see existing `Extractor` classes for examples of how to initialize these); `Converter` classes return a `Stim` of a different type than the input; `Filter` classes return a `Stim` of the same type as the input.
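To make that return-type contract concrete, here's a toy, self-contained sketch. The class names and bodies below are invented stand-ins, not the real pliers implementations: the point is only that an `Extractor` returns an `ExtractorResult`, a `Converter` returns a `Stim` of a different type, and a `Filter` returns a `Stim` of the same type.

```python
# Illustrative stand-ins (NOT the real pliers classes) showing the
# return-type contract each Transformer subclass must honor.

class TextStim:
    def __init__(self, text):
        self.text = text

class ImageStim:
    def __init__(self, data):
        self.data = data

class ExtractorResult:
    def __init__(self, data, stim, extractor):
        self.data, self.stim, self.extractor = data, stim, extractor

class BrightnessExtractor:
    """Extractor: _extract(stim) -> ExtractorResult."""
    def _extract(self, stim):
        brightness = sum(stim.data) / len(stim.data)  # toy computation
        return ExtractorResult([brightness], stim, self)

class ImageToTextConverter:
    """Converter: _convert(stim) -> a Stim of a DIFFERENT type."""
    def _convert(self, stim):
        return TextStim('caption for image')  # dummy caption

class LowercaseFilter:
    """Filter: _filter(stim) -> a Stim of the SAME type."""
    def _filter(self, stim):
        return TextStim(stim.text.lower())
```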
Beyond these minimal requirements, there are a bunch of other conventions and utilities you can take advantage of to minimize work. For example:
```python
class MyNewAPIExtractor(ImageExtractor):

    _log_attributes = ('param1', 'param2')
    _env_keys = ('EXAMPLE_API_PARAM1', 'EXAMPLE_API_PARAM2')
    _version = '0.1'
    _batch_size = 100

    def __init__(self, param1, param2):
        self.param1 = param1
        self.param2 = param2

    def _extract(self, stim):
        # Must return an instance of class ExtractorResult!
        ...
```
The class attributes do some useful stuff for you (and also for the user):
- `_log_attributes` indicates which parameters define a unique `Transformer` instance. Basically, any parameter names you include in the tuple will be used (a) to memoize the output of the `Transformer` (potentially saving users time and/or money) and (b) as part of the transformation history/log generated for each `Stim` (so that users know exactly how a `Transformer` was initialized). This means you should include in this list any parameter that can affect the output of the `_transform` call.
- `_env_keys` indicates which keys needed for API access should be read out of the environment. This doesn't give you any extra functionality right now (though it should probably at least map those environment keys onto named variables, for convenience), and you're encouraged to accept the same variables as initialization arguments (see any of the existing API extractors for examples). But it should be added anyway for informational purposes.
- `_version` is informal version tracking. Stable `Transformer` classes should be assigned 1.0; thereafter, major API-breaking changes should prompt a major version change, and minor improvements or very small breaking changes should prompt a minor version change. We're not currently enforcing this in any way, but that's the idea.
- `_batch_size`: there's a `BatchTransformerMixin` class you can inherit from if you're writing a `Transformer` that is able to batch operations. See the docstring for further explanation. I imagine this won't be applicable for Amazon Rekognition given that you'll probably have to go through S3.

In addition to the transformation logic itself, you're also encouraged to implement a `._to_df()` method in any `Extractor` classes you write. This is a method that should take an `ExtractorResult` as input and return a pandas DataFrame as output. The expectation is that you take the `.raw` attribute of that `ExtractorResult` (which should contain the "raw" results retrieved from the feature extraction service) and process it into a nice DataFrame. There's more to be said about this, but I'm happy to provide more input once you get to that stage. It's not mandatory, as we only very recently changed the internal API to work this way, but it would probably make sense to implement new `Extractor` classes this way. As a relatively simple example, you can take a look at `pliers.extractors.image.FaceRecognitionFeatureExtractor` and its subclasses.
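As a rough illustration of the `_to_df()` idea, a minimal version might flatten a raw detection list into one row per detection. The payload shape below is made up for the example; what `.raw` actually contains depends entirely on the service.

```python
import pandas as pd

# Hypothetical raw payload, loosely shaped like a face-detection API
# response; real .raw contents vary by service.
raw = [
    {'confidence': 0.98, 'box': {'left': 10, 'top': 20}},
    {'confidence': 0.75, 'box': {'left': 55, 'top': 40}},
]

def to_df(raw):
    """Flatten raw service output into one row per detected object."""
    rows = []
    for i, det in enumerate(raw):
        rows.append({
            'object_id': i,
            'confidence': det['confidence'],
            'box_left': det['box']['left'],
            'box_top': det['box']['top'],
        })
    return pd.DataFrame(rows)
```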
Aside from that, it's really up to you how you want to implement support for Rekognition (or for any other service). Some general tips/suggestions:

- Try to keep `Transformer` classes as modular as possible. For most of the major API services, it's possible to abstract a lot of the commonality into base classes. The `GoogleAPITransformer` hierarchy is the best example of this. We actually have a separate `transformers.google` module just because so many of the Google Cloud-based `Converter` and `Extractor` classes share functionality.
- You ideally shouldn't need to touch `pliers.transformers.base` or `pliers.extractors.base` at all. But if you come across any problems that can't be resolved without rearchitecting some of the core logic, feel free to bring it up for discussion. This might be more likely for Rekognition than for other services given its additional requirements (i.e., to store the media files on AWS).
- `Transformer` classes are generally organized by modality (e.g., `pliers.transformers.audio`, `pliers.transformers.image`, etc.), but in the case of major services, it's fine to bundle them all in a separate module. E.g., we have a `pliers.transformers.google` module, and the same will probably make sense for Rekognition.
- For third-party dependencies, use an `attempt_to_import` call at the top of the module. Then, when you need to verify that the import exists, call `verify_dependencies`. See the classes in `pliers.extractors.api` for many examples.

I'm probably forgetting things, but that's what comes to mind. Feel free to ask questions or bring up issues here as you run into them. Thanks!!
On further inspection, I think this issue can be broken up into several steps. The Rekognition API supports a number of services that don't require S3. In boto3's Rekognition client, these include all of the `detect_*` methods (e.g., `detect_faces()`). These could probably be minimally implemented by putting most of the logic in a core `AmazonRekognitionImageExtractor` base class from which the face, label, text, etc. detectors inherit. Given that we can wrap boto3 for most requests (rather than querying the API directly), this will likely involve less work than the corresponding Google or Azure APIs.
Beyond those detection methods, we start to get into functionality that requires S3 support. The next step would probably be to extend the extractors created in the above step to accept S3 inputs. The easiest way to handle this would be to let the user just pass in their S3 credentials and bucket information either at `Extractor` initialization, or even in the `transform` call (since the bucket may change from image to image). But it would be nicer to create an abstraction that lets users set up a fixed bucket for multiple calls.
Beyond that, we get into video-extraction territory. Here things get even more complex, because these extractors are asynchronous, so, as with the Google Cloud Video Intelligence services, we need to wait for the request to complete. Many of the video-based tools have the further complication that they work with collections stored on S3 (e.g., extracting faces from a video as they're encountered, so that they can be detected and tracked later on). So then we need to not only create collections, but pass them to the extractors as needed, and then, once completed, process the results into a form we can eventually return from `to_df`.