jina-ai / jina-hub

An open-registry for hosting Jina executors via container images
Apache License 2.0
104 stars 47 forks

Implement ImageNormalizer based on Torch Transforms #7796

Closed JoanFM closed 3 years ago

JoanFM commented 3 years ago

The ImageNormalizer executor is hard to adapt to batching, because it is based on having PIL images for cropping.

It may be a good idea to have an ImageNormalizer based on PyTorch transforms that allows batching.

In this case, batching would not be enforced by the batching decorator, but by the batching parameter passed to the Torch transformer itself.

This could have a very positive impact on the performance of image search solutions.

bsridatta commented 3 years ago

Hello, may I ask why the ImageNormalizer crafter has cropping and resizing functionality, when there are dedicated crafters for both? Is it not redundant, or is it just for ease of use?

JoanFM commented 3 years ago

Hey @bsridatta,

I believe it is for ease of use. However, I find it strange that it forces cropping of the image. I think resizing is quite expected and common, but cropping may feel odd.

bsridatta commented 3 years ago

Understood. If it's for ease of use, the crop is also required to handle aspect ratios (e.g. 16:9 -> 1:1), since resize maintains the ratio.

So the idea here is to have another TorchImageNormalizer crafter that replaces the crop and resize methods with torchvision.transforms that take numpy arrays as tensors, without using PIL, right?

JoanFM commented 3 years ago

The idea would be this, yes. Or maybe see whether it would make sense to have an ImageTorchTransformation executor that can be used to preprocess images using torchvision.transforms, similar to what the AlbumentationsCrafter does.

bsridatta commented 3 years ago

Cool, I actually overlooked that there was an AlbumentationsCrafter.

I think what you propose is a crafter just like AlbumentationsCrafter, but using torchvision to perform a list of transforms.

- (-) The already existing Albumentations is significantly faster than torchvision for most tasks, and it also supports other frameworks.
- (-) So ideally, this ImageTorchTransformation would only be better when doing a few transforms that are faster than in albumentations.
- (+) Yes, that would be much better than just having a TorchImageNormalizer that does three things (crop, resize, normalize).
- (+) I see that it would be helpful if someone wants to work with torchvision purely.

So can I try ImageTorchTransformation crafter then?

JoanFM commented 3 years ago

Hey @bsridatta ,

You can try this.

The benefit is also that Torch transformations can work on batches, which I am not sure AlbumentationsCrafters do.

And also, torchvision may be a more popular framework, and people may be more comfortable using it.

bsridatta commented 3 years ago

Got it, thanks. I shall work on this and maybe also try to do a comparison.

bsridatta commented 3 years ago

@JoanFM It seems like batch processing is not possible in Torch transforms: https://github.com/pytorch/vision/issues/157. Should I use @single for the crafter then?

We would need custom implementations of these transforms to enable batch processing, maybe we can do it for the most commonly used transforms - similar to https://github.com/pratogab/batch-transforms
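For the most common case, a hand-rolled batched transform is quite small. A sketch in the spirit of the batch-transforms repo linked above (illustrative only, not that library's actual API):

```python
import torch

class BatchNormalize:
    """Minimal hand-rolled Normalize that works on a whole batch,
    by broadcasting mean/std over a (B, C, H, W) tensor."""

    def __init__(self, mean, std):
        # reshape to (1, C, 1, 1) so the values broadcast per channel
        self.mean = torch.tensor(mean).view(1, -1, 1, 1)
        self.std = torch.tensor(std).view(1, -1, 1, 1)

    def __call__(self, batch):
        return (batch - self.mean) / self.std

batch = torch.rand(4, 3, 32, 32)  # hypothetical batch of 4 images
out = BatchNormalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])(batch)
print(out.shape)  # torch.Size([4, 3, 32, 32])
```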

JoanFM commented 3 years ago

Oh, that is a surprise. Yes, we can maybe start by using the @single decorator.

JoanFM commented 3 years ago

@bsridatta ,

Maybe it would make more sense to have these transformations using TensorFlow? It seems that Keras can pipe transformations into a Sequential module, which I guess can handle batches. However, I am not sure whether that is just an illusion and internally they are performed one by one.

bsridatta commented 3 years ago

@JoanFM I knew we usually do transforms in get_item, so I checked the docs. It looks like batching is only possible for Normalize, which takes tensors alone as input. If one does not need crop or resize, we can have a crafter that uses Normalize alone; we would just need to convert the input numpy blob to a tensor.

From the docs (https://pytorch.org/vision/stable/transforms.html):

> All transformations accept PIL Image, Tensor Image or batch of Tensor Images as input. Tensor Image is a tensor with (C, H, W) shape, where C is a number of channels, H and W are image height and width. Batch of Tensor Images is a tensor of (B, C, H, W) shape, where B is a number of images in the batch.

Regarding TensorFlow, I wanted to ask you about further optimization once this crafter was done. I haven't done this myself before, but we can actually do that in PyTorch as well: we can replace transforms.Compose() with nn.Sequential() and serialize the module with JIT (except for lambda and PIL transforms, which are not of our interest anyway). https://pytorch.org/vision/stable/transforms.html#scriptable-transforms

I shall first finish this ImageTorchTransformation so there is a clean way to use torchvision transforms, as it is required either way.

JoanFM commented 3 years ago

Hey @bsridatta.

I think the most interesting ones are Normalize and Resize so that they can prepare a batch of images to be handled by an Encoder.

It is not so clear in the documentation whether these two allow batches; the comment above seems to suggest so, but the parameter documentation seems to be written for a single image.

bsridatta commented 3 years ago

Hello @JoanFM Well, you are right, it is possible to batch most of them; I tried it now. I had the ToTensor transform as mandatory, which does not work on batches, and I replaced it with torch.from_numpy(), which seems to work. Sorry about the confusion 😅 maybe there is still a catch somewhere. I shall write tests for the common transforms to make sure they work with batches. I am honestly confused by the whole discussion in the forums saying it won't work on batches; hopefully there are no other surprises.

JoanFM commented 3 years ago

Hey @bsridatta ,

From what you have shared, it is rather confusing, so yes, let's have some good tests to ensure this.