NVIDIA / DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
Apache License 2.0

How to do multi-cropping in images or in image sequences? #1579

Closed. yifanjiang19 closed this 3 years ago

yifanjiang19 commented 4 years ago

Normally we do multi-cropping, such as 10-crop, and compute the average accuracy over the 10 cropped images. How can this be done in DALI? And if we have a 32-frame image sequence, how do we do temporal 10-cropping?

Thanks!

JanuszL commented 4 years ago

Hi, you can try using the Slice operator to generate 10 different crops for each batch of sequence data. The code may look like this:

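    # In the pipeline's __init__; "..." stands for the remaining Slice arguments.
    self.slices = [nvidia.dali.ops.Slice(normalized_anchor=True, normalized_shape=True, ...) for _ in range(10)]
    self.anchors = [nvidia.dali.ops.ExternalSource() for _ in range(10)]
    self.shapes = [nvidia.dali.ops.ExternalSource() for _ in range(10)]

    def define_graph(self):
        (...)
        # Instantiate the external inputs once and keep the nodes for feed_input.
        self.anchor_nodes = [src() for src in self.anchors]
        self.shape_nodes = [src() for src in self.shapes]
        out = []
        for i in range(10):
            out.append(self.slices[i](sequence_data, self.anchor_nodes[i], self.shape_nodes[i]))
        return out

    def iter_setup(self):
        # Feed new anchors and shapes before each iteration;
        # random_anchors and shape are batches prepared by the caller.
        for i in range(10):
            self.feed_input(self.anchor_nodes[i], random_anchors)
            self.feed_input(self.shape_nodes[i], shape)

wangyuyue commented 1 month ago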

Hi, I want to know whether it is possible to get a different number of crops for different images in a batch. For example, two non-overlapping crops for img1 and one crop for img2. Does this require me to manually duplicate the img1 data?

Thanks

mzient commented 1 month ago

@wangyuyue It's possible to abuse fn.warp_affine to broadcast a single image as multiple "frames" and produce a sequence. The warping matrix can be used to describe multiple crop windows. However, all crops would need to be of the same size. If cropping is followed by some other (affine) transforms that in the end have a common size, then you can fuse those operations (provided that you don't need aggressive antialiasing and sophisticated resampling methods offered only by fn.resize).
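
To make the crop-window idea concrete, here is a minimal sketch in plain Python of how each window could be written as a 2x3 affine matrix. It assumes the matrix maps output (destination) pixel coordinates to source coordinates, which is understood to be the default interpretation in fn.warp_affine (controlled by inverse_map), and it ignores pixel-center offsets. The helper names crop_matrix and apply_matrix are hypothetical and not part of DALI.

```python
def crop_matrix(x0, y0, crop_w, crop_h, out_w, out_h):
    """Return a 2x3 affine matrix for one crop window.

    Assumption: the matrix maps output pixel (x, y) to a source pixel,
    so the window's top-left corner (x0, y0) becomes the translation and
    the crop-to-output size ratio becomes the scale.
    """
    sx = crop_w / out_w
    sy = crop_h / out_h
    return [[sx, 0.0, float(x0)],
            [0.0, sy, float(y0)]]


def apply_matrix(matrix, x, y):
    """Map an output coordinate to the source coordinate it samples from."""
    (a, b, tx), (c, d, ty) = matrix
    return (a * x + b * y + tx, c * x + d * y + ty)


# Two non-overlapping 100x100 windows from the same image, output 100x100:
# pure translations, no scaling.
same_size = [
    crop_matrix(0, 0, 100, 100, 100, 100),
    crop_matrix(100, 0, 100, 100, 100, 100),
]

# A 50x80 window at (10, 20), resized to a common 100x100 output:
# the crop and the resize are fused into a single matrix.
resized = crop_matrix(10, 20, 50, 80, 100, 100)

# The output corner (100, 100) samples the far corner of each window.
far_corner_second = apply_matrix(same_size[1], 100, 100)
far_corner_resized = apply_matrix(resized, 100, 100)
```

In a pipeline, one such matrix per crop would be supplied to fn.warp_affine together with the common output size. How the single image is broadcast into multiple frames is not shown here; the notebook linked later in this thread is the reference for that wiring.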

mzient commented 1 month ago

Hello @wangyuyue, I've put together an example of multi-crop with warp_affine. One sub-example produces constant-sized crops; the other produces variably sized crops and resizes them to a fixed size.

See https://github.com/NVIDIA/DALI/blob/614b64998795f32cb90ce20bf21b88617f99c1fc/docs/examples/image_processing/multiple_crops.ipynb

wangyuyue commented 3 weeks ago

Hi @mzient, thanks for your reply. I wonder whether it is possible to do multi-crop while keeping the original size of each crop. Or is it possible via a custom operator?

Thanks

wangyuyue commented 3 weeks ago

And I have another question. If I perform the multi-cropping by duplicating the image file names passed to fn.readers.file, will the same file be read twice from storage? Or is it cached in memory and reused for the second read? If it doesn't trigger storage I/O multiple times, I might adopt this solution. Thanks again for your attention, @mzient

mzient commented 2 weeks ago

@wangyuyue Sorry for the delay.

> Thanks for your reply. I wonder whether it is possible to do multi-crop while keeping the original size of each crop. Or is it possible via a custom operator?

It's not possible if you want to keep the crops in one "sample". Unrolling these crops into a larger batch isn't currently possible, either.

> If I perform the multi-cropping by duplicating the image file names passed to fn.readers.file, will the same file be read twice from storage? Or is it cached in memory and reused for the second read?

I don't think we do any more caching than the operating system provides (but it's a good idea that we check it and perhaps add an optimization for the case where a sample appears many times in a batch). Otherwise, you can read the files in Python and feed them through fn.external_source.
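
The Python-side alternative mentioned above could look roughly like this. It is a minimal sketch using only the standard library: read_batch_cached is a hypothetical helper, and the per-batch dictionary is just one possible way to avoid reading a duplicated file twice, not something DALI provides.

```python
import os
import tempfile


def read_batch_cached(paths):
    """Return the raw bytes for every path, reading each distinct file once."""
    cache = {}
    batch = []
    for path in paths:
        if path not in cache:
            with open(path, "rb") as f:
                cache[path] = f.read()
        # Duplicated paths reuse the bytes object that is already in memory.
        batch.append(cache[path])
    return batch


# Usage: img1 is requested twice (two crops), img2 once.
# The files here are small stand-ins, not real images.
with tempfile.TemporaryDirectory() as tmp:
    img1 = os.path.join(tmp, "img1.bin")
    img2 = os.path.join(tmp, "img2.bin")
    with open(img1, "wb") as f:
        f.write(b"first")
    with open(img2, "wb") as f:
        f.write(b"second")
    batch = read_batch_cached([img1, img1, img2])
```

Each bytes object would still need to be wrapped as a uint8 array (for example with np.frombuffer) before being returned from the fn.external_source callback and decoded, for instance with fn.decoders.image.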