libffcv / ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)
https://ffcv.io
Apache License 2.0

How to perform identical transformation on images and targets in dense prediction tasks? #86

Closed · shariqfarooq123 closed this issue 2 years ago

shariqfarooq123 commented 2 years ago

In dense prediction (or Pix2Pix-style) tasks such as segmentation and depth estimation, we often need to perform the same random transformation on images and targets: e.g., either both the image and the corresponding target are horizontally flipped, or neither is.

How do I achieve that with FFCV?

GuillaumeLeclerc commented 2 years ago

Hello @shariqfarooq123 @Callidior, we are currently planning improvements to FFCV to support this kind of task. One potential API, suggested by @andrewilyas, would be to add a transform called SynchronizeSeed(seed_offset=0) that you insert in both pipelines and that forces the seed to an identical value (assuming seed_offset is the same).

Is this something that would be useful and solve your use case? Do you have a suggestion for a neater API that would address what you need?
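
For concreteness, here is a minimal sketch of how the proposed transform might be used. SynchronizeSeed is hypothetical at this point; the decoder and flip are existing FFCV operations.

from ffcv.fields.decoders import RandomResizedCropRGBImageDecoder
from ffcv.transforms import RandomHorizontalFlip, ToTensor

# SynchronizeSeed is the proposed (not yet existing) transform. Inserting it
# at the head of both pipelines would reset their PRNGs to the same value, so
# the random crop and flip below would make identical draws for each sample.
pipelines = {
    'image': [
        SynchronizeSeed(seed_offset=0),
        RandomResizedCropRGBImageDecoder((224, 224)),
        RandomHorizontalFlip(),
        ToTensor(),
    ],
    'mask': [
        SynchronizeSeed(seed_offset=0),
        RandomResizedCropRGBImageDecoder((224, 224)),
        RandomHorizontalFlip(),
        ToTensor(),
    ],
}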

shariqfarooq123 commented 2 years ago

Thanks for the response!

Yes, something like that could be useful and would solve this.

For now, I ended up setting a seed derived from the sample indices inside the operation, as done in Mixup.

For maximum flexibility, it would be good to have the ability to implement custom "joint" operations / joint pipelines. Such operations would take a tuple of all the fields, e.g. JointOp(imA_at_index_i, imB_at_index_i).
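
For reference, the index-seeded workaround can be written as a custom Operation modeled on FFCV's Mixup transform (which also re-seeds NumPy from the batch indices). A rough sketch, with SeededRandomHorizontalFlip as a hypothetical name:

import numpy as np
from dataclasses import replace
from typing import Callable, Optional, Tuple

from ffcv.pipeline.allocation_query import AllocationQuery
from ffcv.pipeline.compiler import Compiler
from ffcv.pipeline.operation import Operation
from ffcv.pipeline.state import State


class SeededRandomHorizontalFlip(Operation):
    """Flip decision derived from the sample index, so two instances
    (one per pipeline) make the same decision for the same sample."""

    def __init__(self, flip_prob: float = 0.5):
        super().__init__()
        self.flip_prob = flip_prob

    def generate_code(self) -> Callable:
        my_range = Compiler.get_iterator()
        flip_prob = self.flip_prob

        def flip(images, dst, indices):
            for i in my_range(images.shape[0]):
                # Re-seeding per sample: the same index yields the same draw
                # in every pipeline (and, unless an epoch-dependent offset is
                # mixed in, in every epoch).
                np.random.seed(indices[i])
                if np.random.rand() < flip_prob:
                    dst[i] = images[i, :, ::-1]
                else:
                    dst[i] = images[i]
            return dst

        flip.is_parallel = True
        flip.with_indices = True  # ask FFCV to pass sample indices, as Mixup does
        return flip

    def declare_state_and_memory(self, previous_state: State) -> Tuple[State, Optional[AllocationQuery]]:
        return (replace(previous_state, jit_mode=True),
                AllocationQuery(previous_state.shape, previous_state.dtype))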

GuillaumeLeclerc commented 2 years ago

@shariqfarooq123 Could you provide an example of how this would look from the point of view of a user? I'm not really clear on that.

Callidior commented 2 years ago

@GuillaumeLeclerc The proposed SynchronizeSeed transform would be a straightforward and easy-to-implement solution to the problem. However, I see a few potential issues.

In some cases, it would be necessary to insert this transform multiple times into a pipeline, before each transformation to be synchronized. This would be the case when one pipeline contains more transforms than the other, as in the following example:

pipelines = {
    'image': [
        SynchronizeSeed(),
        RandomResizedCropRGBImageDecoder((224, 224)),
        tf.ColorJitter(0.2, 0.3),
        SynchronizeSeed(),
        RandomHorizontalFlip(),
        ToTensor(),
        ToTorchImage(),
        ToDevice(0)
    ],
    'mask': [
        SynchronizeSeed(),
        RandomResizedCropRGBImageDecoder((224, 224)),
        SynchronizeSeed(),
        RandomHorizontalFlip(),
        ToTensor(),
        ToDevice(0)
    ]
}

Here, the synchronization needs to be inserted multiple times, since the image pipeline contains an additional color jitter transform. In this particular case, that could easily be solved by reordering the transforms, but reordering will not always be possible.

Second, I am concerned that repeatedly fixing seeds might have a negative impact on the randomness of the remaining transformations in the pipeline.

I would also be in favor of "joint pipelines" as mentioned by @shariqfarooq123, but making joint pipelines possible would probably require a larger restructuring of the library. As a compromise, I could imagine "stateful transforms", which can share an internal state between pipelines.

Let me illustrate this with an example. In the following, two separate instances of RandomHorizontalFlip are used and they are thus independent:

pipelines = {
    'image': [..., RandomHorizontalFlip(), ...],
    'mask': [..., RandomHorizontalFlip(), ...],
}

My proposal would be to make the behavior different if the same instance of the transform is used in both pipelines:

flip = RandomHorizontalFlip(synchronized=True)
pipelines = {
    'image': [..., flip, ...],
    'mask': [..., flip, ...],
}

In this case, the flip in the two pipelines would be synchronized, as the same instance is used. This instance could maintain an internal state to remember the randomly sampled parameters. The state could be initialized during the first call to the transform from one of the pipelines and then re-used in the other ones. It would, of course, need to be reset after the batch or the epoch (if states depend on sample indices).

I have no idea how easy this would be to implement in FFCV, but it still seems easier than joint pipelines while being more user-friendly and less error-prone than having a SynchronizeSeed transform.
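
To make the idea concrete, here is a plain-Python sketch of such a stateful transform, ignoring FFCV's Operation machinery and JIT compilation for clarity (the class and its caching scheme are hypothetical):

import numpy as np


class SyncedRandomHorizontalFlip:
    """One instance shared by both pipelines: the first pipeline to see a
    batch samples the flip decisions, the second reuses them, and the
    cached state is discarded afterwards."""

    def __init__(self, flip_prob: float = 0.5):
        self.flip_prob = flip_prob
        self._decisions = {}  # batch key -> boolean flip decision per sample

    def __call__(self, images: np.ndarray, indices: np.ndarray) -> np.ndarray:
        key = tuple(indices)  # identify the batch by its sample indices
        if key in self._decisions:
            # Second pipeline: reuse the cached decisions, then reset
            flip = self._decisions.pop(key)
        else:
            # First pipeline: sample and cache the decisions for this batch
            flip = np.random.rand(len(indices)) < self.flip_prob
            self._decisions[key] = flip
        out = images.copy()
        out[flip] = out[flip][:, :, ::-1]  # flip along the width axis (NHWC)
        return out

Resetting the state reliably after each batch, and making the caching robust when multiple loader workers are involved, would be the tricky parts.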

GuillaumeLeclerc commented 2 years ago

Thank you @Callidior for the careful thought you have put into this issue.

About SynchronizeSeed: there would be an argument (the seed_offset) so that you can have it multiple times in the pipeline, as long as each synchronized group uses a different value. Based on my understanding of PRNGs, I don't see how synchronizing multiple times would be any worse than doing it once.

As @juliustao pointed out, there is a broader problem here. When using multiple workers, there is no guarantee that they will process samples in the same order, so supposedly synchronized transforms may end up drawing different random values for the same element of the batch. This could be solved by https://github.com/numba/numba/issues/2649.

I like the idea of joint transformations, and I could see multiple potential APIs. One is the stateful approach described above; another would be stateless, where we pass the two inputs simultaneously to the transformation. In both cases there are two potentially major problems, which the replies below touch on: how memory is allocated across pipelines, and how to stay compatible with existing and future augmentations.

Thoughts? (Also @andrewilyas @lengstrom)

lengstrom commented 2 years ago

It seems like joint transformations are the only way to ergonomically and efficiently allow for shared state (even if the seed mechanism worked perfectly, it feels like an opaque/strange way to share state).

GuillaumeLeclerc commented 2 years ago

@lengstrom Any suggestions on how to solve the issues I raised related to joint transforms?

Callidior commented 2 years ago

Would it be too difficult to allocate memory per pipeline using the transform? For each pipeline, declare_state_and_memory would be called and dedicated memory would be allocated. The destination of the transform is already passed to the function created by generate_code, so this would simply be a different destination for each pipeline. Only the memory needed for storing the state would have to be shared across pipelines, and it could be allocated separately.

Or do I get the memory concept totally wrong?
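
If that reading is right, a transform's memory declaration might look roughly like this (SharedStateFlip and the externally allocated decisions array are hypothetical; jit_mode is disabled just to keep the sketch independent of Numba):

from dataclasses import replace
from typing import Callable, Optional, Tuple

import numpy as np

from ffcv.pipeline.allocation_query import AllocationQuery
from ffcv.pipeline.operation import Operation
from ffcv.pipeline.state import State


class SharedStateFlip(Operation):
    """Destination memory is per pipeline (allocated through
    declare_state_and_memory, as usual), while the flip decisions live in
    one shared array referenced by the instance in each pipeline."""

    def __init__(self, shared_decisions: np.ndarray):
        super().__init__()
        # Allocated once, outside FFCV; only this state is shared,
        # not the destination buffers.
        self.shared_decisions = shared_decisions

    def declare_state_and_memory(self, previous_state: State) -> Tuple[State, Optional[AllocationQuery]]:
        # Called once per pipeline: each pipeline gets its own destination
        # buffer matching that pipeline's shape and dtype.
        return (replace(previous_state, jit_mode=False),
                AllocationQuery(previous_state.shape, previous_state.dtype))

    def generate_code(self) -> Callable:
        decisions = self.shared_decisions

        def flip(images, dst):
            # dst is the per-pipeline destination passed in by FFCV;
            # decisions is the state shared across pipelines.
            for i in range(images.shape[0]):
                dst[i] = images[i, :, ::-1] if decisions[i] else images[i]
            return dst

        return flip

Who fills shared_decisions, and in what order, is of course the synchronization problem itself; this only illustrates the memory side.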

GuillaumeLeclerc commented 2 years ago

No, you are right, @Callidior. I guess the only problem is how to make it compatible with all previous and future augmentations.

GuillaumeLeclerc commented 2 years ago

@Callidior I'm currently refactoring the codegen of FFCV to allow for joint operations and other features. I hope it will be done soon. I still think we will also add SyncSeed to enable simple use cases where a user just wants to synchronize two operations without having to write a joint version of them.

Callidior commented 2 years ago

@GuillaumeLeclerc Great! I think the possibility of joint transformations that can share state across pipelines will be an important feature for FFCV. Regarding SyncSeed, the concerns with respect to multiple parallel data loading processes still remain, right?

GuillaumeLeclerc commented 2 years ago

This (and more) is made possible by commit 307ffe0cf3702f5c991b9ed5ca12ac314599edb9. I hope we can make a release with it soon :tada:

fuji2021 commented 1 year ago

@GuillaumeLeclerc Is there an example of how to use this new feature? I am also looking to perform identical transforms on images and targets. Thanks!