libffcv / ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)
https://ffcv.io
Apache License 2.0

Same transform on image and targets? #305

Open fuji2021 opened 1 year ago

fuji2021 commented 1 year ago

I was looking for a way to apply identical transforms to an image and its target but could not find documentation or a tutorial on it. I came across a related issue, https://github.com/libffcv/ffcv/issues/86, which seems to have been addressed, but I'm not sure what the recommended way to use it in practice is.

Would appreciate any suggestions!

andrewilyas commented 1 year ago

Hi @fuji2021 ! You are correct that this is now possible in FFCV, although we are completely missing any documentation on how to use it (we're all a bit swamped right now but will hopefully get to it soon -- also accepting pull requests for docs as per #295 ).

In the meantime, hopefully this minimal example helps? I believe the API looks something like this: https://github.com/libffcv/ffcv/issues/77#issuecomment-1030250507

fuji2021 commented 1 year ago

@andrewilyas Thanks for the reply! I tried looking at the changes from #86 but couldn't tell from the comments or the test cases how to use the new feature. The comment you linked to has an example in which you define a common pipeline first and then apply it to two different images. Would that apply the same transform to both images, even when the transforms are random? I'll be happy to add to the docs once I understand this right.

andrewilyas commented 1 year ago

@fuji2021 I may have misunderstood your intended application - do you want the exact same random transformation to be applied to the image and the target (including randomness)? Can you provide more details on your use case?

fuji2021 commented 1 year ago

@andrewilyas My use case is exactly as in #86 : both the image and the label go through the same transform (incl. random transform) for a dense prediction task. An example is image segmentation. You apply random horizontal/vertical flip to both the image and the label. You either flip both of them, or you don't. You also apply other transforms like brightness, contrast, etc.
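To make the requirement concrete, here is a minimal sketch in plain NumPy (not FFCV code; `paired_random_flip` and `jitter_brightness` are hypothetical names used only for illustration). It assumes an `(H, W, C)` float image in `[0, 1]` and an `(H, W)` label mask. The flip decision is drawn once and applied to both arrays, while the photometric change touches only the image:

```python
import numpy as np


def paired_random_flip(image, mask, rng, p=0.5):
    """Flip image and mask together, using one shared random draw per axis."""
    if rng.random() < p:
        # Horizontal flip: reverse the width axis of both arrays.
        image = image[:, ::-1]
        mask = mask[:, ::-1]
    if rng.random() < p:
        # Vertical flip: reverse the height axis of both arrays.
        image = image[::-1]
        mask = mask[::-1]
    # Slicing with a negative step returns views; make them contiguous.
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)


def jitter_brightness(image, rng, max_delta=0.2):
    """Photometric change: applied to the image only, never to the labels."""
    delta = rng.uniform(-max_delta, max_delta)
    return np.clip(image + delta, 0.0, 1.0)


rng = np.random.default_rng(0)
image = rng.random((4, 6, 3))
mask = (image[..., 0] > 0.5).astype(np.uint8)

image_aug, mask_aug = paired_random_flip(image, mask, rng)
image_aug = jitter_brightness(image_aug, rng)
```

The difficulty in FFCV is that the image and the label are processed by separate per-field pipelines, so there is no single function like this in which the draw can be shared.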

fuji2021 commented 1 year ago

@andrewilyas please let me know if you need more details about the question. Thanks for your time!

andrewilyas commented 1 year ago

Hi @fuji2021 ! I will dig around for a code sample and try to get back to you asap.

jufrick commented 1 year ago

Hi @andrewilyas, do you have any update on this? I have a similar dataset to @fuji2021's and am also interested in applying the same randomness in both pipelines. Thanks for your work and time!

jufrick commented 1 year ago

> @andrewilyas My use case is exactly as in #86 : both the image and the label go through the same transform (incl. random transform) for a dense prediction task. An example is image segmentation. You apply random horizontal/vertical flip to both the image and the label. You either flip both of them, or you don't. You also apply other transforms like brightness, contrast, etc.

Hi @andrewilyas, I am still unsure how to apply the same random transform to both image and target (say, an image and a depth map) when using random transforms for augmentation. Do you by any chance have an example I could look at? Thank you for the great work here!

IlyaOvodov commented 7 months ago

Hello! Is there any news about this issue? We ran into this problem when trying to apply FFCV not only to object detection and segmentation tasks, but even to a classification task in which flipping an image must change its label. If you have any idea how to improve FFCV so that random transforms can be applied synchronously to image and label, I can implement it myself and make a PR.

andrewilyas commented 7 months ago

Hi everyone! For some cases, I think the PipelineSpec API in https://github.com/libffcv/ffcv/issues/77#issuecomment-1030250507 will do the trick, although sadly it is still super undocumented (our fault).

Until we get more documentation/clarity on this issue though, one workaround is to use a with_indices transformation (https://docs.ffcv.io/ffcv_examples/transform_with_inds.html) and then seed the numba RNG with a hash of the indices. A very basic version of this is done in the mixup augmentation, where instead of a hash we just use the last index in the batch: https://github.com/libffcv/ffcv/blob/6c3be0cabf1485aa2b6945769dbd1c2d12e8faa7/ffcv/transforms/mixup.py#L40. Hopefully this answers some questions.
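The idea behind the index-seeding workaround can be sketched in plain NumPy. This is an illustration only, under stated assumptions: it does not use FFCV's operation API or Numba, and `flip_decisions`, `flip_images`, `flip_masks`, and `epoch_seed` are hypothetical names. Each pipeline derives its flip decision from the sample index alone, so two pipelines that never communicate still make the same choice for the same sample:

```python
import numpy as np


def flip_decisions(indices, epoch_seed, p=0.5):
    """One flip/no-flip decision per sample, derived only from its index."""
    decisions = np.empty(len(indices), dtype=np.bool_)
    for i, idx in enumerate(indices):
        # Seeding with [epoch_seed, index] plays the role of the index hash:
        # the same pair always yields the same random stream.
        rng = np.random.default_rng([int(epoch_seed), int(idx)])
        decisions[i] = rng.random() < p
    return decisions


def flip_images(images, indices, epoch_seed):
    """Image-side pipeline step: images has shape (B, H, W, C)."""
    out = images.copy()
    flip = flip_decisions(indices, epoch_seed)
    out[flip] = out[flip][:, :, ::-1]  # reverse the width axis
    return out


def flip_masks(masks, indices, epoch_seed):
    """Label-side pipeline step: masks has shape (B, H, W)."""
    out = masks.copy()
    flip = flip_decisions(indices, epoch_seed)
    out[flip] = out[flip][:, :, ::-1]  # reverse the width axis
    return out


indices = np.array([3, 17, 42, 8])
images = np.random.default_rng(0).random((4, 5, 6, 3))
masks = (images[..., 0] > 0.5).astype(np.uint8)

images_aug = flip_images(images, indices, epoch_seed=1)
masks_aug = flip_masks(masks, indices, epoch_seed=1)
```

Passing a different `epoch_seed` each epoch changes the decisions between epochs while keeping image and label in step. In an actual FFCV pipeline the same logic would have to live inside the compiled transform that receives the indices.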

jufrick commented 7 months ago

Hi @IlyaOvodov, check out how a team at Facebook Research has extended this great library to work better for tasks like yours; they should have your use case implemented, with the same transform/randomness applied to image and target/label.

See here.