training data augmentations

Thank you for your work! It seems that data augmentations play a significant role in achieving low EPE on validation datasets. Could you please give more details regarding the data augmentations used to train PWC-Net on FlyingChairs? Specifically, I'm interested in the following:

1. Do you apply all augmentations to every training example, or do you apply each augmentation with some probability?
2. Do you apply the augmentations in a fixed order or shuffle them each time? For instance, the order of the scaling and translation augmentations matters, because it changes the range of the resulting flow (the range is larger if translation is applied first).
3. The paper states "we apply the same strong geometric transformation to both images of a pair, but additionally a smaller relative transformation between the two images." Which augmentations make up the "smaller relative transformation"? Only the geometric ones, or the color augmentations as well?
4. Is there any point in applying the same translation to both images, if all it effectively does is crop them?
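
To make the order and shared-translation questions concrete, here is a minimal NumPy sketch of what I mean. It is only an illustration under my own assumptions: the original flow is zero, the relative transformation is applied to the second image only, and all function names are hypothetical (they are not taken from the PWC-Net code).

```python
import numpy as np


def scale_then_translate(points, scale, shift):
    # Scale first, then translate: p -> scale * p + shift
    return points * scale + shift


def translate_then_scale(points, scale, shift):
    # Translate first, then scale: p -> scale * (p + shift)
    return (points + shift) * scale


def induced_flow(points, transform):
    # Flow created when `transform` moves the second image only,
    # assuming the original flow between the two images is zero.
    return transform(points) - points


def flow_after_shared_shift(points, flow, shift):
    # Apply the same translation to both images and recompute the flow.
    first = points + shift
    second = points + flow + shift
    return second - first


# Two example pixel positions (x, y), relative to the image center.
points = np.array([[0.0, 0.0], [10.0, -5.0]])
scale = 1.5
shift = np.array([4.0, 0.0])

flow_a = induced_flow(points, lambda p: scale_then_translate(p, scale, shift))
flow_b = induced_flow(points, lambda p: translate_then_scale(p, scale, shift))

# The two orders differ by (scale - 1) * shift at every pixel.
order_gap = flow_b - flow_a

# A translation shared by both images leaves the flow values unchanged.
original_flow = np.array([[1.0, 2.0], [-3.0, 0.5]])
shared = flow_after_shared_shift(points, original_flow, shift)
```

In this sketch the translate-first order multiplies the translation by the scale factor, so for a scale above 1 the flow range grows, and the shared translation cancels out of the flow entirely, which is why I suspect it only acts as a crop.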
