Questions about YouTubeVOS training

Hi! Thanks for your great work. I have three questions about training on the YouTube-VOS dataset.

Do you use training videos from YouTubeVOS-2018 or YouTubeVOS-2019?
Do you train the model with full frames or sampled frames?
You mentioned in issue 6 that you use random resizing and cropping for data augmentation. My understanding is as follows: for a given input frame (most are 720x1280), the short side is resized to a random length between 384 and its original length (720), and the long side is scaled to preserve the aspect ratio. A 384 x 384 region is then randomly cropped. You also apply zoom ratios from 0.9 to 1.1 to height and width independently. Please correct me if I am wrong. Is this procedure equivalent to the RandomResizedCrop function in torchvision?
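To make the last question concrete, here is a minimal sketch of the augmentation as I understand it. It only samples the resize and crop parameters and uses the standard library alone. The function name `sample_described_crop`, the order of the zoom step relative to the resize, and the clamp that keeps the crop window inside the frame are my own assumptions, not taken from the STM code.

```python
import random


def sample_described_crop(height, width, crop=384, zoom=(0.9, 1.1), rng=random):
    """Sample resize and crop parameters for the augmentation described above.

    Hypothetical helper: the step order and the clamp are assumptions.
    Returns ((new_h, new_w), (top, left, crop_h, crop_w)).
    """
    short = min(height, width)
    # Step 1: pick a new short-side length in [crop, short]; the long side
    # follows from the same scale factor, so the aspect ratio is preserved.
    new_short = rng.randint(crop, max(crop, short))
    scale = new_short / short
    # Step 2: perturb height and width independently by a zoom ratio.
    new_h = round(height * scale * rng.uniform(*zoom))
    new_w = round(width * scale * rng.uniform(*zoom))
    # Assumed guard: zooming out must not make the frame smaller than the crop.
    new_h, new_w = max(new_h, crop), max(new_w, crop)
    # Step 3: choose the top-left corner of a crop x crop window.
    top = rng.randint(0, new_h - crop)
    left = rng.randint(0, new_w - crop)
    return (new_h, new_w), (top, left, crop, crop)


if __name__ == "__main__":
    size, box = sample_described_crop(720, 1280)
    print("resized frame (h, w):", size)
    print("crop box (top, left, h, w):", box)
```

As far as I can tell, torchvision's RandomResizedCrop works in the opposite order: it samples a crop whose area is a random fraction of the image and whose aspect ratio is random, then resizes that crop to the output size. If that reading is right, the two procedures would not be strictly equivalent, which is why I am asking.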
