Sec. 3.1 states: "In terms of SR, we utilize bicubic interpolation to obtain low-resolution images. As for denoising and deraining, Gaussian noises (on RGB space) and rain streaks are directly added to the clean images."
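For anyone who wants to try this, here is a minimal sketch of how the corrupted inputs could be synthesized from a clean image. It is not the authors' code: the function names, the scale factor, the noise sigma, and the seed are all assumptions chosen for illustration, and it relies on NumPy and Pillow. Rain streaks are left out because the quoted passage does not say how they are generated.

```python
import numpy as np
from PIL import Image


def make_sr_input(clean: Image.Image, scale: int = 2) -> Image.Image:
    """Bicubic-downsample a clean image to get a low-resolution SR input."""
    w, h = clean.size
    return clean.resize((w // scale, h // scale), Image.BICUBIC)


def add_gaussian_noise(clean: Image.Image, sigma: float = 30.0, seed: int = 0) -> Image.Image:
    """Add i.i.d. Gaussian noise to every RGB value, then clip to [0, 255]."""
    rng = np.random.default_rng(seed)
    pixels = np.asarray(clean.convert("RGB"), dtype=np.float32)
    noisy = pixels + rng.normal(0.0, sigma, size=pixels.shape)
    return Image.fromarray(np.clip(noisy, 0, 255).round().astype(np.uint8))


# Stand-in for a clean ImageNet crop (assumed 48x48 here, uniform gray).
clean = Image.fromarray(np.full((48, 48, 3), 128, dtype=np.uint8))
lr = make_sr_input(clean, scale=2)             # (low-res input, clean target) pair for SR
noisy = add_gaussian_noise(clean, sigma=30.0)  # (noisy input, clean target) pair for denoising
```

Each training pair is then the corrupted image as input and the original clean image as target.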
Yes. Sec. 3.1 states: "The fine-tuning is performed on a single task, where the model is initialized with the pre-trained task-specific encoder and decoder as well as the shared transformer body."
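One way to picture that initialization is as a filter over the pre-trained checkpoint that keeps the shared body and only the encoder/decoder of the task being fine-tuned. The sketch below uses a plain dict in place of a real checkpoint; the key prefixes (`heads.`, `body.`, `tails.`) and task names are hypothetical and may not match the actual model's naming.

```python
def select_task_weights(checkpoint: dict, task: str) -> dict:
    """Keep the shared body plus the head/tail belonging to a single task."""
    # Hypothetical key layout: "heads.<task>.*", "body.*", "tails.<task>.*".
    keep = ("body.", f"heads.{task}.", f"tails.{task}.")
    return {k: v for k, v in checkpoint.items() if k.startswith(keep)}


# Toy checkpoint; the integer values stand in for weight tensors.
pretrained = {
    "heads.sr_x2.conv.weight": 1,
    "heads.denoise_30.conv.weight": 2,
    "body.encoder.layer0.weight": 3,
    "tails.sr_x2.conv.weight": 4,
    "tails.denoise_30.conv.weight": 5,
}

# Weights used to initialize fine-tuning on the (hypothetical) "sr_x2" task.
init = select_task_weights(pretrained, "sr_x2")
```

With a real framework, the filtered dict would be loaded into a single-task model before fine-tuning starts.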
How do I pretrain the model on the ImageNet dataset?