We use a Tesla V100 with 32 GB of memory. For GPUs with less memory, you can reduce the batch size or the patch size to avoid this issue. I think a patch size of 200 would work, as long as the size is a multiple of 4.
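If it helps, here is a minimal sketch (not code from the repository) for picking a smaller patch size that still satisfies the multiple-of-4 constraint mentioned above:

```python
def nearest_valid_patch_size(desired: int, multiple: int = 4) -> int:
    """Round a desired patch size down to the nearest multiple of 4 (per the note above)."""
    return (desired // multiple) * multiple

print(nearest_valid_patch_size(200))  # 200 (already a multiple of 4)
print(nearest_valid_patch_size(190))  # 188
```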
Will adjusting batch size or patch size affect the performance of the final model? Thanks.
From my experience, patch size and batch size can influence the final performance, but the effect won't be significant since we have no batch norm layers in our network. You can refer to Sec. 4 in the supplementary file of ESRGAN.
Hi, I noticed that total_epoch = (total_iteration * batch_size_per_gpu * world_size) / (train_set_len * dataset_enlarge_ratio), where world_size = num_gpu. In your default setting, total_iteration = 1000000, batch_size_per_gpu = 8, world_size = 2, dataset_enlarge_ratio = 20, and train_set_len (18144) always stays the same if I use the data you provided. If I use only 1 GPU for training, it seems I should double the batch size or the number of training iterations to keep total_epoch the same?
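For a quick sanity check, here is the arithmetic under the default setting (values taken from above):

```python
def total_epoch(total_iteration, batch_size_per_gpu, world_size,
                train_set_len=18144, dataset_enlarge_ratio=20):
    # total_epoch = (updates * effective batch size) / (effective dataset size)
    return (total_iteration * batch_size_per_gpu * world_size) / (train_set_len * dataset_enlarge_ratio)

print(total_epoch(1_000_000, 8, 2))   # default 2-GPU setting  -> ~44.1 epochs
print(total_epoch(1_000_000, 8, 1))   # single GPU, same config -> ~22.0 epochs
print(total_epoch(2_000_000, 8, 1))   # doubled iterations      -> ~44.1 epochs
print(total_epoch(1_000_000, 16, 1))  # doubled batch size      -> ~44.1 epochs
```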
There is a subtle difference between iteration-based and epoch-based training. The iteration count determines the number of parameter updates, while an epoch counts how many times the whole dataset is used. Based on our experience, the exact number of training epochs does not matter much, since datasets for low-level vision are commonly small (1k~10k images). Anyway, it depends on your own preference. If you care more about the total number of epochs, just double the batch size or the number of training iterations.
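To illustrate the distinction, a minimal iteration-based loop looks roughly like this (an illustrative sketch, not code from the repository; `model_step` and `dataloader` are placeholders). The number of updates is fixed, and the epoch count is simply a by-product of that budget:

```python
def train_iteration_based(model_step, dataloader, total_iterations):
    """Run a fixed number of updates; the epoch count is a by-product, not a target."""
    it = 0
    while it < total_iterations:
        for batch in dataloader:   # one full pass over the data = one epoch
            model_step(batch)      # one optimizer update per iteration
            it += 1
            if it >= total_iterations:
                break
    # epochs completed ≈ total_iterations * batch_size / len(dataset)
```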
Thank you.
Hi, what GPU do you use in training? Do you have any suggestions for this issue?
Hi, I changed the gt_size in DISCNet_train.yml from 256 to 200, and the training code seems to work. But I am not sure whether such a modification is recommended?
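For reference, the edit in question corresponds to something like the following sketch (the file path and key layout follow BasicSR-style configs and may differ in your checkout):

```python
import yaml

# Hypothetical path; adjust to wherever DISCNet_train.yml lives in your copy of the repo.
with open('options/train/DISCNet_train.yml') as f:
    opt = yaml.safe_load(f)

opt['datasets']['train']['gt_size'] = 200   # was 256; keep it a multiple of 4

with open('options/train/DISCNet_train_200.yml', 'w') as f:
    yaml.safe_dump(opt, f)
```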