Open Lyman-Smoker opened 2 weeks ago
Hello, thanks for your interest and suggestions.
We chose to retain the original image resolution during training. In this setting, a batch size larger than 1 is complex to support, since we would have to combine visual feature maps with different resolutions.
Currently, training takes around 24 hours on 8 A6000 GPUs, which is still acceptable.
We will consider supporting a batch size larger than 1 in our next release.
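For reference, one common way to batch feature maps of different spatial sizes is to zero-pad each map to the largest height and width in the batch and carry a validity mask alongside, so padded positions can be excluded from attention or pooling. This is not code from this repo, just a minimal numpy sketch of that idea; the function name and shapes are illustrative assumptions.

```python
import numpy as np

def pad_and_batch(feature_maps):
    """Hypothetical collate helper: zero-pad (C, H, W) feature maps to a
    shared spatial size and stack them into one (B, C, H_max, W_max) array.

    Also returns a boolean mask marking which positions hold real features,
    so downstream layers can ignore the padding.
    """
    c = feature_maps[0].shape[0]
    h_max = max(f.shape[1] for f in feature_maps)
    w_max = max(f.shape[2] for f in feature_maps)

    batch = np.zeros((len(feature_maps), c, h_max, w_max),
                     dtype=feature_maps[0].dtype)
    mask = np.zeros((len(feature_maps), h_max, w_max), dtype=bool)
    for i, f in enumerate(feature_maps):
        _, h, w = f.shape
        batch[i, :, :h, :w] = f   # copy the real features into the corner
        mask[i, :h, :w] = True    # mark those positions as valid
    return batch, mask

# Two feature maps at different resolutions -> one padded batch.
maps = [np.ones((8, 30, 40), dtype=np.float32),
        np.ones((8, 45, 25), dtype=np.float32)]
batch, mask = pad_and_batch(maps)
print(batch.shape)  # (2, 8, 45, 40)
```

The trade-off is wasted compute on the padded regions, which is presumably why keeping native resolutions with batch size 1 was chosen here.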
In the current implementation, the batch size for quality tasks must be 1 during training.
However, training with batch_size_per_gpu=1 does not seem fast enough. Is there any solution to this problem?