SHI-Labs / OneFormer

OneFormer: One Transformer to Rule Universal Image Segmentation, arXiv 2022 / CVPR 2023
https://praeclarumjj3.github.io/oneformer
MIT License

Few Questions regarding training on custom data #40

Closed: rono221 closed this issue 1 year ago

rono221 commented 1 year ago

Hi, I am trying to train OneFormer on a custom dataset and I was able to start the training. But I have a few questions about choosing the right settings. Currently, I reused the ADE20K config file after editing the number of classes, iterations, and batch size.

1) What does DETECTIONS_PER_IMAGE do, and how do I choose the right value?
2) How do I choose the right crop size, and will it impact training or prediction time?
3) I have 20k labeled images and I am training on 4 NVIDIA A100 40GB GPUs with batch size 4. What is the minimum number of iterations required to get good results?

rono221 commented 1 year ago

@praeclarumjj3 could you please help?

praeclarumjj3 commented 1 year ago

Hi @rono221, thanks for your interest in our work. To answer your questions:

  1. DETECTIONS_PER_IMAGE is the number of top-scoring queries, out of the total number of queries, that are kept at inference time for the final instance segmentation predictions (see the short sketch after this list). Usually, it's fine to set it equal to NUM_QUERIES.
     https://github.com/SHI-Labs/OneFormer/blob/761189909f392a110a4ead574d85ed3a17fbc8a7/oneformer/oneformer_model.py#L214
     https://github.com/SHI-Labs/OneFormer/blob/761189909f392a110a4ead574d85ed3a17fbc8a7/oneformer/oneformer_model.py#L445

  2. The right crop size depends on your use case. If you expect high-resolution (or low-resolution) inputs during inference, it's beneficial to train at a comparable resolution. The crop size is only used during training, so a larger crop will increase training time but not prediction time.

  3. It also depends on the number of classes in your dataset, but since your dataset is similar in size to ADE20K, I would recommend training for 160k iterations to establish a baseline (a rough config sketch covering these settings is below).
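
To make the first point concrete, here is a simplified sketch (not the exact code from oneformer_model.py) of what DETECTIONS_PER_IMAGE controls: each query predicts class scores, and only the top-k scoring (query, class) pairs are kept as final instance predictions.

```python
import torch

# Simplified illustration of DETECTIONS_PER_IMAGE (variable names are
# illustrative, not OneFormer's exact internals).
num_queries, num_classes = 150, 100
detections_per_image = 150  # usually set equal to the number of queries

scores = torch.rand(num_queries, num_classes).softmax(-1)  # per-query class scores
topk_scores, topk_indices = scores.flatten(0, 1).topk(detections_per_image, sorted=False)
labels = topk_indices % num_classes          # predicted class for each kept detection
query_indices = topk_indices // num_classes  # which query each detection comes from
```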
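
And here is a rough sketch of the overrides discussed above when adapting an ADE20K config to a custom dataset. The helper names, the config path, and the example values (20 classes, 640 crop, batch 16) are placeholders written from memory and may not match your version exactly, so double-check every key name against the configs and training script in your checkout.

```python
from detectron2.config import get_cfg
from detectron2.projects.deeplab import add_deeplab_config

# Assumed helper names for registering OneFormer's config keys; verify them
# against the demo/training scripts in your checkout.
from oneformer import add_common_config, add_swin_config, add_oneformer_config

cfg = get_cfg()
add_deeplab_config(cfg)
add_common_config(cfg)
add_swin_config(cfg)
add_oneformer_config(cfg)
cfg.merge_from_file("configs/ade20k/your_copied_config.yaml")  # placeholder path

cfg.MODEL.SEM_SEG_HEAD.NUM_CLASSES = 20   # example: number of classes in the custom dataset
cfg.INPUT.CROP.ENABLED = True
cfg.INPUT.CROP.SIZE = (640, 640)          # train-time crop; match your expected test resolution
cfg.INPUT.MIN_SIZE_TEST = 640
cfg.SOLVER.IMS_PER_BATCH = 16             # total batch size across GPUs (ADE20K baseline uses 16)
cfg.SOLVER.MAX_ITER = 160000              # ADE20K-scale baseline schedule
cfg.TEST.DETECTIONS_PER_IMAGE = cfg.MODEL.ONE_FORMER.NUM_OBJECT_QUERIES  # key name assumed
```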

praeclarumjj3 commented 1 year ago

Closing this; feel free to re-open.