mseg-dataset / mseg-semantic

An Official Repo of CVPR '20 "MSeg: A Composite Dataset for Multi-Domain Segmentation"
MIT License

How to train the model? #27

Closed lxtGH closed 3 years ago

lxtGH commented 3 years ago

Hi! @johnwlambert I checked out the training branch, but I cannot find any instructions for training. I wonder how to train the HRNet model in Table 2 of your paper.

Also, where is train-qvga-mix-copy.sh? Where is train-qvga-mix-cd.sh?

It is very confusing.

lxtGH commented 3 years ago

Also, I found that each GPU holds 2 images per dataset during training. How do you balance the samples across nodes, since the datasets have very different sizes?

caiusdebucean commented 3 years ago

Hello, I managed to train the model using `~/mseg-semantic/mseg_semantic/tool/train.py`.

I provided a .yaml file as the --config argument (e.g. mseg_3m.yaml). You can modify a bunch of settings there. By default, as far as I understood, each dataset trains on a separate GPU (check the `dataset_gpu_mapping` entry in the .yaml file). Since the model is trained with distributed data parallel, the batch size is split equally among the GPUs. I believe that means the smaller datasets get to train for more epochs, and the bigger ones for fewer. Maybe this is a solution to combat catastrophic forgetting.
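For intuition, here is a minimal sketch of why smaller datasets end up cycling through more epochs under that scheme. The `dataset_gpu_mapping` key name comes from the config discussed above, but the dataset names, sizes, and iteration counts below are made up for illustration:

```python
# Hypothetical illustration of the per-GPU dataset assignment idea.
# The mapping mimics the `dataset_gpu_mapping` entry in the training
# .yaml; dataset names and sizes here are invented for the example.
dataset_gpu_mapping = {
    "small-dataset": [0],        # GPU 0 trains only on the small dataset
    "large-dataset": [1, 2, 3],  # GPUs 1-3 share the large dataset
}
dataset_sizes = {"small-dataset": 5_000, "large-dataset": 100_000}

crops_per_gpu = 2        # images held per GPU per iteration
total_iters = 500_000    # total training iterations (illustrative)

for name, gpus in dataset_gpu_mapping.items():
    crops_seen = total_iters * crops_per_gpu * len(gpus)
    epochs = crops_seen / dataset_sizes[name]
    print(f"{name}: ~{epochs:.0f} effective epochs")
```

Over the same iteration budget, the small dataset is revisited far more often, so it accumulates many more effective epochs than the large one.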

You can also edit the HRNet architecture in `mseg-semantic/mseg_semantic/model/seg_hrnet.yaml`.

johnwlambert commented 3 years ago

Hi @lxtGH, have you read through the training section of the README? https://github.com/mseg-dataset/mseg-semantic#training-instructions

That will point you to the TRAINING.md page: https://github.com/mseg-dataset/mseg-semantic/blob/master/training.md

johnwlambert commented 3 years ago

You can directly run `python ~/mseg-semantic/mseg_semantic/tool/train.py` with the desired arguments, as @caiusdebucean mentioned. He is correct -- we use DDP, with each dataset on a separate GPU, and gradients are reduced across workers.
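A minimal sketch of that setup, assuming standard PyTorch DDP (the dataset assignment, batch size, and model factory are placeholders, not the repo's actual train.py):

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader

def train_worker(rank: int, world_size: int, datasets, model_fn):
    """Each rank trains on its own dataset; DDP averages gradients."""
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    # One dataset per GPU, mirroring the dataset_gpu_mapping idea.
    loader = DataLoader(datasets[rank], batch_size=2, shuffle=True)

    model = DDP(model_fn().cuda(rank), device_ids=[rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

    for images, labels in loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(
            model(images.cuda(rank)), labels.cuda(rank)
        )
        # backward() triggers the all-reduce: gradients are averaged
        # across ranks, so every GPU's dataset contributes to each step.
        loss.backward()
        optimizer.step()
```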

johnwlambert commented 3 years ago

Please note that we provide code (and pretrained weights) for dozens of different models, so users may be looking for different sorts of training configs.

Which taxonomy, dataset, and resolution are you looking to train for? You can find more details in our paper.

johnwlambert commented 3 years ago

Our HRNet model in Table 2 is trained using the universal taxonomy, for 1 million crops, at 1080p resolution.

lxtGH commented 3 years ago

Thanks for your reply!
I wonder what would happen if all the datasets were merged into one single large dataset (small datasets repeated multiple times, large datasets repeated fewer times). Would the results be different? @johnwlambert I mean: if I concatenate these datasets by padding the small ones (multiplying by a ratio) into one large dataset, then I can train it with 4 GPUs.

johnwlambert commented 3 years ago

The best way to mix datasets together at training time is still a bit of an open research question. We wanted to prevent the large datasets from dominating the domains represented by the smaller ones, and we found that our solution already worked well. But concatenating everything into one big dataset and randomly sampling indices for minibatches would be an interesting comparison.

If you concatenate into one large dataset using multiplicative ratios, as if each dataset were on its own GPU, then in expectation the minibatches should have the same per-dataset ratios, so the results would likely be similar to ours. I cannot guarantee it 100%, but it's likely.
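For example, one way to build such a concatenated dataset in PyTorch (a sketch only; the repeat-factor heuristic is illustrative, not the repo's mixing code):

```python
from torch.utils.data import ConcatDataset

def build_mixed_dataset(datasets):
    """Oversample smaller datasets so each contributes equally in expectation.

    `datasets` is a list of map-style datasets. Each dataset is repeated
    by the ratio of the largest dataset's size to its own, so uniform
    random sampling over the concatenation yields roughly the same
    per-dataset ratios as giving each dataset its own GPU.
    """
    largest = max(len(d) for d in datasets)
    parts = []
    for d in datasets:
        repeat = round(largest / len(d))  # multiplicative ratio, >= 1
        parts.extend([d] * repeat)
    return ConcatDataset(parts)
```

Feeding the result to a single DataLoader with shuffling would then approximate the one-dataset-per-GPU mixing in expectation, as described above.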