xuebinqin / DIS

This is the repo for our new project, Highly Accurate Dichotomous Image Segmentation.
Apache License 2.0

Training and fine-tuning suggestions #88

Open przemb opened 9 months ago

przemb commented 9 months ago

Hello, @xuebinqin, thanks for open-sourcing your work. Could you please share some training tips and answer the questions below?

I am using the default configuration:

```python
hypar["early_stop"] = 20
hypar["batch_size_train"] = 8  ## batch size for training
hypar["batch_size_valid"] = 1  ## batch size for validation and inferencing
```
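For context, an `early_stop` value like this is typically a patience counter: training halts once the best validation score is older than that many validation cycles. A minimal sketch of that logic (the function name and signature are illustrative, not taken from the repo):

```python
def should_stop(val_scores, patience=20):
    """Return True when the best validation score occurred more than
    `patience` validation cycles ago (i.e. no recent improvement)."""
    if not val_scores:
        return False
    # Index of the best score seen so far
    best_idx = max(range(len(val_scores)), key=val_scores.__getitem__)
    # Stop when `patience` or more evaluations have passed without a new best
    return len(val_scores) - 1 - best_idx >= patience
```

With `patience=20`, a run whose score peaked 20 validations ago would be stopped, while a run that improved on the most recent validation keeps going.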

a) training from scratch: Suppose I want to train a model from scratch on DIS to reproduce the results provided in `isnet.pth`.

  1. Approximately how many iterations (`ite_num`) are needed to achieve the same results?
  2. Do I need to train `ISNetGTEncoder`?

b) fine-tuning: I would like to fine-tune a model on a custom dataset from the medical domain.

  1. Which weights do you recommend for fine-tuning: `isnet.pth`, or `isnet-general`, which has higher performance but was optimized?
  2. Given that I will switch to another domain, do I need to train `ISNetGTEncoder`?

chrbruckmann commented 9 months ago

I am also curious about reproducing the work to see if I can use it to its full potential. I can answer part of your question: it should be around 100k iterations, since on page 23 of their paper they write: “According to our experiments, the training process of our ground truth encoder is easy to converge, and it usually takes only 1,000 iterations (stop training when the valid maxF is greater than 0.99). While the segmentation component of our model usually converges after around 100k iterations, and the whole training process takes less than 48 hours.”
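The GT-encoder stopping rule quoted above ("stop training when the valid maxF is greater than 0.99") amounts to a simple threshold check after each validation pass. A hedged sketch, where `train_step` and `validate_maxf` are hypothetical callables standing in for the repo's actual training and validation routines:

```python
def train_gt_encoder(train_step, validate_maxf,
                     max_iters=10_000, maxf_threshold=0.99, valid_every=100):
    """Run training iterations; stop once validation maxF clears the threshold.

    train_step     -- callable performing one training iteration (hypothetical)
    validate_maxf  -- callable returning the current validation maxF (hypothetical)
    """
    for it in range(1, max_iters + 1):
        train_step()
        # Validate periodically rather than every iteration
        if it % valid_every == 0 and validate_maxf() > maxf_threshold:
            return it  # converged: maxF passed the paper's 0.99 bar
    return max_iters
```

Per the quoted passage, this loop would typically terminate after roughly 1,000 iterations for the ground-truth encoder.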

For reproducing the results I also wonder about the learning-rate schedule. The initial learning rate and optimizer settings were, according to the paper: "(initial learning rate lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight decay=0)". But it was not mentioned how or when the learning rate was changed. If I don't change it manually, the model stops converging around 15k iterations.
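Since the paper does not state a schedule, one common workaround (an assumption on my part, not the authors' documented method) is a step decay: multiply the learning rate by a fixed factor every N iterations. A minimal, framework-agnostic sketch with illustrative decay values:

```python
def stepped_lr(iteration, base_lr=1e-3, decay_factor=0.1, decay_every=50_000):
    """Step-decay schedule: scale base_lr by decay_factor
    once per decay_every iterations."""
    return base_lr * decay_factor ** (iteration // decay_every)
```

In PyTorch the same idea is available as `torch.optim.lr_scheduler.StepLR(optimizer, step_size, gamma)`; a `ReduceLROnPlateau` scheduler keyed on the validation maxF would be another reasonable guess for breaking the ~15k-iteration stall.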