uncbiag / SimpleClick

SimpleClick: Interactive Image Segmentation with Simple Vision Transformers (ICCV 2023)
MIT License

instance_loss_k & instance_loss_m #17

Closed: che1007 closed this issue 7 months ago

che1007 commented 1 year ago

I would like to ask about the training loss. Is it normal that instance_loss_k and instance_loss_m increase during training? [screenshot: training-loss curves]

qinliuliuqin commented 1 year ago

Yes, it's normal. The two metrics should start from 0. Check out here for details.

che1007 commented 1 year ago

Does your instance loss keep increasing during training? [screenshot: training-loss curves] I just modified the batch size to 8.

che1007 commented 1 year ago

Though the instance loss keeps increasing, the IoU also keeps increasing. Is that normal?

qinliuliuqin commented 1 year ago

Actually, I didn't monitor these intermediate metrics, as they are not involved in the loss backpropagation. It may take me some time to read the code in detail, since these metrics are inherited from RITM. How about your testing performance?
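For context, here is a minimal sketch of how an RITM-style loss can track such monitoring-only scalars. The class name, EMA factors, and normalization are my assumptions modeled loosely on RITM's normalized focal loss, not the exact SimpleClick code:

```python
import torch
import torch.nn as nn


class DiagnosticFocalLoss(nn.Module):
    """Focal-style loss that also tracks two scalar diagnostics.

    The trackers start at 0 and are updated with an exponential moving
    average (EMA); they are logged as `<name>_k` / `<name>_m` but never
    enter the backward pass. (Sketch only; names are hypothetical.)
    """

    def __init__(self, gamma=2.0):
        super().__init__()
        self.gamma = gamma
        self._k_sum = 0.0  # EMA of the mean normalization coefficient
        self._m_max = 0.0  # EMA of the max focal term

    def forward(self, pred, label):
        p = torch.sigmoid(pred)
        pt = torch.where(label > 0.5, p, 1 - p)
        beta = (1 - pt) ** self.gamma

        # Normalization coefficient, detached so it acts as a constant
        # scale factor rather than a gradient path.
        mult = (beta.numel() / (beta.sum() + 1e-8)).detach()

        # Update diagnostics under no_grad: they are monitoring-only and
        # do not affect the gradients in any way.
        with torch.no_grad():
            self._k_sum = 0.9 * self._k_sum + 0.1 * mult.item()
            self._m_max = 0.8 * self._m_max + 0.2 * beta.max().item()

        return -(mult * beta * torch.log(pt.clamp(min=1e-8))).mean()

    def log_states(self, writer, name, global_step):
        # e.g. name='instance_loss' yields instance_loss_k / instance_loss_m
        writer.add_scalar(name + '_k', self._k_sum, global_step)
        writer.add_scalar(name + '_m', self._m_max, global_step)
```

Because both trackers start at 0 and average strictly positive quantities via an EMA, they climb from 0 early in training even while the actual training loss behaves normally, which is consistent with the curves in the screenshots above.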

che1007 commented 1 year ago

I trained the sbd_plainvit_base448 model on a single 2080 Ti, only modifying the batch size to 8, and trained for 55 epochs. [screenshot: evaluation results]

qinliuliuqin commented 1 year ago

It's comparable to my results shown in Tab. 2 of the paper. My batch size for ViT-B was 140, and I used 4 A6000 GPUs. Since your batch size is 8, you ran 140/8 = 17.5 times more training iterations than I did.

che1007 commented 1 year ago

From your comment, it sounds like I need to train for more epochs than you did.

qinliuliuqin commented 1 year ago

No, I didn't mean that. I meant that your training ran many more iterations than mine but got slightly worse results. We both trained the ViT-B model for 55 epochs but with different batch sizes (140 vs. 8), so we ran different numbers of training iterations (55N/140 vs. 55N/8, where N is the number of training images).
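To make the arithmetic concrete, here is a quick sketch (N = 8498 for the SBD training split is my assumption; substitute your own dataset size):

```python
# Iteration-count comparison: 55 epochs for both runs,
# N = 8498 SBD training images (assumed; check your dataset).
N, epochs = 8498, 55

iters_paper = epochs * N / 140   # batch size 140 -> ~3,339 iterations
iters_repro = epochs * N / 8     # batch size 8   -> ~58,424 iterations

print(iters_repro / iters_paper)  # 17.5, i.e. exactly 140 / 8
```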

che1007 commented 1 year ago

Is it possible to get performance similar to the paper's when training on a single 2080 Ti?

qinliuliuqin commented 1 year ago

Your results are very close to mine given the differences in our training environments.