mIoU on unseen classes is far below the results reported in the paper

Dear author,

Thank you for sharing your excellent work.

However, when I trained ZegCLIP with the vpt_seg_zero_vit-b_512x512_10k_12_10_st setting, the results were far below those reported in the paper. One difference is that I set batch_size to 8 rather than 16 because of GPU limitations. At the same time, I changed the number of training iterations to 20k. The evaluation results are shown below. Could you tell me what might have caused this gap? I would appreciate any help!
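For reference, here is a minimal sketch of how the two schedules compare. It assumes the "10k" in the config name means 10,000 iterations at batch size 16. The helper names and the base learning rate are hypothetical placeholders, not values taken from the ZegCLIP config. The linear scaling rule shown is a common heuristic, not something the repository is known to apply.

```python
# Hypothetical helpers for comparing two training schedules.

def samples_seen(batch_size: int, iterations: int) -> int:
    """Total number of training samples processed over the whole run."""
    return batch_size * iterations


def linearly_scaled_lr(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Scale the learning rate in proportion to the batch size (linear scaling heuristic)."""
    return base_lr * new_batch_size / base_batch_size


# Assumed reference schedule: batch size 16 for 10k iterations.
reference = samples_seen(batch_size=16, iterations=10_000)
# Modified schedule from this issue: batch size 8 for 20k iterations.
modified = samples_seen(batch_size=8, iterations=20_000)
print(reference == modified)  # both runs process the same number of samples

# Placeholder value only; substitute the learning rate from the actual config.
PLACEHOLDER_BASE_LR = 1e-4
print(linearly_scaled_lr(PLACEHOLDER_BASE_LR, base_batch_size=16, new_batch_size=8))
```

Under these assumptions both runs process the same number of samples, so the remaining differences would be the per-step batch size and any learning-rate schedule tied to the iteration count.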

ZiqinZhou66/ZegCLIP, issue #20