MaverickRen / PixelLM

PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding. PixelLM was accepted at CVPR 2024.
Apache License 2.0

Inconsistent annotation in MUSE train #6

Open HoYinTam opened 8 months ago

HoYinTam commented 8 months ago

Thank you very much for open-sourcing such wonderful work!

While training with "multi_reason_seg", I found that "gt_mask.shape[0] != pred_mask.shape[0]" in some cases.

For example,

There are only 2 categories and 2 masks in the conversation: "teddy bear" and "log".

But there are 2 categories and 3 masks in the annotation, 1 for "teddy bear" and 2 for "log".

Therefore, "gt_mask.shape[0] != pred_mask.shape[0]". I had to comment out https://github.com/MaverickRen/PixelLM/blob/main/model/PixelLM.py#L586 to make things work.
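One possible workaround, instead of commenting out the check, would be to merge masks that share a category before they are compared with the predictions. The following is a minimal sketch under that assumption; the function name and the one-label-per-mask layout are hypothetical and not taken from the PixelLM code, and whether merging is the right semantics for MUSE is also an assumption:

```python
def merge_masks_by_category(categories, masks):
    """Union all binary masks that share a category name.

    categories: list of str, one label per mask (hypothetical layout).
    masks: list of 2D lists of 0/1 values, all with the same shape.
    Returns (unique_categories, merged_masks) in first-seen order.
    """
    if len(categories) != len(masks):
        raise ValueError("expected exactly one category label per mask")

    merged = {}  # dicts keep insertion order in Python 3.7+
    for name, mask in zip(categories, masks):
        if name not in merged:
            # Copy the rows so the caller's mask is not modified.
            merged[name] = [row[:] for row in mask]
        else:
            # Pixel-wise OR with the mask already stored for this category.
            merged[name] = [
                [a | b for a, b in zip(old_row, new_row)]
                for old_row, new_row in zip(merged[name], mask)
            ]
    return list(merged.keys()), list(merged.values())


# Toy version of the case above: 1 mask for "teddy bear", 2 for "log".
categories = ["teddy bear", "log", "log"]
masks = [
    [[1, 0], [0, 0]],
    [[0, 1], [0, 0]],
    [[0, 0], [0, 1]],
]
names, merged_masks = merge_masks_by_category(categories, masks)
print(names, len(merged_masks))
```

With this toy input, the three annotation masks collapse to two, one per category, which matches the number of targets in the conversation.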

Is there any fix for this situation? Or is there any update plan for the MUSE dataset?

Thanks in advance

MaverickRen commented 8 months ago

I'm sorry. I will track down the bugs in this version of the published data as soon as possible.

jdg900 commented 7 months ago

> I'm sorry, I will find out the bugs in this version of the published data as soon as possible

Did you upload the updated version of the data?

MaverickRen commented 7 months ago

Sorry, I have figured out what the problem is. In a very small number of question-answer pairs, the text content does not align with the number of answer labels. I have fixed this issue in the data loading code.
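For readers who want to screen their own copy of the data, a check of this kind could look roughly like the sketch below. It is not the actual fix in the repository: the `[SEG]` placeholder, the `answer` and `masks` field names, and the function name are all assumptions made for illustration, and it assumes one placeholder per target.

```python
SEG_TOKEN = "[SEG]"  # assumed placeholder; the real token(s) may differ


def split_consistent_pairs(qa_pairs):
    """Split QA pairs into consistent and inconsistent ones.

    A pair counts as consistent when the number of segmentation
    placeholders in its answer text equals its number of masks.
    """
    kept, dropped = [], []
    for pair in qa_pairs:
        n_tokens = pair["answer"].count(SEG_TOKEN)
        n_masks = len(pair["masks"])
        if n_tokens == n_masks:
            kept.append(pair)
        else:
            dropped.append(pair)
    return kept, dropped


# Hypothetical records: the second one has 2 placeholders but 3 masks.
qa_pairs = [
    {"answer": "The cup [SEG] is on the table [SEG].", "masks": ["m0", "m1"]},
    {"answer": "A teddy bear [SEG] sits on a log [SEG].", "masks": ["m0", "m1", "m2"]},
]
kept, dropped = split_consistent_pairs(qa_pairs)
print(len(kept), len(dropped))
```

Returning the dropped pairs as well makes it easy to log how many samples are affected before deciding whether to skip or repair them.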

GaoXiaoshan commented 4 months ago

Have you solved the problem? I still run into it even with the updated code.