yjhdhr closed 1 month ago
Is the visual module taken from siglip-so400m-14-980-flash-attn2-navit? The original SigLIP supports a maximum resolution of 980, so why does MiniCPM-V 2.5 only support up to 448 for a single block?
What are the negative effects of training or running inference with a single block of up to 980?
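That would be inconsistent with our pretraining, so you would see some out-of-domain behavior, and the results may be worse than using 448 directly.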
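For a rough sense of how far a 980 block sits from a 448 block, here is a minimal sketch that counts ViT patches per block. It assumes the "14" in the checkpoint name is the patch size and that the block is square. `patch_grid` is a hypothetical helper written for this illustration, not part of the MiniCPM-V or SigLIP code, and it does not model how the model actually slices or resamples images.

```python
def patch_grid(side_px: int, patch_px: int = 14) -> tuple[int, int]:
    """Return (patches per side, total patches) for a square block.

    patch_px=14 is an assumption read off the checkpoint name.
    """
    if side_px % patch_px != 0:
        raise ValueError(f"{side_px} is not a multiple of {patch_px}")
    per_side = side_px // patch_px
    return per_side, per_side * per_side


for side in (448, 980):
    per_side, total = patch_grid(side)
    print(f"{side}px -> {per_side}x{per_side} grid, {total} patches")
# 448px -> 32x32 grid, 1024 patches
# 980px -> 70x70 grid, 4900 patches
```

Under these assumptions a 980 block yields roughly 4.8 times as many patches as a 448 block, so the rest of the model would receive a much larger grid than the one it was pretrained on.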