Open MonolithFoundation opened 3 months ago
@MonolithFoundation Hi! I get the same problems. In particular, on MMbench, MME, SEEDBench, ScienceQA, AI2D datasets.
Hello, we have selectively extracted several subsets from it. Currently, these provide a certain degree of benefit, though the claims about the full distribution are entirely Cambrian's. Large amounts of data can enhance performance to some extent, but the training time is excessively long.
Following the same model setup and training steps, differing only in the data, Cambrian-7M with system prompt data gave bad results (I mean very bad: the model almost failed to talk, reasoning ability was also poor, and all metrics collapsed).
Any reason for this?