Luodian / Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
https://otter-ntu.github.io/
MIT License

Some questions about the proposed Sythus #202

Open youthhoo opened 1 year ago

youthhoo commented 1 year ago

The sentences "During the cold-start stage, in-context examples are collected by prompting ChatGPT solely through system messages and visual annotations, employing a heuristic approach. This stage concludes only when satisfactory in-context examples are identified." in Sec. 3.2 mention a key point of Sythus, i.e., a heuristic approach. Could more details of the heuristic approach be described? What's more, I wonder how you judge whether in-context examples are satisfactory: manually or automatically?

Luodian commented 1 year ago

Actually, what counts as a satisfactory example varies task by task. We don't have a concrete rule; instead, we involve human experts to spot them in ChatGPT's generations.

This section means you need to iterate the generation process and select some generated examples identified as good to serve as in-context examples (sometimes you may manually edit them to make them better).

This process usually takes 2-3 iterations. In the first iteration we typically generate 100 examples, then 500 in the second and 1k in the last round.

If the final 1k examples are good, the system message and in-context examples are fixed for the rest of the generation.
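The iterative cold-start loop described above can be sketched in Python. This is a hypothetical illustration only: `generate_batch` and `human_review` are stubs standing in for the real ChatGPT calls and the manual expert review, and all names here are invented for the sketch.

```python
def generate_batch(system_message, in_context_examples, n):
    """Stub: in the real pipeline this would prompt ChatGPT with the
    system message plus the current in-context examples, n times."""
    return [f"example-{i}" for i in range(n)]

def human_review(examples):
    """Stub: in the real pipeline human experts filter the generations;
    here we just drop every 10th example as a toy stand-in."""
    return [ex for i, ex in enumerate(examples) if i % 10 != 0]

def cold_start(system_message, seed_examples, rounds=(100, 500, 1000)):
    # Start from a handful (3-4) of handwritten in-context examples.
    in_context = list(seed_examples)
    for n in rounds:  # e.g. 100, then 500, then 1k examples per round
        batch = generate_batch(system_message, in_context, n)
        good = human_review(batch)
        # Refresh the in-context pool from the good generations,
        # keeping it small (still only 3-4 examples).
        in_context = good[:4]
    # Once the final round looks good, both are fixed for the rest
    # of the large-scale generation.
    return system_message, in_context

msg, examples = cold_start("You are a visual assistant.",
                           ["seed-1", "seed-2", "seed-3"])
print(len(examples))  # → 4
```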

youthhoo commented 1 year ago

Thanks a lot for your reply. I further want to know: how many examples identified as good are needed to conclude that the final 1k examples are good? Is a threshold set, e.g., 80%, 70%, or something else?

Luodian commented 1 year ago

We need to make sure all 1k examples are free of misinformation and bad patterns (sometimes an answer would read like "according to description xxxx", which is not desired; we need it to read "according to the observation", which would be a correctly generated answer). In this step, we need the 1k examples to be 100% correct.
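The "bad pattern" check described here was done manually, but a quick automated pre-filter for that specific phrasing could be sketched as follows (hypothetical; the pattern and function names are invented for illustration):

```python
import re

# Flag answers that reference the hidden text annotations ("description")
# instead of the visual content ("observation") -- the undesired pattern
# mentioned in the comment above.
BAD_PATTERN = re.compile(r"according to (?:the )?description", re.IGNORECASE)

def flag_bad_answers(answers):
    """Return only the answers containing the undesired phrasing."""
    return [a for a in answers if BAD_PATTERN.search(a)]

answers = [
    "According to the description, the man is cooking.",
    "According to the observation, the man is cooking.",
]
print(flag_bad_answers(answers))  # only the first answer is flagged
```

Such a filter could only shrink the manual workload; the 100%-correctness requirement would still need human verification.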

As for "good" (the questions and answers are non-trivial, and the conversation style matches our expectation), we don't have a threshold. We have two students take charge of the 1k examples to make sure they are "good".

xjtupanda commented 1 year ago

> Actually, what counts as a satisfactory example varies task by task. We don't have a concrete rule; instead, we involve human experts to spot them in ChatGPT's generations.
>
> This section means you need to iterate the generation process and select some generated examples identified as good to serve as in-context examples (sometimes you may manually edit them to make them better).
>
> This process usually takes 2-3 iterations. In the first iteration we typically generate 100 examples, then 500 in the second and 1k in the last round.
>
> If the final 1k examples are good, the system message and in-context examples are fixed for the rest of the generation.

@Luodian Do you mean the in-context examples grow from 3 handwritten ones, then iteratively to 100, 500, and 1k automatically generated samples? Should we replace the in-context samples in the prompts folder each round? Thank you!

Luodian commented 1 year ago

The in-context examples are initially handwritten, and later ones are selected from the good generated examples (but there will still be only 3-4 in-context examples).
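Putting the thread together, the prompt for each generation call might be assembled from the fixed system message plus the 3-4 selected in-context pairs. A minimal sketch, assuming a ChatGPT-style `messages` list (the function and field names here are illustrative, not the repo's actual code):

```python
def build_messages(system_message, in_context, new_annotation):
    """Assemble a chat prompt: system message, then each in-context
    (annotation, generated QA) pair as a user/assistant turn, then the
    new visual annotation to be converted."""
    messages = [{"role": "system", "content": system_message}]
    for annotation, qa in in_context:  # 3-4 selected good examples
        messages.append({"role": "user", "content": annotation})
        messages.append({"role": "assistant", "content": qa})
    messages.append({"role": "user", "content": new_annotation})
    return messages

msgs = build_messages(
    "Generate instruction-response pairs from visual annotations.",
    [("a dog on a sofa", "Q: Where is the dog? A: On the sofa.")],
    "two children playing soccer",
)
print(len(msgs))  # system + 2 per example + final user turn → 4
```

Under this reading, "replacing the in-context samples each round" just means swapping the pairs passed as `in_context` while the system message stays fixed.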