tingxueronghua / ChartLlama-code

MIT License

Could you share the specific training details? #11

Closed: bingwork closed this issue 9 months ago

bingwork commented 9 months ago

Hello! There are some specific training details that I still don't understand after reading the paper two or three times. If possible, could you explain them more concretely?

  1. What data exactly was used for training? Was it only the ChartLlama dataset?
  2. Which stage did the training cover? Only stage 2 visual instruction tuning with LoRA, or was stage 1 trained as well?
  3. How many training epochs were used? One, or three?

Thanks! @tingxueronghua

tingxueronghua commented 9 months ago

Hello!

  1. We only used the ChartLlama dataset.
  2. You can refer to LLaVA's official script: https://github.com/haotian-liu/LLaVA/blob/main/scripts/v1_5/finetune_task_lora.sh. At the time we trained, LLaVA-1.5 did not yet officially support this functionality, so we implemented it ourselves; afterwards I checked against the git commit history and confirmed that our modifications were complete and correct. A sketch adapted from that script is shown below.
  3. Three epochs. However, the early LLaVA code seemed to have a minor bug; with the latest script, training for about one epoch should be enough.
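For concreteness, here is a minimal sketch of how such a stage-2 LoRA fine-tuning run might be launched, adapted from the finetune_task_lora.sh script linked above. The LoRA and optimizer hyperparameters mirror that script; the ChartLlama data path, image folder, and output directory are placeholder assumptions, not the authors' released configuration.

```bash
#!/bin/bash
# Sketch of a stage-2 visual instruction tuning run with LoRA, adapted
# from LLaVA's scripts/v1_5/finetune_task_lora.sh. The --data_path,
# --image_folder, and --output_dir values below are assumed placeholders.
# num_train_epochs is 1, per the note that one epoch suffices with the
# fixed script (the original ChartLlama run used three).

deepspeed llava/train/train_mem.py \
    --lora_enable True --lora_r 128 --lora_alpha 256 --mm_projector_lr 2e-5 \
    --deepspeed ./scripts/zero3.json \
    --model_name_or_path liuhaotian/llava-v1.5-13b \
    --version v1 \
    --data_path ./playground/data/chartllama_instructions.json \
    --image_folder ./playground/data/chartllama_images \
    --vision_tower openai/clip-vit-large-patch14-336 \
    --mm_projector_type mlp2x_gelu \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --image_aspect_ratio pad \
    --group_by_modality_length True \
    --bf16 True \
    --output_dir ./checkpoints/llava-v1.5-13b-chartllama-lora \
    --num_train_epochs 1 \
    --per_device_train_batch_size 16 \
    --gradient_accumulation_steps 1 \
    --learning_rate 2e-4 \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --lazy_preprocess True
```

Note that this only trains the LoRA adapters plus the multimodal projector (via --mm_projector_lr); the vision tower and base language model weights stay frozen, which is what distinguishes this stage-2 recipe from full fine-tuning.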
