Open3DA / LL3DA

[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.
https://ll3da.github.io/
MIT License

Dense captioning (densecap) task #24

Closed — xjj1999 closed this 3 months ago

xjj1999 commented 3 months ago

During training, did the authors observe that for different validation samples in the dense captioning task, the LLM outputs answers in a fixed format with nearly identical content?

ch3cook-fdu commented 3 months ago

I do not quite understand your question. Could you provide me with some examples?

xjj1999 commented 3 months ago

[screenshot: 20240725-185406]

xjj1999 commented 3 months ago

For example, after training for 10 epochs, the validation results are as shown in the screenshot: the LLM's outputs (response_pred) for different questions are essentially identical. Only part is shown here, but almost the entire validation set produces answers with this same format and content.

ch3cook-fdu commented 3 months ago

If you are training with ScanRefer data only, it might be normal.

xjj1999 commented 3 months ago

Thanks! I did indeed use only the ScanRefer dataset. I will try joint training on multiple datasets.

xjj1999 commented 3 months ago

May I ask if this phenomenon disappears naturally during joint training on multiple datasets?

ch3cook-fdu commented 3 months ago

If you feed in diverse data from more tasks, this might be alleviated.
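Joint training of this kind amounts to training on a pooled, shuffled mixture of task-specific datasets, so each batch draws from several tasks rather than only dense captioning. A minimal sketch of the idea, using hypothetical toy data rather than LL3DA's actual dataloader:

```python
import random

def mix_task_datasets(datasets, seed=0):
    """Pool samples from several task-specific datasets and shuffle them,
    so every training batch mixes tasks (dense captioning, QA, ...)."""
    pool = [(task, sample)
            for task, samples in datasets.items()
            for sample in samples]
    random.Random(seed).shuffle(pool)
    return pool

# Hypothetical toy datasets standing in for the real annotation files.
mixed = mix_task_datasets({
    "scanrefer_densecap": ["desc_0", "desc_1", "desc_2"],
    "scanqa":             ["qa_0", "qa_1"],
})
print(len(mixed))  # all 5 samples, shuffled across tasks
```

With only one task in the pool, every batch looks alike and the model can collapse onto a single output template; mixing tasks breaks that uniformity.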

xjj1999 commented 3 months ago

Could you share the curves of the validation metrics during training for the dense captioning task?

ch3cook-fdu commented 3 months ago

Here is the log for ScanRefer fine-tuning: scanrefer-opt-1.3b-logger.log

xjj1999 commented 3 months ago

How is unified_scanrefer implemented? From the log, batch = 16 and one epoch is 27504 iterations, which is far more than the number of samples in the ScanRefer dataset.

ch3cook-fdu commented 3 months ago

27504 is the total iterations for 12 epochs.
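As a quick sanity check (assuming the batch size of 16 from the log, and that ScanRefer's training split contains on the order of 36k descriptions), the numbers are self-consistent:

```python
# Sanity check on the logged numbers (assumed: batch size 16, 12 epochs).
total_iters = 27504
epochs = 12
batch_size = 16

iters_per_epoch = total_iters // epochs            # iterations per epoch
samples_per_epoch = iters_per_epoch * batch_size   # samples seen per epoch

print(iters_per_epoch, samples_per_epoch)  # 2292 36672
```

So 27504 is not the length of one epoch but the total iteration count, and each epoch covers roughly one pass over the ScanRefer training descriptions.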

xjj1999 commented 3 months ago

Thanks!

xjj1999 commented 3 months ago

One more question: from the log, the model already achieves decent dense captioning performance during stage-1 training. Could you share the validation metrics from stage-1 training? Or roughly how many iterations does the model need before it acquires this capability?

ch3cook-fdu commented 3 months ago

We report the corresponding metrics in Table 5 of our paper. Please follow the pre-training guide provided in the README.

xjj1999 commented 3 months ago

Thanks for the answers; my question is resolved. I would like to go further and reproduce the Open-Vocabulary experiments from the supplementary material. Could you share the fine-tuned ovdet weights?