cooelf / Auto-GUI

Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024)
https://arxiv.org/abs/2309.11436
Apache License 2.0

Which version of the BLIP2 model is used to extract image features? #15

Closed (xukefaker closed this issue 3 months ago)

xukefaker commented 3 months ago

The paper describes blip2_t5_instruct, but the default in the code is blip2-opt-2.7b.

cooelf commented 3 months ago

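Hello, and thanks for pointing this out. After checking, we found that the description in the paper was based on outdated information left over from earlier iterations of the script. Please treat the latest code as authoritative; we will update the corresponding part of the paper. Thanks!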
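
For readers who want to reproduce the features with the code default, here is a minimal sketch. It is not taken from the repository. It assumes that "blip2-opt-2.7b" refers to the Hugging Face Hub checkpoint `Salesforce/blip2-opt-2.7b`, that the `transformers`, `torch`, and `Pillow` packages are installed, and that the pooled output of the vision encoder is the feature of interest. The helper names `hub_id` and `extract_image_features` are hypothetical; check the repository's own feature-extraction script for the exact procedure.

```python
# Assumption: the code default "blip2-opt-2.7b" maps to this Hub checkpoint.
DEFAULT_CHECKPOINT = "Salesforce/blip2-opt-2.7b"


def hub_id(short_name: str, org: str = "Salesforce") -> str:
    """Hypothetical helper: turn a short model name into a Hub ID."""
    return short_name if "/" in short_name else f"{org}/{short_name}"


def extract_image_features(image_path: str, checkpoint: str = DEFAULT_CHECKPOINT):
    """Return one pooled feature vector from the BLIP-2 vision encoder."""
    # Heavy imports stay local so this module imports without them installed.
    import torch
    from PIL import Image
    from transformers import AutoProcessor, Blip2Model

    processor = AutoProcessor.from_pretrained(checkpoint)
    model = Blip2Model.from_pretrained(checkpoint)
    model.eval()

    # Preprocess a single RGB image into the pixel tensor the encoder expects.
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # Run only the vision encoder; the language model is not needed here.
    with torch.no_grad():
        vision_out = model.vision_model(pixel_values=inputs["pixel_values"])

    # pooler_output has shape (batch, hidden); take the single image's vector.
    return vision_out.pooler_output[0]
```

Calling `extract_image_features("screenshot.png")` (the file name is a placeholder) downloads the checkpoint on first use, which is several gigabytes. Features extracted with a different checkpoint, such as the instruct variant named in the paper, will generally not be interchangeable with these.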