cooelf
/
Auto-GUI
Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024)
https://arxiv.org/abs/2309.11436
Apache License 2.0
160
stars
14
forks
issues
Newest
test
#17
SnowKnth
opened
2 weeks ago
0
grad_norm=nan, loss=0.0
#16
xukefaker
closed
3 weeks ago
1
Which version of the BLIP2 model is used to extract image features?
#15
xukefaker
closed
3 weeks ago
1
Possible typo in the arXiv paper?
#14
1270645409
opened
1 month ago
1
Incorrect Scroll map
#13
neerajanand321
closed
3 weeks ago
1
How to deploy this model on sagemaker?
#12
kirtishrinkhala
opened
6 months ago
0
Unable to run the model
#11
kirtishrinkhala
opened
6 months ago
1
The loss of the base model during training does not decrease
#10
shimurenhlq
opened
7 months ago
3
What's the blip-2 feature extractor details?
#9
truebit
closed
8 months ago
1
any inference code or something to check the model
#8
Occupying-Mars
opened
8 months ago
6
Any demo to inference with mobile screenshot with prompt?
#7
truebit
closed
8 months ago
2
click accuracy and scroll accuracy
#6
njucckevin
opened
9 months ago
4
Inquiry about the train/dev/test split
#5
Yangyi-Chen
closed
9 months ago
2
Incompatible action space with AitW
#4
Jiayi-Pan
closed
10 months ago
3
Hello Great work Guys!
#3
AsadMir10
closed
10 months ago
4
The single_parsed_episode_t5_blip data used for inference was not found in the BLIP dataset at the provided link
#2
jiayolo
opened
10 months ago
1
Link for Dataset and trained models is not working.
#1
Asaad-Pak
opened
10 months ago
5