Yifan-Song793/ETO
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)
https://arxiv.org/abs/2403.02502
88 stars
9 forks
issues
DPO formula question
#9
nighty8
closed
1 month ago
1
Average reward of gpt-3.5-turbo
#8
George-Chia
opened
3 months ago
0
Issue About Constructing Preference Data (Webshop)
#7
xuehui1991
closed
1 month ago
7
How to Run the evaluation?
#6
George-Chia
closed
3 months ago
2
ALFWorld positive and negative samples
#5
swt-user
closed
4 months ago
3
questions about PPO
#4
yananchen1989
closed
4 months ago
2
Could you update the expert trajectories?
#3
hzy312
closed
6 months ago
0
How were the expert trajectories collected?
#2
Fu-Dayuan
closed
5 months ago
3
Performance with LoRA Finetuning
#1
Yu-Fangxu
closed
6 months ago
1