KhoomeiK/LlamaGym
Fine-tune LLM agents with online reinforcement learning
MIT License
994 stars, 44 forks
Fix batching, data formatting, action extraction, prompts; add wandb logging #1
Closed
KhoomeiK closed 8 months ago