Fix batching, data formatting, action extraction, prompts; add wandb logging

KhoomeiK / LlamaGym

Fine-tune LLM agents with online reinforcement learning

MIT License

994 stars, 44 forks

Fix batching, data formatting, action extraction, prompts; add wandb logging #1

Closed by KhoomeiK 8 months ago