sentient-engineering / agent-q

agent q - oss advanced reasoning and learning for autonomous ai agents
MIT License
322 stars 69 forks

Implemented Monte Carlo Search for generating DPO pairs #1

Closed (thebhulawat closed 1 month ago)

thebhulawat commented 1 month ago

Implemented Monte Carlo Search for generating DPO pairs