-
This issue is a feature! Append the raw text (copy and paste) of any automatically generated polis after this prompt.
![sji-tldr](https://github.com/user-attachments/assets/0d1237bb-1267-481…
-
### Which track are you doing?
DL
### What's the problem you are trying to solve?
Do you have material that's hard to understand? A difficult subject at school, a legal contract or a medical docum…
-
Hi! Thanks for the great work on OS-agent. I noticed that the evaluation is done automatically by LLMs. Are there any tasks that can be evaluated without LLMs, i.e. with a stable evaluation?
-
Hello Andrew, thank you for your excellent contributions to the translation agent! I am reaching out to highlight our recent work, "**TEaR: Improving LLM-based Machine Translation with Systematic S…
-
## Overview
We need a way to evaluate how accurate the chatbot workflow is. We will focus on how well the LLM agent is following its coded prompt instructions.
## Details
* Create a small set of test…
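
One way to get started could look like the sketch below. It is only an illustration under assumptions: `TestCase`, `evaluate`, `fake_agent` and the three example checks are hypothetical names and rules, not part of the existing chatbot code, and the real agent call would replace `fake_agent`. Each test pairs a user message with a deterministic check on the reply, and the score is the fraction of checks that pass.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TestCase:
    """One hypothetical test: a user message plus a deterministic check on the reply."""
    name: str
    user_message: str
    check: Callable[[str], bool]


def evaluate(agent: Callable[[str], str], cases: List[TestCase]) -> float:
    """Return the fraction of test cases whose reply passes its check."""
    if not cases:
        return 0.0
    passed = sum(1 for case in cases if case.check(agent(case.user_message)))
    return passed / len(cases)


def fake_agent(message: str) -> str:
    """Stand-in for the real chatbot (assumption: the agent maps text to text)."""
    if "refund" in message.lower():
        return "I'm sorry, I can't help with refunds. Please contact support."
    return "Hello! How can I help you today?"


# Example rules a prompt might encode; these are made up for illustration.
cases = [
    TestCase("greets_user", "hi", lambda r: r.lower().startswith("hello")),
    TestCase("redirects_refunds", "I want a refund", lambda r: "contact support" in r.lower()),
    TestCase("stays_short", "hi", lambda r: len(r) < 200),
]

accuracy = evaluate(fake_agent, cases)
print(f"instruction-following accuracy: {accuracy:.2f}")
```

Because the checks are plain functions, the score is reproducible across runs for a fixed agent output, and new prompt rules can be covered by appending more cases.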
-
-
1. llmware-ai/[llmware](https://github.com/llmware-ai/llmware): Unified framework for building enterprise RAG pipelines with small, specialized models (github.com)
2. https:[/](https://github.com/ll…
-
So I'm the co-maintainer of [ragas](https://github.com/explodinggradients/ragas), an evaluation tool for LLM pipelines, and a contributor to llama_index, langchain and a few ML tools. We have been seein…
-
* #6964
## Discussion
Before we can start managing the workflow of the agent, we have to give it some more structure. The different ways to implement this can generally be subdivided into 3 groups…
-
If we follow the script settings of long-llm, where the parameter num_train_epoch is set to 1, we get a really significant improvement over the original model. However, if we change the parameter to…