-
1. llmware-ai/[llmware](https://github.com/llmware-ai/llmware): Unified framework for building enterprise RAG pipelines with small, specialized models (github.com)
2. https://github.com/ll…
-
We have a new project involving multilingual retrieval and reproduction, and we are looking for 2 URA students to work with us.
Feel free to reach out on Slack or email us at nandant@gmail.com, xzh…
-
Today, the [Python Evaluation building block](https://aka.ms/azai/eval) can be used against a .NET backend that uses the Chat Protocol (Azure Search supports this). However, we know from customer feed…
-
### Question Validation
- [X] I have searched both the documentation and Discord for an answer.
### Question
I am trying to use the built-in capabilities of llamaindex to evaluate the correctness o…
-
Great work, and thanks for the codebase!
I would like to know the exact details of the LoRA fine-tuning setup mentioned in Table 6 of the main paper.
Also, could you point me to the bash script to reproduce…
-
Example command:
```
python benchmark_throughput.py --model gpt2 --input-len 256 --output-len 256
```
Output:
```
Namespace(backend='vllm', dataset=None, input_len=256, output_len=256, model='gpt…
```
-
**Is your feature request related to a problem? Please describe.**
As of now, Haystack's evaluators that extend LLMEvaluator only support OpenAI. I would like support through llama.cpp to be add…
-
Hello,
Thank you for sharing such an excellent dataset.
The evaluation of Korean models is always a challenging topic, and the information you have provided is greatly beneficial for the develop…
-
Thank you again for your excellent work. I have trained an mT0 model on my own dataset, and it performs well. Now I am attempting to train a bloomz model, but I'm encountering an issue where the trai…
-
I would like to re-open issue #104
There's an overuse of exact matches in the eval harness. For example, consider task 649:
```
"intent": "Post in history subreddit about what could diffusion…
```
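To illustrate why exact-match scoring is brittle for free-form intents like the one above, here is a minimal sketch (hypothetical, using only the Python standard library, and not the harness's actual scoring code) comparing exact matching against a simple normalized comparison:

```python
import re

def exact_match(pred: str, ref: str) -> bool:
    # Strict string equality: fails on any surface difference.
    return pred == ref

def normalized_match(pred: str, ref: str) -> bool:
    # Lowercase, strip punctuation, and compare token sequences,
    # so trivial case/punctuation differences no longer count as misses.
    def norm(s: str) -> list[str]:
        return re.sub(r"[^a-z0-9 ]", "", s.lower()).split()
    return norm(pred) == norm(ref)

pred = "Post in the History subreddit"
ref = "post in the history subreddit."
print(exact_match(pred, ref))       # False: case and punctuation differ
print(normalized_match(pred, ref))  # True: same tokens after normalization
```

Semantically equivalent answers that differ only in surface form are penalized by the first check but accepted by the second, which is the kind of mismatch the issue is pointing at.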