-
I used torch.profiler.profile() to profile Mixtral running on vLLM, and I found large blank gaps before each running step.
![S85Z22{PW)GZ0(E)4AH4AF1](https://uploads.linear.app/342cff15-f40f-4cf7-8bee-343d2…
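For reference, a minimal sketch of the profiling setup described above (a toy model stands in for the vLLM/Mixtral step loop; idle time between steps shows up as gaps in the trace timeline):

```python
# Minimal sketch: profile a few forward steps with torch.profiler and
# summarize recorded ops. Wall time not attributed to any op appears as
# blank gaps in the exported timeline, like the ones reported above.
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(16, 16)
x = torch.randn(4, 16)

with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    for _ in range(3):
        y = model(x)

# Aggregate per-op statistics; gaps between steps are the time not covered
# by these entries.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```

Exporting a Chrome trace (`prof.export_chrome_trace("trace.json")`) makes the per-step gaps visible on a timeline.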
-
### What happened?
llama.cpp produces garbled output when running QWen2.5-7b-f16.gg on the 310P3.
### Name and Version
```shell
./build/bin/llama-cli -m Qwen2.5-7b-f16.gguf -p "who are you" -ngl 32 -fa
```
### What operating system are you seeing the …
-
**Describe the bug**
When a provider is explicitly set on `defaultTest` or via the `--grader` command-line flag, some assertions fail intermittently with the error message "Could not extract JSON from…
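An illustrative sketch of why "Could not extract JSON from ..." style errors typically occur: the grader model wraps its JSON verdict in prose or code fences, so a naive parse of the raw completion fails unless the JSON is extracted first. This is not promptfoo's implementation, just a demonstration of the failure mode and a common workaround:

```python
# Illustrative sketch: extract a JSON verdict from a grader completion that
# may wrap it in prose or markdown fences. Hypothetical helper, not the
# promptfoo code path.
import json
import re

def extract_json(text: str):
    # Try parsing the whole string first, then fall back to the outermost
    # {...} span (greedy: first "{" to last "}").
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", text, re.DOTALL)
        if match:
            return json.loads(match.group(0))
        raise

raw = 'Sure! Here is my verdict:\n```json\n{"pass": true, "score": 1.0}\n```'
print(extract_json(raw))
```

A grader that answers with any prose at all (e.g. "Sure! Here is my verdict:") will break a bare `json.loads`, which matches the intermittent nature of the failures reported above.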
-
### Your current environment
vllm version: 0.6.3.post1
### Model Input Dumps
_No response_
### 🐛 Describe the bug
I see on the official Gemma page: https://huggingface.co/google/gemma-2b, cont…
-
- [ ] I checked the [documentation](https://docs.ragas.io/) and related resources and couldn't find an answer to my question.
**Your Question**
`faithfulness_score` is always NaN.
**Code Examples**…
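A rough sketch of how a faithfulness-style score can come out as NaN. This is not the ragas implementation, just the shape of the metric: roughly the fraction of answer statements supported by the retrieved context, so if statement extraction yields nothing, the ratio is 0/0:

```python
# Illustrative only: faithfulness as supported/total statements. When the
# LLM step extracts zero statements from the answer, the denominator is 0
# and the score degenerates to NaN.
import math

def faithfulness(supported: int, total: int) -> float:
    return supported / total if total > 0 else float("nan")

print(faithfulness(3, 4))                 # 0.75
print(math.isnan(faithfulness(0, 0)))     # True
```

If the score is always NaN, a likely culprit is the statement-extraction step failing for every row (e.g. unparsable LLM output), which is worth checking in the logs.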
-
Hi,
Thank you for sharing your impressive work! Equipping LLMs with temporal understanding is indeed a challenging task. I have a question regarding the ActivityNet results:
Are the scores you r…
-
Thank you very much for doing such great open-source work!
I tried:
CUDA_VISIBLE_DEVICES=X bash scripts/evaluate.sh PATH_OR_NAME_TO_BASE_MODEL PATH_TO_SAVE_TUNE_MODEL PATH_TO_PRUNE_MODEL EPOCHS_YOU…
-
It would be neat to give the LLM the ability to interact with the current web page. User should be able to describe some page interaction and the LLM executes it. This is likely some combination of …
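One way such a feature could be shaped: the LLM emits a small structured action, and a dispatcher maps it onto page operations. Everything below is a hypothetical sketch with invented names; a real version would drive a browser through a DOM-automation layer:

```python
# Hypothetical sketch: a minimal action schema the LLM could emit, plus a
# dispatcher. The "page" here is just a log standing in for real browser
# automation.
from dataclasses import dataclass

@dataclass
class PageAction:
    kind: str        # e.g. "click", "type"
    target: str      # CSS selector or element description
    value: str = ""  # text to type, etc.

def execute(action: PageAction, page_log: list) -> None:
    # Stand-in for real browser automation: record what would be done.
    if action.kind == "click":
        page_log.append(f"click {action.target}")
    elif action.kind == "type":
        page_log.append(f"type {action.value!r} into {action.target}")
    else:
        page_log.append(f"unsupported action: {action.kind}")

log = []
execute(PageAction(kind="click", target="#submit"), log)
execute(PageAction(kind="type", target="input[name=q]", value="hello"), log)
print(log)
```

Constraining the model to a small action vocabulary like this keeps the executor auditable, versus letting it emit arbitrary scripts.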
-
@haileyschoelkopf @lintangsutawika @baberabb
The following is a list of TODOs to implement LLM-as-a-Judge in Eval-Harness:
**TLDR**
* Splits existing `evaluate` function into `classification_e…