-
## Title: LIME-M: A "Less Is More" Approach to Evaluating Large Language Models
## Link: https://arxiv.org/abs/2409.06851
## Abstract:
With the remarkable success of multimodal large language models (MLLMs), numerous benchmarks have been designed to evaluate MLLMs' capabilities on image perception tasks (e.g., image captioning, image question answering) and to guide their development…
-
I am attempting to use FlexFlow to compare its inference speed against vLLM, but FlexFlow appears to be an order of magnitude slower than vLLM, and I've been running into many errors. Testing on a Linux ser…
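For reference, the vLLM side of the comparison can be checked with a minimal throughput script like the sketch below; the model name, prompt set, and sampling settings are placeholders, and the timing only wraps the generate call, so it is a rough sanity check rather than a rigorous benchmark.

```python
# Rough vLLM throughput sketch; model name, prompts, and sampling settings are placeholders.
import time
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-7b-hf")          # placeholder model
params = SamplingParams(temperature=0.0, max_tokens=128)
prompts = ["Explain the difference between latency and throughput."] * 32

start = time.perf_counter()
outputs = llm.generate(prompts, params)
elapsed = time.perf_counter() - start

# Count generated tokens across all requests and report tokens/second.
generated = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"{generated} tokens in {elapsed:.1f}s -> {generated / elapsed:.1f} tok/s")
```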
-
Hi,
I'm currently trying to replicate the performance of Qwen2-Audio on the AIR Bench. However, I noticed that the repository at [AIR-Bench](https://github.com/OFA-Sys/AIR-Bench/blob/main/score_cha…
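For context, a minimal single-sample inference sketch with the Hugging Face transformers integration of Qwen2-Audio is shown below; the audio path and prompt are placeholders, and this is only a sanity check that the model runs, not the AIR-Bench scoring pipeline itself.

```python
# Minimal Qwen2-Audio inference sketch (placeholder audio path and prompt);
# this is not the AIR-Bench scoring pipeline, just a check that generation works.
import librosa
from transformers import AutoProcessor, Qwen2AudioForConditionalGeneration

model_id = "Qwen/Qwen2-Audio-7B-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2AudioForConditionalGeneration.from_pretrained(model_id, device_map="auto")

conversation = [{"role": "user", "content": [
    {"type": "audio", "audio_url": "sample.wav"},           # placeholder file
    {"type": "text", "text": "Describe this audio clip."},
]}]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True, tokenize=False)
audio, _ = librosa.load("sample.wav", sr=processor.feature_extractor.sampling_rate)

inputs = processor(text=prompt, audios=[audio], return_tensors="pt", padding=True).to(model.device)
generated = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt portion.
answer = processor.batch_decode(generated[:, inputs.input_ids.shape[1]:], skip_special_tokens=True)[0]
print(answer)
```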
-
When running `xtuner train /root/autodl-tmp/ft/config/internlm2_chat_7b_qlora_alpaca_e3_copy.py --work-dir /root/autodl-tmp/ft/train`:
`[2024-05-30 17:18:47,089] [INFO] [real_accelerator.py:203:get_accelerator] S…
-
[x] I have checked the [documentation](https://docs.ragas.io/) and related resources and couldn't resolve my bug.
**Describe the bug**
The LLM is served by Ollama, so there's no connection issue, and …
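For completeness, a minimal sketch of how an Ollama-served model can be wired into ragas is shown below, assuming ragas' evaluate() accepts a LangChain chat model and embeddings through its llm and embeddings arguments; the model names and the toy sample are placeholders.

```python
# Minimal ragas-with-Ollama sketch; assumes evaluate() accepts LangChain objects via
# its llm / embeddings arguments. The dataset below is a toy placeholder.
from datasets import Dataset
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

data = Dataset.from_dict({
    "question": ["What is the capital of France?"],
    "answer": ["Paris is the capital of France."],
    "contexts": [["Paris is the capital and largest city of France."]],
})

result = evaluate(
    data,
    metrics=[faithfulness, answer_relevancy],
    llm=ChatOllama(model="llama3"),                        # placeholder Ollama model
    embeddings=OllamaEmbeddings(model="nomic-embed-text"), # placeholder embedding model
)
print(result)
```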
-
I have tested the inference speed and memory usage of Qwen1.5-14b on my machine using the example in ipex-llm. The peak CPU usage to load Qwen1.5-14b in 4-bit is about 24 GB. The peak GPU usage is abou…
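For reference, the loading path I'm describing follows the standard ipex-llm 4-bit recipe, roughly as sketched below; the model path is a placeholder and moving the model to "xpu" assumes an Intel GPU with the oneAPI runtime installed.

```python
# Standard ipex-llm 4-bit loading sketch; model path is a placeholder and
# the .to("xpu") call assumes an Intel GPU is available.
import torch
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/Qwen1.5-14B-Chat"            # placeholder
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_4bit=True,      # weights are quantized to 4-bit while loading
    optimize_model=True,
    trust_remote_code=True,
)
model = model.to("xpu")

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
inputs = tokenizer("Give me a one-sentence summary of Qwen1.5.", return_tensors="pt").to("xpu")
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```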
-
## Objective
Develop an intelligent sampling algorithm that can extract representative content from entire documents or collections of documents, ensuring balanced representation in validation prompt…
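One possible direction is sketched below, under the assumption that "representative" can be approximated by clustering chunk embeddings and picking the chunk closest to each cluster centroid; the embedding model and chunking scheme are placeholder choices, not a settled design.

```python
# Sketch of embedding-cluster sampling: chunk documents, embed chunks, cluster,
# and take the chunk nearest each centroid as a representative sample.
# The embedding model and chunk size are placeholder choices.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

def sample_representative_chunks(documents, n_samples=10, chunk_words=200):
    # Split every document into fixed-size word chunks.
    chunks = []
    for doc in documents:
        words = doc.split()
        chunks += [" ".join(words[i:i + chunk_words]) for i in range(0, len(words), chunk_words)]
    if len(chunks) <= n_samples:
        return chunks

    embedder = SentenceTransformer("all-MiniLM-L6-v2")     # placeholder embedding model
    embeddings = embedder.encode(chunks)

    # Cluster chunks and keep the one closest to each cluster centre.
    km = KMeans(n_clusters=n_samples, n_init="auto", random_state=0).fit(embeddings)
    picks = []
    for c in range(n_samples):
        idx = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(embeddings[idx] - km.cluster_centers_[c], axis=1)
        picks.append(chunks[idx[np.argmin(dists)]])
    return picks
```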
-
### Feature request
Build a generic script/pipeline that takes as input:
- Model name
- One or more recordings
Then the pipeline should:
- Build prompts from the events in each recording (see the sketch after this list).
…
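A rough skeleton of what the pipeline interface could look like is sketched below; the prompt-building helper and the JSON recording format are hypothetical assumptions for illustration, not part of any existing API.

```python
# Hypothetical pipeline skeleton: takes a model name and one or more recording files,
# turns the events in each recording into a prompt, and would then run it through the model.
# build_prompt_from_events() and the JSON recording layout are assumptions for illustration.
import argparse
import json

def build_prompt_from_events(events):
    # Assumed format: each event is a dict with "timestamp" and "description".
    lines = [f'[{e["timestamp"]}] {e["description"]}' for e in events]
    return "Summarize the following events:\n" + "\n".join(lines)

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-name", required=True)
    parser.add_argument("--recordings", nargs="+", required=True)
    args = parser.parse_args()

    for path in args.recordings:
        with open(path) as f:
            events = json.load(f)                 # assumed: a JSON list of events
        prompt = build_prompt_from_events(events)
        print(f"--- {path} (model: {args.model_name}) ---")
        print(prompt)                             # here the prompt would be sent to the model

if __name__ == "__main__":
    main()
```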
-
Add a section about testing LLMs; this is crucial.
-
## Overview
- We need your help to deploy a large language model on NVIDIA Jetson devices and allow people to use words to control the connections/interfaces on the board (see the sketch below this list).
- This is the preparation…
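As a starting point, a minimal sketch of mapping a model's text command onto a pin of the Jetson's GPIO header with the Jetson.GPIO library is shown below; the command vocabulary and the pin number are placeholder assumptions.

```python
# Minimal sketch: map a text command (as it might come from the LLM) onto a GPIO pin
# using the Jetson.GPIO library. The pin number and command words are placeholders.
import Jetson.GPIO as GPIO

LED_PIN = 12                                  # placeholder pin on the 40-pin header

def apply_command(command: str) -> None:
    """Drive the pin high or low based on a plain-word command."""
    GPIO.output(LED_PIN, GPIO.HIGH if "on" in command.lower() else GPIO.LOW)

GPIO.setmode(GPIO.BOARD)
GPIO.setup(LED_PIN, GPIO.OUT, initial=GPIO.LOW)
try:
    apply_command("turn the LED on")          # stand-in for the model's output
finally:
    GPIO.cleanup()
```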