-
In the `gpt4_game_top30k_results.json` file, there are 20067 attacker-win samples and 3287 defender-win samples, an att/def ratio of ~6.1.
However, after running SFT on the model using
```
torchrun --nproc_per_node…
```
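For reference, a minimal balancing sketch before SFT, assuming the JSON is a flat list of records with a `winner` field (my guess at the schema, not confirmed from the file):

```python
import json
import random

# Assumed schema: [{"winner": "attacker" | "defender", ...}, ...]
with open("gpt4_game_top30k_results.json") as f:
    samples = json.load(f)

att = [s for s in samples if s.get("winner") == "attacker"]
dfd = [s for s in samples if s.get("winner") == "defender"]

# Downsample the over-represented attacker-win class so att/def ≈ 1
# (20067 vs 3287 in the file, i.e. the ~6.1 ratio noted above).
random.seed(0)
balanced = random.sample(att, k=len(dfd)) + dfd
random.shuffle(balanced)
```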
-
It should be a minor extension, since it shares features with Gemma and Llama. It is also a powerful model in its own right (MMLU 71).
[paper](https://storage.googleapis.com/deepmind-media/gemma/gemma-2-report.pdf)
…
-
- [ ] [Best way to add knowledge to a llm : r/LocalLLaMA](https://www.reddit.com/r/LocalLLaMA/comments/1ao2bzu/best_way_to_add_knowledge_to_a_llm/)
# Best way to add knowledge to an LLM: r/LocalLLaMA…
-
Steps to reproduce:
Launch an MMLU evaluation on an instance with multiple GPUs. Run:
```
ilab model evaluate --model models/instructlab/granite-7b-lab --benchmark mmlu
```
Only one GPU is consumed. Adjusting batch…
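As a quick sanity check (my suggestion, not part of ilab), one can confirm how many GPUs the evaluation process can actually see:

```python
import torch

# If this prints > 1 while nvidia-smi shows only one busy GPU during the
# run, the bottleneck is in the harness, not in device visibility.
print(f"visible GPUs: {torch.cuda.device_count()}")
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))
```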
-
# Proposed Feature
Add an efficient interface for computing generation probabilities for fixed prompt and completion pairs. For example:
```python
# ... load LLM or engine
prompt_completion_pairs = [
…
```
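To make the requested semantics concrete, here is a rough sketch of the quantity such an interface would return, written against plain Hugging Face transformers rather than vLLM's internals (the model name is a placeholder, and the simple string concatenation glosses over tokenizer boundary effects):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def completion_logprob(prompt: str, completion: str) -> float:
    """Sum of log p(completion tokens | prompt) under the model."""
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    full_ids = tok(prompt + completion, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # logits[:, i] predicts token i + 1, so shift targets by one.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = full_ids[:, 1:]
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Keep only the positions that generate the completion tokens.
    n_prompt = prompt_ids.shape[1]
    return token_lp[:, n_prompt - 1:].sum().item()

print(completion_logprob("The capital of France is", " Paris"))
```

The feature request is essentially to expose this computation as a batched, engine-native call instead of one forward pass per pair.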
-
## model info
* base-model : baichuan-7b
* base-context-size : 4096
Has this phenomenon been observed in your experiments?
With a short context window: NTK > YaRN
![image](https://github.com/jquesn…
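For reference, a minimal sketch of the usual "NTK-aware" RoPE adjustment being compared against YaRN here (my paraphrase of the common recipe, not baichuan-7b's actual implementation):

```python
import torch

def ntk_rope_inv_freq(head_dim: int, scale: float, base: float = 10000.0):
    # Fixed NTK-aware rule: grow the RoPE base by scale**(d / (d - 2)) so
    # low-frequency dims stretch while high-frequency dims stay ~intact.
    new_base = base * scale ** (head_dim / (head_dim - 2))
    exponents = torch.arange(0, head_dim, 2).float() / head_dim
    return 1.0 / (new_base ** exponents)

# e.g. stretching a 4096-token window by 4x:
print(ntk_rope_inv_freq(head_dim=128, scale=4.0)[:4])
```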
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is there an existing answer for this in the FAQ?
-
Currently an MMLU run produces the following output:
```
# KNOWLEDGE EVALUATION REPORT
## BASE MODEL
/home/ec2-user/.cache/instructlab/models/instructlab/granite-7b-lab
## MODEL
/home/ec2-user/…
```
-
## 🐛 Bug
Hello team,
Thanks for creating such an amazing engine. I ran Llama-3-8B-Instruct-q4f16_1-MLC in server mode with different batch sizes (2-128), but I still see my requests being run …
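A hypothetical probe for this (endpoint and port are my assumptions about the default server address): fire N concurrent requests and compare wall time against a single request; with working batching, 8 concurrent requests should finish in roughly the time of one, while ~8x wall time suggests sequential execution.

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "http://127.0.0.1:8000/v1/chat/completions"  # assumed server address
PAYLOAD = {
    "model": "Llama-3-8B-Instruct-q4f16_1-MLC",
    "messages": [{"role": "user", "content": "Say hi."}],
    "max_tokens": 32,
}

def one_request(_):
    return requests.post(URL, json=PAYLOAD, timeout=120).status_code

for n in (1, 8):
    start = time.time()
    with ThreadPoolExecutor(max_workers=n) as pool:
        codes = list(pool.map(one_request, range(n)))
    print(f"{n} concurrent -> {codes}, {time.time() - start:.1f}s")
```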