gpt-evaluation Search Results

1000+ results for gpt-evaluation


mlcommons/inference #1385

GPT-J: evaluation.py is not deterministic

We found that [evaluation.py](https://github.com/mlcommons/inference/blob/master/language/gpt-j/evaluation.py) is not deterministic. I narrowed it down to a small, fast reproducer using 100 examples …

szutenberg updated 1 year ago
4
huggingface/transformers #28908

Add MistralForQuestionAnswering

### Feature request Add a MistralForQuestionAnswering class to the [modeling_mistral.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/mistral/modeling_mistral.py) so …

nakranivaibhav updated 2 months ago
1
nlpyang/geval #8

How is the "Auto CoT" prompt defined?

G-Eval includes "Auto Chain-of-Thoughts for NLG Evaluation" as a component where the CoT steps to carry out evaluation are produced by an LLM. Neither the paper nor this repo, however, includes the prompt defi…

calvdee updated 2 weeks ago
2
strickvl/mlops-dot-systems #13

posts/2024-07-01-full-finetuned-model-evaluation

# Alex Strick van Linschoten - My finetuned models beat OpenAI’s GPT-4. Finetunes of Mistral, Llama3 and Solar LLMs are more accurate for my test data than OpenAI’s models. [https://mlops.systems/pos…

utterances-bot updated 2 months ago
3
Datawheel/template-chatbot #7

12th of July Updates

RAG Evaluation 1. 100 questions Types of questions: - 60 on general trade - 12 on growth/variation - 28 on rankings 2. RAG evaluation results Best combination tested so far: multi-qa-mpne…

alebjanes updated 2 months ago
1
Azure-Samples/openai-apim-lb #13

Investigating Expression evaluation failure.

Hi, everything seems to work fine with this apim script, but in the logs I can see the following popping up in Application Insights with slow gpt-4 calls: _Expression evaluation failed. Unable to cast o…

tkumpumak updated 1 month ago
1
explodinggradients/ragas #1188

Integrating third-party LLMs for Evaluating Chinese-native R…

Hi there, Thank you for bringing the elegant RAG Assessment framework to the community. I am an AI engineer from Alibaba Cloud, and our team has been fine-tuning LLM-as-a-Judge models based on t…

hurenjun updated 2 weeks ago
8
zylon-ai/private-gpt #2085

[BUG] cannot import name 'BaseQueryEngine' from 'llama_index…

### Pre-check - [X] I have searched the existing issues and none cover this bug. ### Description 1. cd `/Users/zdavatz/Documents/software/privateGPT 2. I am doing `poetry run python3.11 -m private…

zdavatz updated 5 days ago
6
explodinggradients/ragas #1313

Test data generation for function calling

Hey, I was wondering if you think it would be possible to create a synthetic dataset for function calling tasks? I would like to use that dataset for a finetuning experiment. Thanks for any guida…

alexHeu updated 1 week ago
1
z-x-yang/DoraemonGPT #4

can not git clone videogpt

When running pip install with requirement.txt, there is an error: remote: Support for password authentication was removed on August 13, 2021. remote: Please see https://docs.github.com/get-started/getti…

zhaishengfu updated 2 weeks ago
3
