gpt-evaluation Search Results

1000+ results for gpt-evaluation


irthomasthomas/undecidability #901

[2303.16634] G-Eval: NLG Evaluation using GPT-4 with Better …

- [ ] [[2303.16634] G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment](https://arxiv.org/abs/2303.16634) # [2303.16634] G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment …

ShellLM updated 1 month ago
1
explodinggradients/ragas #1278

Error w/ evaluate function using LLamaIndex AzureOpenAI mode…

[ ] I have checked the [documentation](https://docs.ragas.io/) and related resources and couldn't resolve my bug. **Describe the bug** Further request for LLamaIndex support regarding Azure OpenAI…

sam-h-long updated 1 week ago
6
18907305772/FuseAI #21

about gpt-4-0125-preview reference answer

Hello, I'd like to ask: when testing on MT-bench, were the reference answers obtained with the command gen_api_answer.py --model gpt-4-0125-preview? 80 reference answers are generated, and then for 100~130 of them, using the official comment[https://github.com/lm-sys/FastChat/…

duguodong7 updated 1 week ago
4
yipoh/AesBench #5

Where is the Supplementary of paper

I want to know the details of GPT-based evaluation, but cannot find the Supplementary in paper.

Wenju-Huang updated 2 weeks ago
1
OpenCodeInterpreter/OpenCodeInterpreter #23

tree_sitter_languages.core.get_parser error

Hi, I'm trying to run the multi-turn evaluation for gpt-3.5. I have re-implemented the chat_with_gpt.py. However, when I ran: bash evaluation/evaluate/scripts/05_execution_feedback_multiround_gpt.s…

chenmengdx updated 3 months ago
1
microsoft/JARVIS #208

Evaluation Dataset mentioned in Hugging GPT paper is not ava…

As mentioned in the paper - "Furthermore, we also invite some expert annotators to label task planning for some complex requests (46 examples) as a high-quality human annotated dataset. We also plan t…

ssdasgupta updated 1 month ago
2
confident-ai/deepeval #897

Stuck while using summarisation metric

I was testing my Hindi summarization model and while calculating the evaluation metric for SUMMARIZATION I ran the following cell, but it kept on running for too long and did not give me any output. I…

Vedant336Neekhra updated 2 months ago
1
hy5468/TransLLM #1

How do you evaluate the model you train? So far I see transl…

Hi TransLLM owner, Do you have like benchmark data where expected output is provided? Cheers!

pacozaa updated 2 weeks ago
1
AkihikoWatanabe/paper_notes #1401

Instruction Tuning with GPT-4, Baolin Peng+, N/A, arXiv'23

# URL - https://arxiv.org/abs/2304.03277 # Affiliations - Baolin Peng, N/A - Chunyuan Li, N/A - Pengcheng He, N/A - Michel Galley, N/A - Jianfeng Gao, N/A # Abstract - Prior work has shown…

AkihikoWatanabe updated 2 days ago
1
csce585-mlsystems/Phishing-Detection #1

Instructions for Designing Your Experiments and Creating a M…

#### Specific Task: For this project, your main challenge is improving phishing detection by developing a real-time, multimodal system based on transformers and other features like URLs and metadata.…

pooyanjamshidi updated 3 days ago
1
