gpt-evaluation Search Results

1000+ results for gpt-evaluation


PKU-YuanGroup/Chat-UniVi #9

Question regarding GPT scores for video comprehension

Thanks for the great work. And I have a question about the GPT scores for video comprehension. I evaluated the GPT score for video comprehension using the evaluation code you published and the hugg…

yuu2704 updated 5 months ago · 2 comments

UChicago-Computational-Content-Analysis/Readings-Responses-2024-Winter #42

3. Clustering & Topic Modeling to Discover Higher-Order Patt…

Post your response to our challenge questions. First, write down three intuitions you have about broad content patterns you will discover in your data. Place an asterisk next to the one you expect m…

lkcao updated 6 months ago · 30 comments

paul-gauthier/aider #533

Evaluate performance against SWE-Bench

It would be interesting to see if/how `aider` performs against the SWE-Bench benchmarks: - https://www.swebench.com/ - https://github.com/princeton-nlp/SWE-bench - > [ICLR 2024] SWE-Bench: Can …

0xdevalias updated 3 months ago · 4 comments

Y-IAB/lm-evaluation-harness #14

Every evaluation costs money

It would be great if we could group the tasks by whether they require money or not. In short, we need to split tasks by their need for OpenAI API configurations.

seungduk-yanolja updated 5 months ago · 3 comments

wbbeyourself/MAC-SQL #14

Help wanted! Question about Bird dataset results on the pape…

![image](https://github.com/wbbeyourself/MAC-SQL/assets/80022154/50318e98-221d-467e-bab3-3eb5791113b7) In question #7, I see that the results of Spider in the paper are obtained by GPT-4-32K, an…

kanseaveg updated 5 months ago · 2 comments

Agenta-AI/agenta #1595

[AGE-163] Propagating the cost from Span to Trace

Right now the user needs to explicitly return in the traced function a dict that contains the cost, message, and number of tokens. However, this information is simply the sum of costs and tokens used…

mmabrouk updated 3 months ago · 4 comments

uptrain-ai/uptrain #644

gpt-4-turbo-preview support not available in OpenAI models

**Is your feature request related to a problem? Please describe.** gpt-4 is very costly and gpt-3.5 provides low grade output. I'd like to use gpt-4-turbo for evaluation **Describe the solution yo…

deveshXm updated 6 months ago · 12 comments

A-suozhang/GetArxivDaily #35

New submissions for Mon, 17 Apr 23

## Keyword: efficient ### End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs - **Authors:** Javier Campos, Zhen Dong, Javier Duarte, Amir Gholami, Michael W. Mahoney,…

A-suozhang updated 1 year ago · 1 comment

GENZITSU/UsefulMaterials #107

weekly useful materials - 07/19 -

GENZITSU updated 2 years ago · 10 comments

microsoft/semantic-kernel #5436

.Net: Filters use cases to be supported before making the fe…

Tasks - [x] Create showcase application which demonstrates all functionality below To make filters non-experimental, the following user stories should be met: - [x] **Telemetry** – any of the telem…

matthewbolanos updated 2 months ago · 11 comments
