ai-benchmark Search Results

vizhub-core/vzcode #847

AI Benchmark System

The VZCode AI Assist feature prompting system is currently based on trial and error. There are no tests or benchmarks to evaluate the quality of the prompt and test it under various scenarios. We ne…

curran updated 2 weeks ago

Xilinx/Vitis-AI #1475

Running Vitis AI benchmarks on VEK280 production B03

Hi, we have a VEK280 production silicon board B03 and we've been trying to benchmark Vitis AI on it. The quick start guide (https://xilinx.github.io/Vitis-AI/3.5/html/docs/quickstart/vek280.html) h…

chensy7 updated 1 week ago

google-gemini/generative-ai-python #573

Internal Server Error (500) when using Gemini API with Inspe…

### Description of the bug: ## Description I'm consistently encountering an Internal Server Error (HTTP 500) when trying to use the Google Gemini API through the Inspect evaluation framework. This…

lennijusten updated 1 week ago

akatz-ai/ComfyUI-Depthflow-Nodes #6

TypeError: ShaderScene.main() got an unexpected keyword argu…

2024-10-07 16:32:58.435003 │DepthFlow├┤9'41.685├┤INFO │ ▸ (Module 115 • CustomDepthf) Initializing scene 'CustomDepthflowScene' with backend headless │DepthFlow├┤9'41.783├┤INFO │ ▸ (Module 115 • …

Maelstrom2014 updated 22 hours ago

ultralytics/ultralytics #16591

Is there any benchmark available for the performances of YOL…

### Search before asking - [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…

Haseeeb21 updated 1 week ago

getappmap/appmap-js #2006

Retry on Sonnet error "Output blocked by content filtering p…

Error [example](https://github.com/getappmap/navie-benchmark/actions/runs/10949246453/job/30402054109#step:7:1216): ``` Handling exception: Error: Failed to complete: SSE Error: {"type":"error","e…

kgilpin updated 2 weeks ago

ibm-granite-community/pm #133

RAG recipe - Legal use case

Steps: - Develop use case - Find and prep data - Load and query Resources: - [ Caselaw Access Project (CAP)](https://case.law/) - [US Code](https://uscode.house.gov/) - [Legal AI Benchmarks](https:…

fayvor updated 1 day ago

rmusser01/tldw #237

Feature Tracker: Evaluation Benchmarks

Title. Benchmarks: Summarization - [x] G-Eval - [ ] SummHay - https://arxiv.org/abs/2407.01370v1 & https://github.com/salesforce/summary-of-a-haystack - https://arxiv.org/html/2403.19889v1 R…

rmusser01 updated 1 week ago

opensearch-project/project-website #3295

Proposed blog categories for web filtering

Approved OpenSearch blog categories for building a blog filter for this page: https://opensearch.org/blog/ **Main headers:** * Technical * Community * Partners * Events * Releases **Sub-t…

pajuric updated 2 days ago

BradyFU/Awesome-Multimodal-Large-Language-Models #182

Request for adding new MLLM benchmark: OmniBench

Hi, nice work on indexing useful papers in this repo! My team has released a new benchmark for the ability of MLLMs at "omni understanding" of image, audio, and text. We are quite confiden…

yizhilll updated 23 hours ago

1000+ results for ai-benchmark
