vision-and-audio Search Results

1000+ results
for vision-and-audio

BerriAI/litellm #3607

[Feature]: Support for Reka AI

### The Feature It would be great if you would support the multimodal models of Reka.ai. The models understand text, images, video and audio. The documentation is here: https://docs.reka.ai/ …

InnerLive updated 2 months ago
5
fudan-generative-vision/hallo #28

Inference speed

Based on Table 7 in the paper: Inference w. HADVS: 9.77 GB, 1.63 s; Inference w.o. HADVS: 9.76 GB, 1.63 s; Inference (256 × 256): 6.62 GB, 0.46 s; Inference (1024 × 1024): 20.66 GB, 10.29 s. As fa…

Inferencer updated 2 months ago
10
freedomcombination/monorepo #1294

Investigate GPT-4o

An email received from OpenAI > Hi there, > > We launched GPT-4o in the API—our new flagship model that’s as smart as GPT-4 Turbo and much more efficient. We’re passing on the benefits of the mo…

7alip updated 4 months ago
3
clamsproject/mmif #231

proposing subtypes of `TextDocument`

### New Feature Summary With a number of recent developments, I'd like to propose more vocab types that are subcategories of `TextDocument` (all names are tentative in the proposal) - `Transcript`:…

keighrim updated 2 months ago
10
yyf17/awesome-embodied-intelligent #1

SoundSpace

# [sound-spaces](https://github.com/facebookresearch/sound-spaces) [Project: RLR-Audio-Propagation](https://github.com/facebookresearch/rlr-audio-propagation) [Audio Sensor](https://github.com/f…

yyf17 updated 2 years ago
1
DrCoffey/DeepSqueak #217

Detection Calls: Error in Network, skipping Audio Chunk

**Describe the bug** Dear DeepSqueak team, I'm using MATLAB 2022b and installed the required toolboxes (Computer Vision System Toolbox™, Curve Fitting Toolbox™, Deep Learning Toolbox™ (formerly Neu…

katze-lucky updated 3 months ago
6
matthijsvrenswoude/IframeJellySeerr #1

Recommendations

Hey @matthijsvrenswoude, great idea. Was looking for something like this. Might want to look into building a plugin for adding IFrames in general? Here are some suggestions that might prove usef…

Bretterteig updated 3 months ago
1
simonw/llm #331

Multi-modal support for vision models such as GPT-4 vision

https://platform.openai.com/docs/guides/vision I think this is best handled by command line options `--image` and `--image-urls` to either encode and pass as base64, or to pass a URL.

cmungall updated 1 month ago
44
openai/evals #235

Evaluation on computer vision benchmarks

Are there plans to evaluate the vision modality of GPT-4? I am interested to know how GPT-4 could perform on classification tasks with zero- and few-shot learning and how it compares to vision-only model…

finitearth updated 1 year ago
2
mastodon/mastodon #21902

Option to use audio files for image alt text

### Pitch When adding alt text, it would be nice to have the option to add (or ideally record) an audio file as the description. ### Motivation I think it would make ALT text easier to use, both fo…

OrElDuderino updated 1 month ago
7
