-
### The Feature
It would be great if you would support the multimodal models of Reka.ai.
The models understand text, images, video and audio.
The documentation is here:
https://docs.reka.ai/
…
-
Based on the table 7 in the paper
Inference w. HADVS 9.77gb 1.63secs
Inference w.o. HADVS 9.76gb 1.63secs
Inference (256 × 256) 6.62gb 0.46secs
Inference (1024 × 1024) 20.66gb 10.29secs
As fa…
-
An email received from OpenAI
> Hi there,
>
> We launched GPT-4o in the API—our new flagship model that’s as smart as GPT-4 Turbo and much more efficient. We’re passing on the benefits of the mo…
7alip updated
4 months ago
-
### New Feature Summary
With a number of recent development, I'd like to propose more vocab types that are subcategories of `TextDocument` (all names are tentative in the proposal)
- `Transcript`:…
-
#
[sound-spaces](https://github.com/facebookresearch/sound-spaces)
[Project: RLR-Audio-Propagation](https://github.com/facebookresearch/rlr-audio-propagation)
[Audio Sensor](https://github.com/f…
yyf17 updated
2 years ago
-
**Describe the bug**
Dear DeepSqueak team,
I'm using Matlab 2022b and installed the toolboxes required(Computer Vision System Toolbox™, Curve Fitting Toolbox™, Deep Learning Toolbox™ (formerly Neu…
-
Hey @matthijsvrenswoude,
great idea. Was looking for something like this. Might want to look into building a plugin for adding IFrames in general?
Here are some suggestions that might proof usef…
-
https://platform.openai.com/docs/guides/vision
I think this is best handled by command line options `--image` and `--image-urls` to either encode and pass as base64, or to pass a URL.
-
Are there plans to evaluate the vision modality of GPT-4? I am interested to know how GPT-4 could perform on classification tasks with 0- and few-shot-learning and how it compares to vision-only model…
-
### Pitch
When adding alt text, it would be nice to have the option to add (or ideally record) an audio file as the description.
### Motivation
I think it would make ALT text easier to use, both fo…