audio-visual-speech-recognition Search Results

dynamic-superb/dynamic-superb #113

[Task]Emoji-Grounded Speech Emotion Recognition

# Task Name Emoji-Grounded Speech Emotion Recognition ## Task Objective The primary goal of the Emoji-Grounded Speech Emotion Recognition (EG-SER) task is to develop a system that can accurat…

ericsunkuan updated 1 week ago

kadirnar/ComfyUI-Transformers #12

ROADMAP of ComfyUI-Transformers

## Computer Vision: - [x] Add Depth Estimation pipeline - [ ] Add Image Classification pipeline - [ ] Add Image Segmentation pipeline - [ ] Add Mask Generation pipeline - [ ] Add Object Detecti…

kadirnar updated 1 week ago

owickstrom/komposition #94

Feature: speech recognition for visual feedback on audio

Could wire up speech recognition on the audio chunks to: - auto-name the clips when importing - show the text on the audio block in the timeline Bonus: add keyword / topic extraction.

robinp updated 4 years ago

GasimV/Commercial_Projects #2

Speech Processing Models

`torchaudio` is an extension library for PyTorch, designed to facilitate audio processing using the same PyTorch paradigms familiar to users of its tensor library. It provides powerful tools for audio…

GasimV updated 1 day ago

Sxjdwang/TalkLip #9

the face in output video is blurred

Hi, thanks for your great work! I tested TalkLip with my own video, but the generated face in the output video is blurred and shows a clear border against the background. The resolution of my test video is 1600x9…

ZardYuan updated 6 months ago

Hangz-nju-cuhk/Talking-Face-Generation-DAVS #15

Table 3: Audio-Visual Speech Recognition and 1:25000 audio-v…

Hi, after reading the paper, I am confused about Table 3. What do visual acc, audio acc, and combine acc mean? How did you calculate the results of 67.5%, 91.8%, and 95.2%? ![default](http…

zzzzhuque updated 5 years ago

huggingface/huggingface.js #174

Add inference demos

Add demos on https://huggingface.co/huggingfacejs (feel free to contribute demos, or to ask to join the organization) ### Natural Language processing - [ ] Fill mask - [ ] Summarization - [ ] …

coyotte508 updated 8 months ago

espressif/esp-adf #1203

ESP32-LyraTD-MSC board can't run ESP-ADF wwe example. (AUD-…

**## Environment** - Audio development kit: ESP32-LyraTD-MSC - Audio kit version: v2.2 - [Required] Module or chip used: ESP32-WROVER-E - [Required] IDF version: v5.2.1 - [Required] ADF v…

hyuan-kamuda updated 1 month ago

Uberi/speech_recognition #383

Python speech_recognition.UnknownValueError

I'm trying to make a virtual assistant; right now it's supposed to just write down what I say. However, when I try to test it, it returns: Traceback (most recent call last): File "/Users/danieldossant…

DanielBDosSantos updated 7 months ago

yyf17/NavigationProject #9

RIR

# Format * **Paper Title** *Author(s)* Conference, year. [[Paper]](link) [[Code]](link) [[Website]](link) To fill in: 1) Paper Title 2) Author(s) 3) the 3 "link"s 4) one blank line between two papers 5) conference, year …

yyf17 updated 1 year ago

435 results for audio-visual-speech-recognition
