audio-spectrogram-transformer Search Results

251 results for audio-spectrogram-transformer


huggingface/optimum #1138

ValueError: whisper doesn't support task audio-classificatio…

### Feature request It would be great if we could perform audio-classification with whisper. As an example, it could be used for language detection: - https://huggingface.co/sanchit-gandhi/whisper…

xenova updated 1 year ago
2
JuliaMusic/MusicProcessing.jl #35

Music Transcription: GSOC'22

Hey @Datseris!! I saw this music transcription project on the Julialang website for JuliaMusic, which is an exciting project. I wanted to discuss what you are looking for and would love to make a proof of con…

ashwani-rathee updated 1 year ago
5
Ankur3107/ankur3107.github.io #2

blogs/the-illustrated-image-captioning-using-transformers/

# The Illustrated Image Captioning using transformers - Ankur NLP Enthusiast The Illustrated Image Captioning using transformers [https://ankur3107.github.io/blogs/the-illustrated-image-captioning-u…

utterances-bot updated 2 months ago
36
keonlee9420/Cross-Speaker-Emotion-Transfer #11

Synthesis with other person out of RAVDESS

Hello author, Firstly, thank you for sharing this repo, it is really nice. I have a question: 1. I downloaded CMU data for a single speaker with 100 audio clips and made a speaker embedding vector and sy…

hathubkhn updated 1 year ago
6
p0p4k/pflowtts_pytorch #37

about zero-shot inference

Hello p0p4k, I'm reaching out to you again with a question. Thanks to your great help, I've successfully trained and inferred the Korean pflow model. During the inference process, I observed a f…

0913ktg updated 6 months ago
31
huggingface/transformers #16349

[WIP] New Model Add FastPitch 1.1

# 🌟 New model addition ## Model description **What type of model is Fast Pitch 1.1?** It is a Mel spectrogram generator (part of a speech to text model engine) that mainly comprises two F…

ArEnSc updated 2 years ago
4
robertknight/rten #332

Incorrect convolution output shape for MIT/ast-finetuned-aud…

When testing the model from https://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593 I encountered an issue where inference fails due to a broadcast mismatch. **Steps to reproduce:** ``` …

robertknight updated 1 month ago
1
anthonio9/penn #10

Thesis Story

anthonio9 updated 3 months ago
10
glouppe/kaggle-marinexplore #1

Cleanup input signal

For the moment, I have only been experimenting with the raw signal data and its FFT transform. However, I am sure much can be gained by cleaning up the input signal. Off the top of my head, things worth…

glouppe updated 11 years ago
48
kaiidams/NeMoOnnxSharp #26

Possible to improve English and German pronunciation?

# NVIDIA NeMo (ByT5 G2P and G2P-Conformer): > NVIDIA NeMo provides grapheme-to-phoneme models for various languages, including **German**. > The ByT5 G2P model is based on a neural network and can…

GeorgeS2019 updated 6 months ago
9
