-
Hello!
I am trying to train a custom TTS model on the VCTK dataset with a Conformer FastSpeech 2 setup. After 50 epochs (with a batch size of 2), I noticed that the outputs of the decoder (befor…
-
Hello!
I am using the following code:
```
from hear21passt.base import get_basic_model, get_model_passt
import torch
# get the PaSST model wrapper, includes Melspectrogram and the default pre-tr…
```
-
### 🚀 The feature
The [Modified Discrete Cosine Transform (MDCT)](https://en.wikipedia.org/wiki/Modified_discrete_cosine_transform) is a perfectly invertible transform that can be used for featur…
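To make the proposal concrete, here is a minimal NumPy sketch of a windowed MDCT with 50% overlap and its inverse. The sine window satisfies the Princen–Bradley condition, so time-domain aliasing cancels and overlap-add reconstruction is exact; the function names and padding convention are my own, not a proposed torchaudio API:

```python
import numpy as np

def _mdct_basis(N):
    # N x 2N cosine basis: C[k, n] = cos(pi/N * (n + 0.5 + N/2) * (k + 0.5))
    n = np.arange(2 * N)
    k = np.arange(N)
    return np.cos(np.pi / N * (n[None, :] + 0.5 + N / 2) * (k[:, None] + 0.5))

def _sine_window(N):
    # Satisfies the Princen-Bradley condition, so TDAC reconstruction is exact.
    return np.sin(np.pi / (2 * N) * (np.arange(2 * N) + 0.5))

def mdct_frames(x, N):
    """MDCT with hop N (50% overlap); len(x) must be a multiple of N."""
    C, w = _mdct_basis(N), _sine_window(N)
    xp = np.concatenate([np.zeros(N), x, np.zeros(N)])   # pad half a frame
    frames = np.stack([xp[i:i + 2 * N] for i in range(0, len(x) + N, N)])
    return (frames * w) @ C.T                            # (n_frames, N)

def imdct_overlap_add(X, N, length):
    """Inverse MDCT followed by windowed overlap-add."""
    C, w = _mdct_basis(N), _sine_window(N)
    frames = (2.0 / N) * (X @ C) * w
    y = np.zeros(length + 2 * N)
    for i, f in enumerate(frames):
        y[i * N:i * N + 2 * N] += f                      # aliasing cancels here
    return y[N:N + length]                               # drop the padding
```

Note the analysis frames are not individually invertible (the transform maps 2N samples to N coefficients); perfect reconstruction only emerges once overlapping synthesis frames are summed.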
-
(env) (base) C:\Users\prost\Wav2Lip>python inference.py --checkpoint_path checkpoints/wav2lip_gan.pth --face joseph.mp4 --audio josephvoice.mp3
Using cpu for inference.
Reading video frames...
Numb…
-
Hello, in the `forward_v2` function of `FApredictors` in `FAcodec/modules/quantize.py`, the line
`spk_pred = self.timbre_predictor(timbre)[0]`
is commented out, so `timbre` is `None`, which later breaks
```
spk_pred_logits = preds['timbre']
spk_loss …
```
-
## 🚀 Feature
Given the lack of small, comprehensive audio tasks, I propose adding a speech MNIST dataset to torchaudio.
## Motivation
In the audio domain, we often lack s…
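To make the interface concrete, here is a toy sketch of what such a dataset could expose. The synthetic sine-tone "digits" below only stand in for real recordings, and every name is hypothetical rather than an actual torchaudio API:

```python
import numpy as np

class ToySpeechMNIST:
    """Hypothetical MNIST-style audio dataset: (waveform, digit-label) pairs.
    Real recordings are replaced by sine tones whose pitch encodes the digit."""

    def __init__(self, n_items=100, sr=8000, duration=0.5, seed=0):
        self.sr = sr
        self.n_samples = int(sr * duration)
        rng = np.random.default_rng(seed)
        self.labels = rng.integers(0, 10, size=n_items)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        label = int(self.labels[idx])
        t = np.arange(self.n_samples) / self.sr
        freq = 200.0 + 50.0 * label          # digits 0..9 -> 200..650 Hz
        waveform = np.sin(2 * np.pi * freq * t).astype(np.float32)
        return waveform, label
```

A real version would of course download a spoken-digit corpus instead of synthesizing tones; the point is only the small, MNIST-like `(waveform, label)` interface.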
-
Thanks for your outstanding work.
I'm just getting started with speech signal processing and I have a question. In the example in the README, the input is a 1024×128 image; how should we get t…
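If the 1024×128 input is a log-mel spectrogram (time frames × mel bins), it is typically computed from the raw waveform roughly as below; the sample rate, FFT size, hop, and mel count here are illustrative guesses, not the repository's actual settings:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters with centers evenly spaced on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        for j in range(l, c):
            fb[i, j] = (j - l) / max(c - l, 1)   # rising edge
        for j in range(c, r):
            fb[i, j] = (r - j) / max(r - c, 1)   # falling edge
    return fb

def log_mel(x, sr=16000, n_fft=1024, hop=256, n_mels=128):
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel + 1e-10)                   # (n_frames, n_mels)
```

With these parameters, about 16 s of 16 kHz audio would yield roughly 1024 frames of 128 mel bins, i.e. the 1024×128 "image" shape.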
-
There should be functionality to download only part of a dataset and train on just that, instead of having to download the entire dataset. And if not, then the documentati…
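As a workaround until partial downloads exist, a streaming loader can simply be cut off after the first N examples. In this sketch, `fake_remote_examples` is a stand-in for a real download iterator, not an actual API:

```python
from itertools import islice

def fake_remote_examples():
    # Pretend each yielded item was fetched on demand from a remote dataset.
    i = 0
    while True:
        yield {"id": i, "waveform": [0.0] * 160}
        i += 1

# Materialize only the first 100 examples; nothing past that is "downloaded".
subset = list(islice(fake_remote_examples(), 100))
```

The same `islice` trick works on any iterable-style dataset, so training loops can consume a small prefix of the data without touching the rest.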
-
I use 16 kHz mono audio. A 6.7025 s wav gives a mel of 167 frames → 167 × 40 ms = 6.68 s.
Likewise, a 6 s wav gives a mel of 147 frames = 5.88 s. Why?
Issue: the audio plays in full, but the mel does not cover the whole duration.
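The shortfall is just integer division: with a 40 ms hop, only complete hops produce mel frames, so the trailing remainder of the file (anything shorter than one hop) is dropped. A quick check, assuming no padding:

```python
sr = 16000
hop = int(0.040 * sr)          # 40 ms hop -> 640 samples at 16 kHz

def mel_frames(n_samples, hop):
    # Without padding, only complete hops yield frames; the remainder is lost.
    return n_samples // hop

n = int(6.7025 * sr)           # 107240 samples
f = mel_frames(n, hop)         # 167 frames
covered = f * hop / sr         # 6.68 s, slightly less than 6.7025 s
```

The exact count additionally depends on the window length and padding mode of the particular extractor, which can remove a few more frames, so a slight mismatch between audio duration and frames × hop is expected.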
-
This is the error I get when running `python -m datasets.generate_data ./datasets/son/alignment.json`. I've resolved the other problems by searching past issues and the web, but I can't figure this one out at all. If you don't mind, could you share how to fix it?
Also, where is `n_frame`, the item at the bottom of the error, used…