-
### 🐛 Describe the bug
Parametrizations don't let you control what the original parameters are called; they're always original0, original1, etc. For weight_norm, this new naming is a bit obtuse; th…
-
Hello! I'm delighted to come across this remarkable project, and thanks for sharing it as an open-source project. Currently, my focus lies on fine-tuning the freevc-s model using pretrained checkpoint…
-
Could you please provide the code and the WAVLM-TDCNN model weights for calculating the speaker similarity score?
-
Hi Swin Transformer team,
We've recently added Swin Transformer to HuggingFace Transformers: https://huggingface.co/docs/transformers/master/en/model_doc/swin.
All checkpoints are on the hub: h…
-
### Tested versions
- 3.3
### System information
macOS, M1
### Issue description
After installing the most recent 3.3 version and trying out the new pixit pipeline, I get the following errors (after downgr…
-
I ran into some difficulties with the train.py file when training on the Trinity and ZEGGS datasets:
![image](https://github.com/YoungSeng/UnifiedGesture/assets/37477030/7a138d2f-cc8f-4701-afea-32c40b4a3167)
The loss value is always "nan", and I don't understand why.
-
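Hello, I have read the paper and would like to ask: roughly how many parameters does the model have, and how long does it take to run? In other words, how much compute time does it take to convert one second of speech on a 3090 GPU?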
-
Thanks for the great work.
Knn-VC produces great results without even needing to be trained.
One thing I noticed is that I cannot use more than 3 minutes of reference audio.
If I use like 5 minutes of au…
-
Hello.
I have speech recordings in wav files, about 1-5 minutes each.
How do we extract embeddings using the `espnet/voxcelebs12_ecapa_wavlm_joint` SOTA model?
Documentation is overcomplicated.…
-
### Describe the bug
Similar to #3787, but also when running the `xtts_v2` model with voice cloning (vocoder model), using `device='cpu'` results in the following error:
```
RuntimeError: CUDA error: …