-
Currently, `mask` only supports ops whose output sizes are expressible as polynomials of the input sizes. This excludes:
- Strided slicing and convolutions (only supported for edge case when the stride…
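The exclusion follows from the standard convolution output-size formula: once the stride exceeds 1, a floor division appears, and no polynomial in the input size can reproduce it. A minimal sketch (the function name is illustrative, not from the repo):

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    # Standard convolution output-size formula; the floor division
    # makes the result non-polynomial in n whenever stride > 1.
    return (n + 2 * padding - kernel) // stride + 1

# stride == 1: output size is an affine (hence polynomial) function of n
print([conv_output_size(n, 3) for n in (8, 9, 10)])  # [6, 7, 8]

# stride == 2: two consecutive input sizes map to the same output size,
# which no polynomial in n can do
print([conv_output_size(n, 3, stride=2) for n in (8, 9, 10)])  # [3, 4, 4]
```

The collision at stride 2 (inputs 9 and 10 both yielding 4) is exactly the edge case a polynomial size model cannot express.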
-
An error occurred while running the training command.
Command: !python3.8 finetune_speaker_v2.py -m "./OUTPUT_MODEL" --max_epochs 1000 --drop_speaker_embed True
Error log:
`INFO:OUTPUT_MODEL:{'train': {'log_interval': 10, 'eval_interval': 100, 'se…
-
In the [Model Zoo](https://github.com/facebookresearch/vissl/blob/main/MODEL_ZOO.md), an accuracy of 83.38 is reported. However, the experiment configuration is not shared in the json file. I try to r…
-
Hi, in **NATSpeech**, inappropriate dependency version constraints can introduce risks.
Below are the dependencies and version constraints that the project currently uses:
```
matplotlib
librosa==0.8.…
-
In the white paper, they mention conditioning on a particular speaker as an input that is conditioned globally, and the TTS component as locally conditioned via up-sampling (deconvolution). For the latter, t…
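The two conditioning paths can be sketched as follows. This is a minimal illustration, not the paper's implementation: global conditioning broadcasts a single speaker embedding across every timestep, while local conditioning upsamples frame-rate features to the audio sample rate. Here nearest-neighbour repetition stands in for the learned transposed convolution, and all shapes and names are hypothetical.

```python
import numpy as np

def global_condition(speaker_emb, n_samples):
    # Broadcast one speaker embedding of shape (d,) to every
    # timestep -> (n_samples, d).
    return np.tile(speaker_emb, (n_samples, 1))

def local_condition(frame_feats, hop):
    # Upsample frame-rate features (n_frames, d) to sample rate by
    # repeating each frame `hop` times; a learned transposed convolution
    # would replace this repetition in the actual model.
    return np.repeat(frame_feats, hop, axis=0)

emb = np.ones(16)                  # hypothetical speaker embedding
g = global_condition(emb, 4000)    # shape (4000, 16)

feats = np.random.randn(25, 80)    # hypothetical mel frames
l = local_condition(feats, 160)    # shape (4000, 80)
```

Either conditioning signal is then added into the dilated-convolution stack at every timestep.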
-
This repo uses masked dense convolutions because they are well optimized in torch. However, would [this](http://www.open3d.org/docs/latest/python_api/open3d.ml.torch.nn.SparseConv.html) implementation spe…
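For context, "masked dense convolution" here means zeroing inactive sites and running an ordinary dense convolution over the full grid, so masked positions contribute nothing but still cost compute; a sparse convolution skips them entirely, which is where any speedup would come from. A hedged single-channel sketch (loop-based for clarity, not the repo's code):

```python
import numpy as np

def masked_dense_conv2d(x, mask, kernel):
    # Dense 2D "valid" convolution where masked-out inputs are zeroed
    # first. Every window is still computed, even over inactive regions.
    x = x * mask
    kh, kw = kernel.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

x = np.ones((4, 4))
k = np.ones((3, 3))
full = masked_dense_conv2d(x, np.ones((4, 4)), k)    # every output is 9.0
empty = masked_dense_conv2d(x, np.zeros((4, 4)), k)  # every output is 0.0
```

Whether a sparse implementation wins in practice depends on the sparsity level and how well the dense kernels are fused on the target hardware.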
-
**Describe the issue**
Hi, I'm trying to merge the InternImage model into mmdetection==2.28.1, and I encounter this error:
```bash
AssertionError: DINO: DINOHead: The classification weight for loss and…
-
How would one do custom pretraining with the ConvNeXt V2 model? I recall that in the paper they used sparse convolutions and a masked autoencoding framework.
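The masking step of that recipe can be sketched as follows: the image is divided into a patch grid, a large fraction of patches is masked at random, the encoder sees only the visible patches (via sparse convolutions in the paper), and the reconstruction loss is taken on the masked ones. A minimal sketch of the random patch mask, assuming a 14x14 grid and a 0.6 ratio (both illustrative choices):

```python
import numpy as np

def random_patch_mask(h_patches, w_patches, mask_ratio=0.6, rng=None):
    # Binary mask over the patch grid: True = masked (hidden from the
    # encoder, reconstructed by the decoder), False = visible.
    rng = rng or np.random.default_rng(0)
    n = h_patches * w_patches
    n_masked = int(round(n * mask_ratio))
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=n_masked, replace=False)] = True
    return mask.reshape(h_patches, w_patches)

mask = random_patch_mask(14, 14, 0.6)
print(mask.mean())  # close to 0.6 of the patches are masked
```

For actual pretraining you would pair this with the official FCMAE code in the repo rather than reimplementing the sparse encoder from scratch.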