-
**Team**
Igor Petrović E9 8/2023
**Problem definition**
The goal of the project is to build a speech-to-text system for transcribing natural speech. The input is sound in the form of a standard audio recording…
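The post does not name a toolkit, so here is a minimal transcription sketch using a pretrained wav2vec2 CTC model from Hugging Face `transformers`; the checkpoint name, the mono mixdown, and the 16 kHz resampling step are assumptions, not requirements stated in the project description.
```python
# Minimal transcription sketch with a pretrained wav2vec2 CTC model.
# The model checkpoint and the 16 kHz resampling are assumptions.
import torch
import torchaudio
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

def transcribe(path: str) -> str:
    waveform, sample_rate = torchaudio.load(path)        # (channels, samples)
    waveform = waveform.mean(dim=0)                       # mix down to mono
    if sample_rate != 16_000:
        waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)
    inputs = processor(waveform.numpy(), sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits        # (1, frames, vocab)
    ids = logits.argmax(dim=-1)                           # greedy CTC decoding
    return processor.batch_decode(ids)[0]

print(transcribe("recording.wav"))
```
Greedy argmax decoding is the simplest option; a beam search with a language model would usually give a cleaner transcript.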
-
Hi,
Do you compare your checkpoint with the VAE/VQ-VAE [here](https://github.com/CompVis/latent-diffusion)?
![image](https://github.com/TencentARC/Open-MAGVIT2/assets/33491471/596a630a-e4b6-40e6-978f-de8a…
-
**[The following two posts are my reply to /u/starspawn0's comment on the paper [*Language Modeling for Formal Mathematics*](https://arxiv.org/abs/2006.04757) by Christian Szegedy et al., posted on su…
-
Hi,
I had a quick question about the HumanML3D training sequences for the VQ-VAE. It seems that all the sequences in the dataset start with a different initial translation. This means that the VQ-V…
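One way to sidestep the varying initial translation is to canonicalize each sequence before tokenization, e.g. by subtracting the root position of the first frame. The sketch below is a hypothetical illustration on raw joint positions, not the HumanML3D feature pipeline itself.
```python
# Hypothetical preprocessing sketch: remove each sequence's initial root
# translation so the VQ-VAE sees translation-invariant motion. Assumes raw
# joint positions of shape (frames, joints, 3) with joint 0 as the root.
import torch

def canonicalize_translation(motion: torch.Tensor) -> torch.Tensor:
    # motion: (frames, joints, 3) global joint positions
    root_xz = motion[0, 0, [0, 2]]              # initial root position on the ground plane
    offset = torch.zeros(3, dtype=motion.dtype)
    offset[0], offset[2] = root_xz[0], root_xz[1]
    return motion - offset                      # every sequence now starts near the origin

seq = torch.randn(120, 22, 3)                   # e.g. 120 frames, 22 joints
print(canonicalize_translation(seq)[0, 0])      # frame-0 root now has x = z = 0
```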
-
Just as the question says: I am stuck fine-tuning the VAE model on my custom medical image datasets. I adopted the LPIPSWithDiscriminator loss from [https://huggingface.co/spaces/multimodalart/latentdiffusion/tr…
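For reference, the LPIPSWithDiscriminator loss in that codebase combines a pixel reconstruction term, an LPIPS perceptual term, a KL term on the posterior, and a GAN term that only kicks in after a warm-up step. A rough sketch of the non-adversarial part, using the standalone `lpips` package and placeholder weights (not the repo's defaults), might look like this:
```python
# Rough sketch of the loss structure (L1 + LPIPS + KL), ignoring the
# discriminator term that LPIPSWithDiscriminator adds after its warm-up.
# Weights below are placeholders.
import torch
import lpips

perceptual = lpips.LPIPS(net="vgg")   # expects 3-channel inputs scaled to [-1, 1]

def vae_loss(x, x_rec, mu, logvar, kl_weight=1e-6, perceptual_weight=1.0):
    rec = torch.abs(x - x_rec).mean()                              # L1 reconstruction
    p = perceptual(x, x_rec).mean()                                # LPIPS perceptual term
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # KL to N(0, I)
    return rec + perceptual_weight * p + kl_weight * kl
```
Note that LPIPS runs a pretrained VGG, so single-channel medical images typically need to be repeated to three channels (and scaled to [-1, 1]) before the perceptual term is computed.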
-
### 🐛 Describe the bug
## Minimum reproduction
```python
import torch.nn.functional as F
import torch
from torch import nn
class GumbelVectorQuantizer(nn.Module):
    def __init__(self):
        …
```
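Since the reproduction above is cut off, here is a minimal, self-contained sketch of what a Gumbel-softmax vector quantizer typically looks like (it is not the original code): a linear layer predicts codebook logits, `F.gumbel_softmax` draws a hard one-hot sample, and the one-hot row indexes a learned codebook.
```python
# Minimal Gumbel-softmax vector quantizer sketch; all sizes are placeholders.
import torch
import torch.nn.functional as F
from torch import nn

class GumbelVQ(nn.Module):
    def __init__(self, dim=64, codebook_size=512, tau=1.0):
        super().__init__()
        self.tau = tau
        self.to_logits = nn.Linear(dim, codebook_size)
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, x):                      # x: (batch, tokens, dim)
        logits = self.to_logits(x)             # (batch, tokens, codebook_size)
        one_hot = F.gumbel_softmax(logits, tau=self.tau, hard=True, dim=-1)
        quantized = one_hot @ self.codebook.weight   # straight-through codebook lookup
        return quantized, one_hot.argmax(dim=-1)     # quantized latents and code indices

quantizer = GumbelVQ()
q, codes = quantizer(torch.randn(2, 10, 64))
print(q.shape, codes.shape)                    # torch.Size([2, 10, 64]) torch.Size([2, 10])
```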
-
I load "net_last.pth" for VQ-VAE and "net_best_fid.pth" for the Transformer in 'VQTransformer_corruption05', and run the GPT_eval_multi.py code, and I can only achieve about **0.22 FID**, which is hig…
-
Hello,
Thank you for sharing this great work.
I'm curious about the training setups for the additional experiments, such as the beta-VAE imitation policy and the high-level policy for the beta-VAE policy.
It…
-
Hi,
Is it possible to use only the encoder to encode the motion and regress the SMPL parameters directly, without going through the decoder?
Say I just want to generate a couple of seconds of motion. I t…
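As a sketch of that idea: freeze the pretrained encoder, attach a small regression head that maps per-frame latents straight to SMPL parameters, and never call the decoder. Every name and dimension below is hypothetical, not part of the original repository.
```python
# Hypothetical sketch: frozen encoder latents -> SMPL parameters via a small MLP,
# skipping the VQ-VAE decoder. `pretrained_encoder` and all sizes are placeholders.
import torch
from torch import nn

class SMPLRegressionHead(nn.Module):
    def __init__(self, latent_dim=512, smpl_dim=24 * 6 + 10 + 3):
        # 24 joints in 6D rotation form + 10 shape betas + 3 root translation (one common layout)
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(),
            nn.Linear(512, smpl_dim),
        )

    def forward(self, z):                 # z: (batch, frames, latent_dim)
        return self.net(z)                # (batch, frames, smpl_dim)

# usage sketch, assuming some pretrained encoder that returns per-frame latents:
# z = pretrained_encoder(motion).detach()     # keep the encoder frozen
# smpl_params = SMPLRegressionHead()(z)
```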
-
I have written a custom PyTorch implementation of VQ-VAE-2 using an approach very similar to what you do in this code, and I manage to get fairly accurate reconstructed images, but the problem comes whe…