-
The paper mentions a codebook size of 4096 for all models, with 128/64/32 latent tokens at 256x256 and 128/64 tokens at 512x512.
I was wondering why the example configuration in `README.md` and `titok.py` …
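For context, the kind of setting the question refers to looks roughly like this. This is a hypothetical sketch with made-up field names, not the actual keys used in `README.md` or `titok.py`:

```python
# Hypothetical sketch only -- field names are assumptions, not the repo's config schema.
titok_config = {
    "image_size": 256,        # 256x256 variant discussed in the paper
    "codebook_size": 4096,    # shared across all models per the paper
    "num_latent_tokens": 32,  # the paper reports 128 / 64 / 32 token variants
}
```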
-
Hey, thanks for the videos and the code. I am experimenting with conditional LDMs.
Do you happen to have loss plots or logs of the loss? I have a feeling that the loss is decreasing really slowly or no…
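For what it's worth, here is a minimal sketch of how one could check whether the loss is really flattening out, assuming the losses were dumped one value per line to a plain text file (the file name and format here are assumptions):

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed format: one loss value per line in a hypothetical "loss_log.txt".
losses = np.loadtxt("loss_log.txt")

# Exponential moving average to suppress step-to-step noise.
ema, alpha = [], 0.98
running = losses[0]
for x in losses:
    running = alpha * running + (1 - alpha) * x
    ema.append(running)

plt.plot(losses, alpha=0.3, label="raw loss")
plt.plot(ema, label="EMA (0.98)")
plt.xlabel("step")
plt.ylabel("loss")
plt.legend()
plt.show()
```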
-
Excellent work!
Sorry, I'm not sure I understand clearly. Since we trained a VAE model to map speech into discrete codes, we then train a decoder-only autoregressive transformer model with text-pr…
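If I understand the two-stage pipeline correctly, the idea is roughly the following. This is a toy PyTorch sketch under my own assumptions; module names, vocabulary sizes, and shapes are illustrative and not the actual code:

```python
import torch
import torch.nn as nn

class ToyVQEncoder(nn.Module):
    """Stage 1 (assumed): map a waveform to a sequence of discrete code indices."""
    def __init__(self, codebook_size=1024, dim=64, hop=320):
        super().__init__()
        self.conv = nn.Conv1d(1, dim, kernel_size=hop, stride=hop)  # crude downsampling
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, wav):                                  # wav: (B, samples)
        z = self.conv(wav.unsqueeze(1)).transpose(1, 2)      # (B, T', dim)
        flat = z.reshape(-1, z.size(-1))                     # (B*T', dim)
        idx = torch.cdist(flat, self.codebook.weight).argmin(-1)
        return idx.view(z.size(0), z.size(1))                # (B, T') code indices

class ToyARLM(nn.Module):
    """Stage 2 (assumed): decoder-only transformer over [text tokens ; speech codes]."""
    def __init__(self, vocab=1024 + 256, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, vocab)

    def forward(self, tokens):                               # tokens: (B, L)
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        h = self.blocks(self.embed(tokens), mask=mask)       # causal self-attention
        return self.head(h)                                  # next-token logits

# Usage sketch: text tokens form the prefix, speech codes are the continuation.
wav = torch.randn(2, 16000)
codes = ToyVQEncoder()(wav)                                  # (2, 50) speech codes
text = torch.randint(1024, 1024 + 256, (2, 10))              # hypothetical text-token ids
seq = torch.cat([text, codes], dim=1)
logits = ToyARLM()(seq)                                      # trained with next-token prediction
```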
-
## Why
The Machine Learning 輪講 (reading group) aims to raise the bar of what engineers can solve with technology by keeping up with the latest techniques and papers.
prev. #7
## What
If there is something you want to talk about, leave a comment here!
Even if you have only just found something interesting, at least announce that you will talk about it!
-
Hey @adelacvg, thanks for sharing the code.
After reading the code I want to ask you a few questions about the new 24k model, if you don't mind:
1. What makes this model different from the previous one (ht…
-
## In a nutshell
An attempt to learn discrete latent representations with a VAE. The vector closest to the encoder's output is looked up in an embedding space, and the decoder reconstructs from that vector. The posterior is defined as a one-hot distribution that puts probability 1 on the code nearest to the encoder output and 0 everywhere else. This not only yields discrete representations but also overcomes the problem of the latent representation not being learned when the decoder is too powerful.
![ima…
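A minimal sketch of the nearest-neighbour lookup described above (PyTorch; names, sizes, and the toy data are my own, not the paper's implementation):

```python
import torch
import torch.nn.functional as F

K, D = 512, 64                       # codebook size, embedding dim (arbitrary)
codebook = torch.randn(K, D)         # embedding space e
z_e = torch.randn(8, D)              # encoder outputs z_e(x), batch of 8

# One-hot posterior: probability 1 on the code nearest to the encoder output.
dists = torch.cdist(z_e, codebook)   # (8, K) pairwise distances
indices = dists.argmin(dim=1)        # discrete latent codes
z_q = codebook[indices]              # quantized vectors fed to the decoder

# Straight-through estimator: copy decoder gradients past the argmin
# so the encoder still receives a learning signal.
z_q_st = z_e + (z_q - z_e).detach()

# Codebook / commitment terms of the VQ-VAE objective (beta is a hyperparameter).
beta = 0.25
vq_loss = F.mse_loss(z_q, z_e.detach()) + beta * F.mse_loss(z_e, z_q.detach())
```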
-
@robogast
Happy to report I was able to train a VQ-VAE using a dataset. Very cool to see - and kudos for the nice Tensorboard outputs you have in place! 😎
1. Do you have any suggestions or cod…
-
Is it possible to update the example for vqvae to also include how to use it on raw audio and/or video data?
Thank you
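As a rough illustration of what such an example could cover for raw audio, here is a generic sketch that swaps 2D convolutions for 1D ones before quantization. This is not this repository's API; the module and parameter names are my own assumptions:

```python
import torch
import torch.nn as nn

class Audio1DEncoder(nn.Module):
    """Toy 1D encoder: downsample a raw waveform before vector quantization."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, dim, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.Conv1d(dim, dim, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, wav):                # wav: (batch, samples)
        return self.net(wav.unsqueeze(1))  # (batch, dim, samples // 4)

wav = torch.randn(2, 16000)                # 1 s of 16 kHz audio (dummy data)
latents = Audio1DEncoder()(wav)            # continuous latents to pass to the quantizer
print(latents.shape)                       # torch.Size([2, 64, 4000])
```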
-
try to include acknowledgements for "indirect" contributors as well, i.e. people whose names don't appear in commits because their code was incorporated from a shared notebook or something like that.
-
### Link to the paper
[[arXiv:1711.00937] Neural Discrete Representation Learning](https://arxiv.org/abs/1711.00937)
### Authors and affiliations
Aaron van den Oord, Oriol Vinyals, Koray Kavukcuoglu
- DeepMind
##…