-
In the [paper](https://www.biorxiv.org/content/10.1101/2024.07.01.600583v1), we can read:
> In the first stage, a smaller decoder trunk consisting of 8 Transformer blocks with width 1024, rotary pos…
-
Hi Team,
Is it possible to release the VQ-VAE model weights for the structural tokenizers before the Stage 2 training? That is, the one with a shallower decoder compared to the released one.
-
Hello,
I want to apply a Latent Diffusion Model to medical image data. How can I feed training images from a directory of .jpg files to train the diffusion model?
Plus, I don't want the mode…
-
First, thank you for sharing this great work.
I am still confused about the usage of the VAE.
In demo_sample.ipynb, the VAE is built by the `build_vae_var` function. However, during the sampling process, the VAE …
-
Hi!
Thank you for your great work. I would like to ask if your model can be used solely as the encoder and decoder of the VQ-VAE, because my model works in latent space for my project.
If so, do you h…
-
I am very interested in your excellent work, Foldseek. I want to re-train it on my own dataset based on your GitHub repository “foldseek-analysis”. I checked the code and found that it saves three fil…
-
Hi ESM3 Team,
First of all, congratulations on your outstanding research work.
I am particularly excited by the VQ-VAE structure proposed in your model.
Upon examining your code and detailed ap…
-
The single-GPU training process works just fine for me, and the output samples are satisfactory. However, when I set `--n_gpu 2` or `--n_gpu 4`, the training process gets stuck at the beginning (Ep…
-
Hi,
Can we train the autoencoder only, keeping the ddim fixed?
I want to train an autoencoder on a feature tensor of size 64x64x256 and expect to get a `z_sem` that can work with the pretrained ddim. …
-
Inpainting is easy to understand when it operates in pixel space, but it becomes harder to understand in latent space. The latent space represents the image as features and loses the positional information of pixel space, so how is the masked region located? After entering the latent space, the matrix's dimensional positions no longer correspond to pixel-space positions, do they?
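A minimal sketch of the usual intuition (my own assumption about typical latent diffusion setups, not code from any specific repo): a convolutional VAE encoder preserves the image's spatial layout at a reduced resolution (commonly a factor of 8), so each latent cell corresponds to a fixed pixel patch, and the pixel-space mask can simply be downsampled to latent resolution. The `mask_to_latent` helper below is hypothetical.

```python
# Sketch: why latent-space inpainting can still locate the masked region.
# Assumes a VAE with spatial downsampling factor f (commonly 8), so a
# latent cell at (i, j) corresponds to the f x f pixel patch at (i*f, j*f).
import numpy as np

def mask_to_latent(mask: np.ndarray, f: int = 8) -> np.ndarray:
    """Downsample a binary HxW pixel mask to (H//f, W//f) latent resolution.

    A latent cell is marked masked if any pixel in its f x f patch is masked.
    """
    H, W = mask.shape
    assert H % f == 0 and W % f == 0, "mask size must be divisible by f"
    patches = mask.reshape(H // f, f, W // f, f)
    return patches.max(axis=(1, 3))

if __name__ == "__main__":
    mask = np.zeros((64, 64), dtype=np.uint8)
    mask[16:32, 16:32] = 1            # mask a 16x16 pixel square
    latent_mask = mask_to_latent(mask, f=8)
    print(latent_mask.shape)          # (8, 8)
    print(latent_mask[2:4, 2:4])      # the corresponding 2x2 latent cells are 1
```

In other words, the latent tensor is not an unordered bag of features: position (i, j) in the latent grid still refers to a specific region of the image, which is what makes latent-space masking well-defined.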