lucapinello closed 1 year ago
@sg134 can you write here where you are with this and how people can contribute/help you?
@lucapinello Would love to work on this, however I don't understand why we need to work with VQ-VAE. Shouldn't we prototype directly with DDPMs?
The idea is to derive a good embedding for DNA sequences so we can later explore stable diffusion. Right now we are diffusing directly on the one-hot encoding of the DNA sequences.
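To make "diffusing directly on the one-hot encoding" concrete, here is a minimal sketch of such an encoding. The function name `one_hot_encode`, the A/C/G/T channel order, and the all-zero handling of ambiguous bases are assumptions for illustration, not taken from the project code.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed channel order; the project may use a different one


def one_hot_encode(seq: str) -> np.ndarray:
    """Encode a DNA string as a (4, len(seq)) float32 array.

    Bases outside ALPHABET (for example N) become all-zero columns.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = np.zeros((len(ALPHABET), len(seq)), dtype=np.float32)
    for pos, base in enumerate(seq.upper()):
        row = index.get(base)
        if row is not None:
            encoded[row, pos] = 1.0
    return encoded


if __name__ == "__main__":
    print(one_hot_encode("ACGTN"))
```

A 200 bp sequence becomes a 4 x 200 array under this scheme, which is the kind of input a latent model would aim to compress.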
Hi, just saw this (sorry). As Luca mentioned, we hope to represent the DNA sequences in a smaller latent space and pursue latent diffusion. To that end, if you have any other model suggestions for encoding the sequences into a representation (another VAE variant, for example), feel free to suggest and implement them -- we don't yet know whether VQ-VAE is the best model for this dataset. I started with this model because it was used in the DALL-E paper. Currently some of the next steps planned for the VQ-VAE:
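For context on the model discussed above, the step that distinguishes a VQ-VAE from a plain autoencoder is the codebook lookup: each encoder output vector is replaced by its nearest codebook entry. The sketch below shows only that lookup in NumPy. It is not the code from the notebook, and the name `quantize` and the array shapes are assumptions for illustration; training details such as the commitment loss and the straight-through gradient are left out.

```python
import numpy as np


def quantize(latents: np.ndarray, codebook: np.ndarray):
    """Map each latent vector to its nearest codebook entry.

    latents:  (n, d) encoder outputs
    codebook: (k, d) embedding vectors
    Returns (indices, quantized) with shapes (n,) and (n, d).
    """
    # Squared Euclidean distance between every latent and every code: (n, k)
    diffs = latents[:, None, :] - codebook[None, :, :]
    dists = (diffs ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)
    return indices, codebook[indices]


if __name__ == "__main__":
    codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
    latents = np.array([[0.1, -0.1], [0.9, 1.2]])
    idx, quantized = quantize(latents, codebook)
    print(idx)
    print(quantized)
```

The resulting discrete indices are what a latent diffusion or autoregressive prior would then model in place of the raw one-hot input.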
@lucapinello @LucasSilvaFerreira Is there anything else to include or clarify?
Gotcha. I'd like to work on this issue. Can you assign it to me?
@mihirneal and @sg134 I would recommend that you guys create a subgroup to explore it together. @sg134 already has some code, and it would be nice if he can guide you through it. I think it will be nice to have a (latent) stable diffusion model working on these sequences.
@sg134 let me know if I can help with the VQ-VAE
Hi @mihirneal & @mateibejan1, is it possible that we can allocate a few minutes during the sprint meeting to discuss the VQ-VAE code and next steps for others who are interested as well?
Yeah, that’s what I had in mind as well.
Sure, I'll put together a meeting plan. We'll start with a retrospective on what was done in sprint 1, then talk about current tasks, and finally what we'll do next. Does that work for your schedule, @sg134?
@sg134 please contact me when you see this.
noahweber53@gmail.com
thanks
@noahweber1 messaged you on Discord.
Summary of what we agreed upon and the next steps:
This issue is stale because it has been open for 60 days with no activity.
This issue was closed because it has been inactive for 7 days since being marked as stale.
Current notebook here: https://github.com/pinellolab/DNA-Diffusion/blob/latent-space-representation/vq_vae_diffusion.ipynb