Closed by ld-ing 3 months ago
Hi @ld-ing, thank you again for writing this tutorial! I revised the notebook a bit; here are the changes I made:
- Updated dependencies -- `ribs[visualize]` now includes shapely and matplotlib
- Replaced `alive_progress` / `alive_bar` with `tqdm`
- Switched tutorial links to point to the latest versions rather than stable
- Minor grammar fixes
- Switched from `optimizer` to `scheduler`
- Added comments in the `fit_dis_embed` function for clarity
- Added comments in the training loop for clarity
Would you also mind making the following changes?
- In the section on "Train Diversity Metrics through Contrastive Learning", could you add a brief explanation of how the contrastive learning part of QDHF works? In particular, right now it is a bit difficult to understand all the code related to the DisEmbed, but I think it would all make sense with an explanation of how the DisEmbed is being used. No need to go too far into details; you can always refer readers to the paper for that.
- By the way, what does the Dis mean in DisEmbed?
- Regarding the `fit_dis_embed` function, could you add a comment explaining how the loss function works in the training loop? It seems to be a bit different from Eq. 3 in the paper.
- Also, I'm unclear how `gt_measures` works -- does the DreamSim model output some features, which are the "gt_measures", which you then use to figure out the preference? (i.e., the 2AFC mentioned in 3.2). If so, I think it would be good to explain this in `fit_dis_embed`.
- Could you mention somewhere that CLIP is used to embed the image into 512d, and QDHF learns to embed 512d into 2d?
- Should CLIP be using ViT-B/16 or ViT-B/32? The QDHF paper mentions ViT-B/16 but this tutorial uses ViT-B/32.
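To illustrate the kind of explanation I'm hoping for, here is a rough sketch of how I currently understand the pieces fitting together: a "DisEmbed" that projects 512-d CLIP features down to 2-d measures, trained with a 2AFC-style triplet objective where ground-truth features (e.g., from DreamSim, the `gt_measures`) decide which of two candidates is closer to an anchor. All names, shapes, and the margin value here are my assumptions, not the tutorial's actual code -- please correct me if this is off.

```python
import numpy as np

def project(w, feats):
    """Hypothetical linear 'DisEmbed': project 512-d CLIP features to 2-d measures."""
    return feats @ w  # w has shape (512, 2)

def triplet_2afc_loss(w, anchor, a, b, gt_anchor, gt_a, gt_b, margin=0.05):
    # 2AFC preference from ground-truth features: is a or b closer to the anchor?
    pref_a = np.linalg.norm(gt_anchor - gt_a, axis=1) < \
             np.linalg.norm(gt_anchor - gt_b, axis=1)
    # Distances in the learned 2-d measure space.
    z_anchor, z_a, z_b = project(w, anchor), project(w, a), project(w, b)
    d_a = np.linalg.norm(z_anchor - z_a, axis=1)
    d_b = np.linalg.norm(z_anchor - z_b, axis=1)
    # Hinge loss on the distance gap, oriented by the preference: penalize
    # the embedding when it disagrees with the ground-truth judgment.
    gap = np.where(pref_a, d_a - d_b, d_b - d_a)
    return np.maximum(gap + margin, 0.0).mean()

# Random features standing in for CLIP (512-d) and DreamSim (64-d assumed) outputs.
rng = np.random.default_rng(0)
w = rng.normal(size=(512, 2)) * 0.01
anchor, a, b = (rng.normal(size=(8, 512)) for _ in range(3))
gt_anchor, gt_a, gt_b = (rng.normal(size=(8, 64)) for _ in range(3))
loss = triplet_2afc_loss(w, anchor, a, b, gt_anchor, gt_a, gt_b)
```

If something like this matches what `fit_dis_embed` is doing, a few comments along these lines in the notebook would make the training loop much easier to follow.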
Overall, I think it looks great! My comments all center on `fit_dis_embed` because this is one of the key parts of the tutorial, so I am hoping to make the code as clear as possible to readers. Once you make these changes, I'll run the notebook on Colab again to get a "golden" version for the tutorials.
Hi @btjanaka, thanks for the edits and detailed comments! I have revised the code to resolve these issues. Here are the changes:
I also reorganized things a bit to make the structure clearer.
Again, thanks for these insightful suggestions, and let me know if you have other comments!
Description
Add a tutorial that uses Quality Diversity through Human Feedback (QDHF) to improve the diversity of Stable Diffusion image generations. The tutorial demonstrates how to run QD optimization without manually crafted diversity metrics. It also extends the previous DQD tutorial with an LSI pipeline built on Stable Diffusion.
TODO
Status
- yapf
- pytest
- pylint
- HISTORY.md