Qiyuan-Ge / PaintMind

Fast and controllable text-to-image model.
Apache License 2.0
40 stars, 5 forks

vqgan #1

Closed (1090h2400 closed this issue 1 year ago)

1090h2400 commented 1 year ago

Wow, that's great! Is this project an implementation of the paper "Vector-quantized Image Modeling with Improved VQGAN"?

Qiyuan-Ge commented 1 year ago

Hi, I referred to that paper, but there are still some differences. For example, I am still using PatchGAN instead of StyleGAN (I may try StyleGAN in the future). Apart from that, I tried to improve the ViT architecture, for example by adding memory-efficient attention from xformers and replacing GELU with SwishGLU (from PaLM and LLaMA).
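
For reference, the activation swap mentioned above can be written out as follows. This is a sketch that assumes "SwishGLU" refers to the SwiGLU feed-forward variant and that bias terms are omitted. The thread does not show the actual layer code, so the weight names $W_1$, $W_2$, $W$, $V$ are illustrative only.

```latex
% Standard ViT feed-forward block with GELU
\mathrm{FFN}_{\mathrm{GELU}}(x) = \mathrm{GELU}(x W_1)\, W_2

% Gated variant: a Swish-activated projection gates a second linear projection
\mathrm{FFN}_{\mathrm{SwiGLU}}(x) = \bigl(\mathrm{Swish}(x W) \odot x V\bigr)\, W_2,
\qquad \mathrm{Swish}(z) = z \cdot \sigma(z)
```

The gated form uses three weight matrices instead of two, so the set of parameter names in such a block differs from that of a plain GELU feed-forward block.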

1090h2400 commented 1 year ago

Anyway, thanks for your work. I also hit an error when loading the weights: the keys in the weight file do not match the keys of the model. Have the model or the weights been modified?
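
For anyone hitting the same error, a quick way to see which keys disagree is to diff the two key sets. The snippet below is a minimal sketch in plain Python. The helper `diff_keys` and the key names are hypothetical and are not taken from this repository.

```python
def diff_keys(model_keys, checkpoint_keys):
    """Compare two collections of parameter names.

    Returns (missing, unexpected):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names the checkpoint contains but the model does not define
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical key names, for illustration only.
model_keys = ["encoder.proj.weight", "encoder.ffn.w1.weight", "encoder.ffn.w2.weight"]
checkpoint_keys = ["encoder.proj.weight", "encoder.mlp.fc1.weight"]

missing, unexpected = diff_keys(model_keys, checkpoint_keys)
print("missing from checkpoint:", missing)
print("unexpected in checkpoint:", unexpected)
```

With PyTorch, the two key sets would typically come from `model.state_dict().keys()` and from the dictionary returned by `torch.load(path, map_location="cpu")`. Some checkpoints nest the weights under an extra key, so that step may need adjusting. If most keys appear on both lists, the checkpoint was probably saved from a different version of the model code.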

Qiyuan-Ge commented 1 year ago

Sorry, I explained that in the README:

  • 2023/4/19 Note: Hi, I am preparing a new release with a better vitvqgan and generation model. Pretrained weights from the old version will no longer be available. I will soon release a stable version trained on a much bigger dataset.

1090h2400 commented 1 year ago

Oh, I overlooked this important information.