lucidrains / titok-pytorch

Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation"
MIT License
159 stars, 3 forks

Training estimates #3

Closed kabachuha closed 2 months ago

kabachuha commented 2 months ago

Hi! If you have run any independent tests, how fast does it converge in your experience? And how much compute does it need at medium resolutions (256-512)?

kabachuha commented 2 months ago

Okay, from the other issue it seems this is not a very good direction without a preexisting VQ-VAE

Though it may still be practical 🤔