jy0205 / LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Reproduce the reconstruction results of Fig. 7 #19

Closed xizaoqu closed 2 months ago

xizaoqu commented 2 months ago

Hi, really exciting work. I have a question about the reconstruction results in Fig. 7. Do I need to append some text to the prompt to reconstruct the original image? If I only use the prompt "reconstruct it", the results are not very satisfying. (three images attached)

jy0205 commented 2 months ago

If you want to reconstruct the original image, using the decoder alone is enough. We will release the reconstruction code soon.

xizaoqu commented 2 months ago

Thanks, directly using the tokens from before the LLM makes the reconstruction better. (image attached)