ytongbai / LVM

Question about inference details #14

Open ParanoidHW opened 11 months ago

ParanoidHW commented 11 months ago

Hi Yutong, first of all, very impressive work! I have a question about the number of tokens generated in each inference step. Does LVM (a) autoregressively produce tokens one by one, like a normal LLM, and then partition the output into groups of 256 tokens, each of which is decoded into an image? Or (b) directly generate all 256 tokens in a single inference step?
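To make option (a) concrete, here is a minimal sketch of what I have in mind. It is only an illustration of the question, not code from the LVM repository: `generate_image_token_groups`, `next_token`, and `dummy_next_token` are hypothetical names, and the dummy sampler stands in for a real model forward pass.

```python
from typing import Callable, List

TOKENS_PER_IMAGE = 256  # group size mentioned in the question


def generate_image_token_groups(
    next_token: Callable[[List[int]], int],
    prompt: List[int],
    num_images: int,
) -> List[List[int]]:
    """Option (a): sample one token per step, then split the stream into 256-token groups.

    `next_token` is a hypothetical stand-in for one forward pass plus sampling.
    """
    context = list(prompt)
    generated: List[int] = []
    for _ in range(num_images * TOKENS_PER_IMAGE):
        tok = next_token(context)  # one inference step yields exactly one token
        context.append(tok)        # the new token is fed back as context
        generated.append(tok)
    # Partition the flat stream; each group would go to the image decoder.
    return [
        generated[i:i + TOKENS_PER_IMAGE]
        for i in range(0, len(generated), TOKENS_PER_IMAGE)
    ]


def dummy_next_token(context: List[int]) -> int:
    """Placeholder sampler (assumption): returns a value derived from context length."""
    return len(context) % 1000


groups = generate_image_token_groups(dummy_next_token, prompt=[1, 2, 3], num_images=2)
print(len(groups), [len(g) for g in groups])
```

Under option (b), the loop would instead run once per image and each step would return all 256 tokens at once.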