FoundationVision LlamaGen issues

FoundationVision / LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

https://arxiv.org/abs/2406.06525

MIT License

1.33k stars, 55 forks

issues

About finetuning on pretrained LLM models.

#73 ZeyuLing opened 1 week ago
0
About `data_range` in ssim_loss

#72 xyfJASON opened 2 weeks ago
0
Why use such a big model (google/flan-t5-xl)?

#71 ranck626 opened 2 weeks ago
0
Setting the dataloader's batch_size

#70 yyyxcleo opened 3 weeks ago
0
Differences from AIM

#69 Jmh0527 opened 3 weeks ago
0
StyleGAN vs PatchGAN

#68 sunset-clouds opened 3 weeks ago
0
cfg-interval

#67 wangyf8848 opened 3 weeks ago
0
Embedding layer

#66 wangyf8848 closed 1 month ago
0
How to reproduce the codebook usage

#65 BaohaoLiao opened 1 month ago
0
Why is the model GPT in the code?

#64 wangyf8848 opened 1 month ago
1
The demo does not work well

#63 bigbrother001 opened 1 month ago
0
The effect of VQVAE's training data on image generation

#62 HalvesChen opened 1 month ago
0
Recommendation for decoder finetuning

#61 elias-ramzi opened 2 months ago
0
Only inference

#60 heavenhellchen opened 2 months ago
0
How can I train image tokenizers and AR models for text-conditional image generation on my own dataset? Could you provide an example? Thanks

#59 gzhuinjune opened 2 months ago
0
When I set ipdb in gpt.py, I encounter this error: torch._dynamo.exc.InternalTorchDynamoError: `example_value` needs to be a `FakeTensor`wrapped by this instance of Dynamo. Found: tensor(..., device='meta', size=(2,))

#58 BinZhu-ece opened 3 months ago
0
[Add] single GPU generation file

#57 YecanLee closed 3 months ago
0
About training losses and evaluation parameter settings

#56 MrCrims opened 3 months ago
1
About evaluation on private dataset

#55 MrCrims opened 3 months ago
0
About RoPE in the sampling process

#54 Leedonus opened 3 months ago
6
Question about being unable to reproduce FID results

#53 Ghy0501 closed 3 months ago
2
add t5 extraction instructions in Readme or Getting started for t2i training

#52 sahil02235 opened 3 months ago
0
Loss increases during training the T2I model

#51 Epiphqny closed 3 months ago
0
T2I VQVAE Training Details

#50 alexanderswerdlow opened 3 months ago
0
Mask guidance, inpainting and outpainting

#49 sahil02235 opened 4 months ago
7
Cannot Reproduce LlamaGen-B or L numbers using provided models

#48 vkramanuj closed 4 months ago
1
T2I performance on mscoco

#47 HalvesChen opened 4 months ago
1
FID results of GPT-L and GPT-1B on 256*256 images

#46 LutingWang opened 4 months ago
3
KeyError: 'optimizer'

#45 sugary199 opened 4 months ago
4
Can LlamaGen predict a [EOS] token when inferencing?

#44 luminousking opened 4 months ago
6
Test

#43 Ryankwon03 closed 4 months ago
0
Did you try class-to-image generation at an image resolution of 512x512?

#42 OliverRensu opened 4 months ago
0
Difficulty in reproducing results with pre-trained weights

#41 Rishit-dagli opened 4 months ago
1
tokenizer of 4 dim

#40 DidiD1 opened 4 months ago
0
Training Results

#39 Huage001 opened 4 months ago
4
Questions about the results of your experiment.

#38 potatowarriors opened 4 months ago
2
T2I Data

#37 HalvesChen closed 4 months ago
0
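Hello, fine-tuning vq_ds16_c2i_training.pt on private general-purpose images made the results worse, so I tried continuing to fine-tune on the ImageNet dataset, and the results also got worse. What might be going wrong?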

#36 353xiong opened 5 months ago
1
Questions about the discriminator

#35 Doctor-James closed 4 months ago
1
FID Evaluation not matching paper results for VQ-16 checkpoint

#34 vkramanuj closed 4 months ago
3
Text embedding inject

#33 daiyixiang666 closed 5 months ago
2
Inquiry about the OpenImages dataset

#32 RobertLuo1 closed 5 months ago
2
Question about why not try using the image tokenizer with a ready-made LLM (Llama 3, etc.) and LoRA?

#31 lucasjinreal opened 5 months ago
9
Question about text-conditional generation.

#30 Yangr116 closed 5 months ago
2
Hello, I trained on my own dataset using your code, but the images became very blurry

#29 gzhuinjune opened 5 months ago
9
Train script

#28 potatowarriors opened 5 months ago
1
Discriminator is not training properly?

#27 ThisisBillhe closed 5 months ago
4
Which parameters are trainable? Are the encoder and decoder in VQGAN fixed? Is the llama fixed?

#26 tanshuai0219 opened 5 months ago
1
[Feature] Inpainting script

#25 kabachuha opened 5 months ago
0
[Feature] ControlNet support via process similar to PixArt's ControlNet-Transformer

#24 kabachuha opened 5 months ago
1