FoundationVision/LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
https://arxiv.org/abs/2406.06525
MIT License
1.33k stars, 55 forks
issues
About finetuning on pretrained LLM models.
#73
ZeyuLing
opened
1 week ago
0
About `data_range` in ssim_loss
#72
xyfJASON
opened
2 weeks ago
0
Why use such a big model (google/flan-t5-xl)?
#71
ranck626
opened
2 weeks ago
0
Setting the dataloader's batch_size
#70
yyyxcleo
opened
3 weeks ago
0
Differences from AIM
#69
Jmh0527
opened
3 weeks ago
0
StyleGAN vs PatchGAN
#68
sunset-clouds
opened
3 weeks ago
0
cfg-interval
#67
wangyf8848
opened
3 weeks ago
0
Embedding layer
#66
wangyf8848
closed
1 month ago
0
How to reproduce the codebook usage
#65
BaohaoLiao
opened
1 month ago
0
Why is the model GPT in the code?
#64
wangyf8848
opened
1 month ago
1
The demo does not work well
#63
bigbrother001
opened
1 month ago
0
The effect of VQVAE's training data on image generation
#62
HalvesChen
opened
1 month ago
0
Recommendation for decoder finetuning
#61
elias-ramzi
opened
2 months ago
0
Only inference
#60
heavenhellchen
opened
2 months ago
0
How can I train image tokenizers and AR models for text-conditional image generation on my own dataset? Could you provide an example? Thanks
#59
gzhuinjune
opened
2 months ago
0
When I set ipdb in gpt.py, I encounter this error: torch._dynamo.exc.InternalTorchDynamoError: `example_value` needs to be a `FakeTensor`wrapped by this instance of Dynamo. Found: tensor(..., device='meta', size=(2,))
#58
BinZhu-ece
opened
3 months ago
0
[Add] single GPU generation file
#57
YecanLee
closed
3 months ago
0
About training losses and evaluation parameter settings
#56
MrCrims
opened
3 months ago
1
About evaluation on private dataset
#55
MrCrims
opened
3 months ago
0
About RoPE in the sampling process
#54
Leedonus
opened
3 months ago
6
Question about being unable to reproduce FID results
#53
Ghy0501
closed
3 months ago
2
Add T5 extraction instructions to README or Getting Started for T2I training
#52
sahil02235
opened
3 months ago
0
Loss increases during training the T2I model
#51
Epiphqny
closed
3 months ago
0
T2I VQVAE Training Details
#50
alexanderswerdlow
opened
3 months ago
0
Mask guidance, inpainting and outpainting
#49
sahil02235
opened
4 months ago
7
Cannot Reproduce LlamaGen-B or L numbers using provided models
#48
vkramanuj
closed
4 months ago
1
T2I performance on mscoco
#47
HalvesChen
opened
4 months ago
1
FID results of GPT-L and GPT-1B on 256*256 images
#46
LutingWang
opened
4 months ago
3
KeyError: 'optimizer'
#45
sugary199
opened
4 months ago
4
Can LlamaGen predict a [EOS] token when inferencing?
#44
luminousking
opened
4 months ago
6
Test
#43
Ryankwon03
closed
4 months ago
0
Have you tried class-to-image generation at an image resolution of 512x512?
#42
OliverRensu
opened
4 months ago
0
Difficulty in reproducing results with pre-trained weights
#41
Rishit-dagli
opened
4 months ago
1
tokenizer of 4 dim
#40
DidiD1
opened
4 months ago
0
Training Results
#39
Huage001
opened
4 months ago
4
Questions about the results of your experiment.
#38
potatowarriors
opened
4 months ago
2
T2I Data
#37
HalvesChen
closed
4 months ago
0
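Hello, finetuning vq_ds16_c2i_training.pt on private general-purpose images made the results worse, so I tried continuing to finetune on the ImageNet dataset and the results got worse as well. What went wrong?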
#36
353xiong
opened
5 months ago
1
Questions about the discriminator
#35
Doctor-James
closed
4 months ago
1
FID Evaluation not matching paper results for VQ-16 checkpoint
#34
vkramanuj
closed
4 months ago
3
Text embedding inject
#33
daiyixiang666
closed
5 months ago
2
Inquiry about the OpenImages dataset
#32
RobertLuo1
closed
5 months ago
2
Question: why not try using an image tokenizer with a ready-made LLM such as Llama 3, with LoRA?
#31
lucasjinreal
opened
5 months ago
9
Question about text-conditional generation.
#30
Yangr116
closed
5 months ago
2
Hello, I trained on my own dataset using your code, but the images became very blurry
#29
gzhuinjune
opened
5 months ago
9
Train script
#28
potatowarriors
opened
5 months ago
1
Discriminator is not training properly?
#27
ThisisBillhe
closed
5 months ago
4
Which parameters are trainable? Are the encoder and decoder in VQGAN fixed? Is the llama fixed?
#26
tanshuai0219
opened
5 months ago
1
[Feature] Inpainting script
#25
kabachuha
opened
5 months ago
0
[Feature] ControlNet support via process similar to PixArt's ControlNet-Transformer
#24
kabachuha
opened
5 months ago
1