showlab/Show-o
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
https://arxiv.org/abs/2408.12528
Apache License 2.0
1.04k stars
44 forks
issues
About the formulation of classifier-free guidance generation
#53
PoTsui99
opened
4 days ago
2
Request for external caption files
#52
mmderakhshani
opened
6 days ago
5
Question about the loss
#51
tulvgengenr
opened
1 week ago
1
Training Loss Values.
#50
mmderakhshani
opened
2 weeks ago
10
Small bug in Showo.mmu_generate?
#49
CladernyJorn
opened
3 weeks ago
2
The code mixes the t2i, llm, and mmu datasets together for training
#48
sherlockma11
opened
4 weeks ago
8
Does the code mix the t2i, llm, and mmu datasets together for training?
#47
sherlockma11
opened
4 weeks ago
0
Can generation only use MAGVIT?
#46
sherlockma11
opened
4 weeks ago
1
The QR code has expired again
#45
guoti777
closed
4 weeks ago
4
About checkpoints to be used by finetune
#44
trmzpi02
opened
1 month ago
1
The QR code has expired again
#43
sunzhens
opened
1 month ago
5
lora
#42
Zhao-ZD
opened
1 month ago
2
FileNotFoundError
#41
Hannieliao
opened
1 month ago
2
The QR code has expired!
#40
Strike1999
opened
1 month ago
1
input_ids_mmu not appended to input_ids in train_w_clip_vit.py
#39
lzn87
closed
1 month ago
0
About multimodal sequence input
#38
tulvgengenr
opened
1 month ago
4
Import BUG
#37
Masaaki-75
closed
2 weeks ago
4
Evaluation on NLP tasks and training time
#36
KebinWu
opened
1 month ago
1
Generation inference with interleaved input
#35
ys-zong
opened
1 month ago
2
Question about SHOW-O's CLIP version
#34
hills-code
opened
2 months ago
3
Questions about generation quality
#33
xizaoqu
opened
2 months ago
1
Omni-Attention Implementation
#32
ChocoWu
opened
2 months ago
1
Does Show-o directly complete generation in pixel space?
#31
Delicious-Bitter-Melon
opened
2 months ago
3
Can you please elaborate on the experimental setups for Table 4?
#30
zhaoyanpeng
closed
2 months ago
2
Dataset Preparation Script
#29
mmderakhshani
opened
2 months ago
3
multimodal input -> image output
#28
Redtides0
closed
2 months ago
0
Does show-o support multimodal-in multimodal-out?
#27
URRealHero
opened
2 months ago
6
The WeChat QR code has expired; please update it, thanks
#26
tgyy1995
closed
2 months ago
1
Question about image generation inference
#25
william-ljz
opened
2 months ago
1
What is the meaning of the special tokens?
#24
Doctor-James
closed
2 months ago
1
ImportError: cannot import name 'SAFE_WEIGHTS_INDEX_NAME' from 'diffusers.utils'
#23
kurolykin
opened
2 months ago
3
Question about the training of MAGVIT-v2
#22
RobertLuo1
closed
2 months ago
4
No module named 'parquet.parquet_dataset'
#21
mrswang1
opened
2 months ago
2
No module named 'parquet.parquet_dataset'
#20
mrswang1
closed
2 months ago
0
Is the reason continuous features are better than discrete features that the codebook size is small?
#19
dongzhuoyao
closed
2 months ago
1
Created inference notebook
#18
Hasnat79
closed
2 months ago
1
Step by Step Inference notebook
#17
Hasnat79
closed
2 months ago
2
Local dev
#16
Hasnat79
closed
2 months ago
0
Add link to models
#15
NielsRogge
closed
3 months ago
0
update README.md
#14
Doctor-James
closed
2 months ago
0
Impact of Various Representations for Multimodal Understanding
#13
Doctor-James
closed
2 months ago
1
Comparison with Transfusion
#12
NickGao96
opened
3 months ago
1
Keyframes generation inference code
#11
qqphung
opened
3 months ago
4
runtime error
#10
junwenxiong
opened
3 months ago
1
env issue
#9
TT2TER
opened
3 months ago
6
FlexAttention example (for `mmu_vit` mask)
#8
Chillee
opened
3 months ago
4
Can Flash Attention be used?
#7
wusize
opened
3 months ago
1
How's the benchmark score on understanding?
#6
MonolithFoundation
closed
3 months ago
1
GPU
#5
yuzhongruicn
closed
3 months ago
2
Where does the diffusion come into play here?
#4
Robootx
opened
3 months ago
4