showlab Show-o issues - Githubissues

showlab / Show-o

Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

https://arxiv.org/abs/2408.12528

Apache License 2.0

1.04k stars, 44 forks

issues


About the formulation of classifier-free guidance generation

#53 PoTsui99 opened 4 days ago
2
Request for external caption files

#52 mmderakhshani opened 6 days ago
5
Question about the loss

#51 tulvgengenr opened 1 week ago
1
Training Loss Values.

#50 mmderakhshani opened 2 weeks ago
10
Small bug in Showo.mmu_generate?

#49 CladernyJorn opened 3 weeks ago
2
The code mixes the t2i, llm, and mmu datasets together for training

#48 sherlockma11 opened 4 weeks ago
8
Does the code mix the t2i, llm, and mmu datasets together for training?

#47 sherlockma11 opened 4 weeks ago
0
Can generation only use magvit?

#46 sherlockma11 opened 4 weeks ago
1
The QR code has expired again

#45 guoti777 closed 4 weeks ago
4
About checkpoints to be used by finetune

#44 trmzpi02 opened 1 month ago
1
The QR code has expired again

#43 sunzhens opened 1 month ago
5
lora

#42 Zhao-ZD opened 1 month ago
2
FileNotFoundError

#41 Hannieliao opened 1 month ago
2
The QR code has expired!

#40 Strike1999 opened 1 month ago
1
input_ids_mmu not appended to input_ids in train_w_clip_vit.py

#39 lzn87 closed 1 month ago
0
About multimodal sequence input

#38 tulvgengenr opened 1 month ago
4
Import BUG

#37 Masaaki-75 closed 2 weeks ago
4
Evaluation on NLP tasks and training time

#36 KebinWu opened 1 month ago
1
Generation inference with interleaved input

#35 ys-zong opened 1 month ago
2
Question about SHOW-O's CLIP version

#34 hills-code opened 2 months ago
3
Questions about generation quality

#33 xizaoqu opened 2 months ago
1
Omni-Attention Implementation

#32 ChocoWu opened 2 months ago
1
Does Show-o directly complete generation in pixel space?

#31 Delicious-Bitter-Melon opened 2 months ago
3
Could you please elaborate on the experimental setups for Table 4?

#30 zhaoyanpeng closed 2 months ago
2
Dataset Preparation Script

#29 mmderakhshani opened 2 months ago
3
multimodal input -> image output

#28 Redtides0 closed 2 months ago
0
Does show-o support multimodal-in multimodal-out?

#27 URRealHero opened 2 months ago
6
The WeChat QR code has expired; please update it, thanks

#26 tgyy1995 closed 2 months ago
1
Image generation inference issue

#25 william-ljz opened 2 months ago
1
What is the meaning of the special tokens?

#24 Doctor-James closed 2 months ago
1
ImportError: cannot import name 'SAFE_WEIGHTS_INDEX_NAME' from 'diffusers.utils'

#23 kurolykin opened 2 months ago
3
Question about the training of MAGVIT-v2

#22 RobertLuo1 closed 2 months ago
4
No module named 'parquet.parquet_dataset'

#21 mrswang1 opened 2 months ago
2
No module named 'parquet.parquet_dataset'

#20 mrswang1 closed 2 months ago
0
Are continuous features better than discrete features because the codebook size is small?

#19 dongzhuoyao closed 2 months ago
1
Created inference notebook

#18 Hasnat79 closed 2 months ago
1
Step by Step Inference notebook

#17 Hasnat79 closed 2 months ago
2
Local dev

#16 Hasnat79 closed 2 months ago
0
Add link to models

#15 NielsRogge closed 3 months ago
0
update README.md

#14 Doctor-James closed 2 months ago
0
Impact of Various Representations for Multimodal Understanding

#13 Doctor-James closed 2 months ago
1
Comparison with Transfusion

#12 NickGao96 opened 3 months ago
1
Keyframes generation inference code

#11 qqphung opened 3 months ago
4
runtime error

#10 junwenxiong opened 3 months ago
1
env issue

#9 TT2TER opened 3 months ago
6
FlexAttention example (for `mmu_vit` mask)

#8 Chillee opened 3 months ago
4
Can Flash Attention be used?

#7 wusize opened 3 months ago
1
How's the benchmark score on understanding?

#6 MonolithFoundation closed 3 months ago
1
GPU

#5 yuzhongruicn closed 3 months ago
2
Where does the diffusion come in here?

#4 Robootx opened 3 months ago
4