OpenSora-PKU v1.1: training speed improvement

mindspore-lab / mindone

one for all, Optimal generator with No Exception

Apache License 2.0


OpenSora-PKU v1.1: training speed improvement #557

Closed wtomin closed 16 hours ago

wtomin commented 1 week ago

To improve training speed, this PR makes the following changes:

Skip attention_mask_compress when compress_kv_factor is 1;
Set the VAE to fp16 in train_t2v.py via the vae_precision argument;
Add vae_keep_gn_fp32, which allows nn.GroupNorm to be listed in custom_fp32_cells so that it stays in fp32 while the rest of the VAE runs in fp16. Defaults to False. The default was verified by checking inference quality, and a training experiment with this commit gives visually fine results.
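
For readers who want the idea in code, here is a minimal, framework-free sketch of the first and third changes. It is not the code from this PR: `compress_mask`, `maybe_compress_mask`, and `fp32_cell_names` are hypothetical helpers, the mask is a plain Python list rather than a tensor, and max-pooling is only an assumed stand-in for whatever attention_mask_compress actually does.

```python
from typing import List


def compress_mask(mask: List[int], factor: int) -> List[int]:
    """Hypothetical stand-in for attention_mask_compress.

    Max-pools a 1D 0/1 mask over non-overlapping windows of size `factor`.
    """
    return [max(mask[i:i + factor]) for i in range(0, len(mask), factor)]


def maybe_compress_mask(mask: List[int], compress_kv_factor: int) -> List[int]:
    # With a factor of 1 the compression would return an equivalent mask,
    # so the call (and its cost) can be skipped entirely.
    if compress_kv_factor == 1:
        return mask
    return compress_mask(mask, compress_kv_factor)


def fp32_cell_names(vae_keep_gn_fp32: bool = False) -> List[str]:
    """Illustrative only: which cell types to keep in fp32 for an fp16 VAE.

    Assumes GroupNorm is the only optional entry, mirroring the flag name.
    """
    return ["GroupNorm"] if vae_keep_gn_fp32 else []


mask = [1, 1, 0, 0, 0, 0]
same = maybe_compress_mask(mask, 1)      # skipped: the same object comes back
pooled = maybe_compress_mask(mask, 2)    # compressed: [1, 0, 0]
```

The early return is the whole optimization: when nothing would change, no work is done. The last helper shows why the flag defaults to False: the fp32 list stays empty unless GroupNorm is explicitly requested.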

Some minor changes:

Refined the printed messages.
Changed ProfilerCallback to ProfilerCallbackEpoch.