Added new modules in `mindone/diffusers/models/`. They are listed below:

- `ImagePositionalEmbeddings`
- `SinusoidalPositionalEmbedding`
- `CombinedTimestepLabelEmbeddings`
- `PixArtAlphaCombinedTimestepSizeEmbeddings`
- `PatchEmbed`
- `AdaLayerNorm`
- `AdaLayerNormZero`
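Assuming they land in the same files as their upstream diffusers counterparts (`models/embeddings.py` and `models/normalization.py`; the exact paths are an assumption), the imports would look like:

```python
# Hypothetical import paths, mirroring upstream diffusers' module layout.
from mindone.diffusers.models.embeddings import (
    CombinedTimestepLabelEmbeddings,
    ImagePositionalEmbeddings,
    PatchEmbed,
    PixArtAlphaCombinedTimestepSizeEmbeddings,
    SinusoidalPositionalEmbedding,
)
from mindone.diffusers.models.normalization import AdaLayerNorm, AdaLayerNormZero
```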
These modules are strictly aligned with their torch counterparts, except for the way `self.register_buffer` is handled. For example, torch code:
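The snippet below is a representative sketch of the `pos_embed` buffer registration in diffusers' `PatchEmbed`, not necessarily the exact upstream lines:

```python
# torch: pos_embed is registered as a buffer, so it moves with the module
# (e.g. on .to()/.cuda()) and is excluded from parameters(); with
# persistent=False it is also kept out of the state dict. It is never an
# nn.Parameter and can never be made trainable.
pos_embed = get_2d_sincos_pos_embed(embed_dim, grid_size)
self.register_buffer(
    "pos_embed", torch.from_numpy(pos_embed).float().unsqueeze(0), persistent=False
)
```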
is changed to ms code:
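Sketched here assuming the port stores the buffer as a frozen `ms.Parameter` (with `import mindspore as ms`; `get_2d_sincos_pos_embed` is the same numpy helper as on the torch side):

```python
# MindSpore has no register_buffer, so pos_embed is stored as a
# non-trainable Parameter: requires_grad=False freezes it by default,
# and as a Parameter it appears in parameters_dict() for checkpoint loading.
pos_embed = get_2d_sincos_pos_embed(embed_dim, grid_size)  # numpy array
self.pos_embed = ms.Parameter(
    ms.Tensor(pos_embed[None, ...], dtype=ms.float32),
    name="pos_embed",
    requires_grad=False,
)
```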
When loading the parameter dict, `pos_embed` can be loaded perfectly. However, if the user accidentally sets all parameters (including `self.pos_embed`) to trainable, this can be a problem, because in torch `self.register_buffer("pos_embed", xxx)` is not trainable at all: it is not an `nn.Parameter`.
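A sketch of that failure mode (hypothetical user code; `net` stands for any cell containing one of these modules):

```python
# Blanket unfreezing flips pos_embed too, because it is an ms.Parameter;
# the equivalent torch buffer has no requires_grad switch to flip.
for p in net.get_parameters():
    p.requires_grad = True
```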
If we change it to `self.pos_embed = ms.Tensor(pos_embed)`, it may raise an error or warning when loading the parameter dict. Do you have a better suggestion?
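For reference, a sketch of what the plain-Tensor variant runs into, assuming a recent MindSpore where `load_param_into_net` returns both not-loaded lists:

```python
import mindspore as ms

param_dict = ms.load_checkpoint("model.ckpt")
# A plain ms.Tensor is not a Parameter, so the "pos_embed" entry in the
# checkpoint has no target in the net and lands in the not-loaded list;
# strict loading would warn or raise instead.
param_not_load, ckpt_not_load = ms.load_param_into_net(net, param_dict)
```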