Status: Closed (townwish4git closed 3 months ago)
What does this PR do?
The parameter conditioner.embedders.1.model.text_projection is applied differently in MindONE and Diffusers, so it is transposed when the pre-trained weights are converted.
DETAIL:
diffusers: pooled_text_embedding = self.text_projection(input), where self.text_projection is a torch.nn.Linear without bias, which means pooled_text_embedding = input @ self.text_projection.weight.T
MindONE (and Stability-AI): pooled_text_embedding = input @ self.text_projection.weight, with no transpose op.
So we transpose this parameter manually when converting the weights.
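The equivalence behind this change can be checked with a small NumPy sketch. It is not the conversion script from this PR: the sizes and the helper name convert_text_projection are hypothetical, and NumPy arrays stand in for the framework tensors. It only illustrates that a weight stored in the bias-free torch.nn.Linear layout (out_features, in_features) gives the same result under x @ weight once it has been transposed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration (non-square on purpose).
batch, in_dim, out_dim = 2, 4, 3
x = rng.standard_normal((batch, in_dim))

# Diffusers-style layout: a bias-free torch.nn.Linear stores its weight as
# (out_features, in_features) and computes x @ weight.T.
linear_weight = rng.standard_normal((out_dim, in_dim))
diffusers_out = x @ linear_weight.T


def convert_text_projection(weight: np.ndarray) -> np.ndarray:
    """Hypothetical helper: transpose the projection weight during conversion."""
    return np.ascontiguousarray(weight.T)


# MindONE / Stability-AI-style usage: the parameter is applied directly as
# x @ weight, so it has to be stored as (in_features, out_features).
converted_weight = convert_text_projection(linear_weight)
mindone_out = x @ converted_weight

assert converted_weight.shape == (in_dim, out_dim)
assert np.allclose(diffusers_out, mindone_out)
```

Because a transpose is its own inverse, the same step applies in either conversion direction. Note that when the projection matrix is square, a missing transpose raises no shape error and only shows up as different outputs, so a numerical comparison like the one above is the more reliable check.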
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.