ymzlygw opened this issue 1 year ago
I think you can add `config["mid_block_type"] = "UNetMidBlock3DCrossAttn"` after https://github.com/guoyww/AnimateDiff/blob/e2590df10123c11d25c7145ac239902e89e2061c/animatediff/models/unet.py#L468 to allow you to load new models.
This issue happens because old models do not have `mid_block_type` in their model configs, so they fall back to the default value in animatediff/models/unet.py. New models, however, set `mid_block_type` to `UNetMidBlock2DCrossAttn`.
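For reference, here is a minimal sketch of what that override could look like, assuming the 2D UNet config is read from the checkpoint's `config.json` before the 3D UNet is constructed. The helper function and its name are illustrative only; the actual patch is a single assignment in the repo's UNet config-loading code.

```python
import json
import os


def load_unet_config(pretrained_model_path: str, subfolder: str = "unet") -> dict:
    """Illustrative helper: load a 2D UNet config and force the 3D mid block type."""
    config_path = os.path.join(pretrained_model_path, subfolder, "config.json")
    with open(config_path) as f:
        config = json.load(f)

    # Old SD checkpoints omit "mid_block_type", so the default in
    # animatediff/models/unet.py applies. Newer checkpoints set it to
    # "UNetMidBlock2DCrossAttn", which the 3D UNet cannot build, so
    # override it before constructing the model:
    config["mid_block_type"] = "UNetMidBlock3DCrossAttn"
    return config
```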
Hi @ymzlygw, I would like to ask about your experience using diffusers v0.18 in a project. Were you able to use this version successfully?
I am currently trying to use the project with diffusers v0.17, but I have run into several issues with mismatched functions and classes. I was wondering if you encountered similar issues and if you have any advice on how to resolve them.
I was able to get it working with the latest version of diffusers. I deleted a bunch of the LoRA code in this repository, then in VersatileDiffusion I got rid of the xformers stuff, and there are a few calls you have to change to `head_to_batch_dim` and `batch_to_head_dim`. As for the attention functions, too many variants were being created, so diffusers refactored them into attention processors. However, I couldn't figure out how to refactor AnimateDiff to use these attention processors, so I copy-pasted the old code below and added it to VersatileDiffusion:
```python
def _attention(self, query, key, value, attention_mask=None):
    if self.upcast_attention:
        query = query.float()
        key = key.float()

    attention_scores = torch.baddbmm(
        torch.empty(query.shape[0], query.shape[1], key.shape[1], dtype=query.dtype, device=query.device),
        query,
        key.transpose(-1, -2),
        beta=0,
        alpha=self.scale,
    )

    if attention_mask is not None:
        attention_scores = attention_scores + attention_mask

    if self.upcast_softmax:
        attention_scores = attention_scores.float()

    attention_probs = attention_scores.softmax(dim=-1)

    # cast back to the original dtype
    attention_probs = attention_probs.to(value.dtype)

    # compute attention output
    hidden_states = torch.bmm(attention_probs, value)

    # reshape hidden_states
    hidden_states = self.batch_to_head_dim(hidden_states)
    return hidden_states


def _sliced_attention(self, query, key, value, sequence_length, dim, attention_mask):
    batch_size_attention = query.shape[0]
    hidden_states = torch.zeros(
        (batch_size_attention, sequence_length, dim // self.heads), device=query.device, dtype=query.dtype
    )
    slice_size = self._slice_size if self._slice_size is not None else hidden_states.shape[0]

    for i in range(hidden_states.shape[0] // slice_size):
        start_idx = i * slice_size
        end_idx = (i + 1) * slice_size

        query_slice = query[start_idx:end_idx]
        key_slice = key[start_idx:end_idx]

        if self.upcast_attention:
            query_slice = query_slice.float()
            key_slice = key_slice.float()

        attn_slice = torch.baddbmm(
            torch.empty(slice_size, query.shape[1], key.shape[1], dtype=query_slice.dtype, device=query.device),
            query_slice,
            key_slice.transpose(-1, -2),
            beta=0,
            alpha=self.scale,
        )

        if attention_mask is not None:
            attn_slice = attn_slice + attention_mask[start_idx:end_idx]

        if self.upcast_softmax:
            attn_slice = attn_slice.float()

        attn_slice = attn_slice.softmax(dim=-1)

        # cast back to the original dtype
        attn_slice = attn_slice.to(value.dtype)

        attn_slice = torch.bmm(attn_slice, value[start_idx:end_idx])
        hidden_states[start_idx:end_idx] = attn_slice

    # reshape hidden_states
    hidden_states = self.batch_to_head_dim(hidden_states)
    return hidden_states
```
With this, you should have enough to get up and running.
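For anyone attempting the attention-processor route mentioned above instead of copy-pasting the old methods: in recent diffusers, a processor is just a callable that the `Attention` module passes itself to, and the old `reshape_heads_to_batch_dim` / `reshape_batch_dim_to_heads` calls became `head_to_batch_dim` / `batch_to_head_dim`. Below is a rough sketch of the plain (non-xformers) path, approximating what the stock `AttnProcessor` does as of diffusers ~0.18–0.20; group norms, residuals, and AnimateDiff's temporal handling are omitted, so treat it as a starting point rather than a drop-in replacement.

```python
import torch
from diffusers.models.attention_processor import Attention


class PlainAttnProcessor:
    """Sketch of a plain attention processor (no xformers, no slicing)."""

    def __call__(self, attn: Attention, hidden_states, encoder_hidden_states=None, attention_mask=None):
        batch_size, sequence_length, _ = hidden_states.shape
        attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size)

        query = attn.to_q(hidden_states)
        if encoder_hidden_states is None:
            encoder_hidden_states = hidden_states
        key = attn.to_k(encoder_hidden_states)
        value = attn.to_v(encoder_hidden_states)

        # formerly reshape_heads_to_batch_dim
        query = attn.head_to_batch_dim(query)
        key = attn.head_to_batch_dim(key)
        value = attn.head_to_batch_dim(value)

        # baddbmm + softmax; handles upcast_attention / upcast_softmax internally
        attention_probs = attn.get_attention_scores(query, key, attention_mask)
        hidden_states = torch.bmm(attention_probs, value)

        # formerly reshape_batch_dim_to_heads
        hidden_states = attn.batch_to_head_dim(hidden_states)

        # output projection + dropout
        hidden_states = attn.to_out[0](hidden_states)
        hidden_states = attn.to_out[1](hidden_states)
        return hidden_states


# usage (illustrative): unet.set_attn_processor(PlainAttnProcessor())
```

The AnimateDiff-specific part (the temporal / versatile attention) would still need its own processor, which is exactly the piece I couldn't work out.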
Updated to diffusers 0.20.1. Note that the code has been restructured by me and is not designed for beginners 🙂 https://github.com/ykk648/AnimateDiff
Hi, thanks for your good work! But I found that the pinned diffusers version (0.11) is too old; the latest diffusers is now 0.18. The old diffusers version causes issues when I use it: converting model.ckpt files to diffusers models and running inference produces an architecture error like: