Implementation of compression frame patch embedding (Fig. 3b)

paulchhuang commented 4 months ago

Hi, thanks for the great work. I have a few questions:

By default, which patch embedding is used: Fig. 3(a) or (b)?
Is there a config-file parameter to switch between (a) and (b)?
I'd like to look at the implementation of (b), the compression frame patch embedding. I see PatchEmbed in several places, imported from different libraries: sometimes from diffusers, sometimes from timm. Could you point me to the code where Fig. 3(b) is implemented?

maxin-cn commented 4 months ago

Latte uses Fig. 3(a) by default.
This repo does not provide an implementation of (b).
Please refer to here.
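
For anyone comparing the two options, the practical difference can be sketched as token-count arithmetic. This is only an illustration, not code from the Latte repo or from the linked reference. It assumes that (a) splits every frame into 2D patches independently, while (b) additionally groups a fixed number of consecutive frames into one spatio-temporal patch. The function names and the `temporal_stride` parameter are hypothetical.

```python
def uniform_frame_tokens(frames, height, width, patch):
    """Assumed Fig. 3(a): each frame is split into patch x patch tiles."""
    if height % patch or width % patch:
        raise ValueError("height and width must be divisible by patch")
    return frames * (height // patch) * (width // patch)


def compression_frame_tokens(frames, height, width, patch, temporal_stride):
    """Assumed Fig. 3(b): each token also spans temporal_stride frames."""
    if frames % temporal_stride:
        raise ValueError("frames must be divisible by temporal_stride")
    # Same spatial tiling as (a), but over frames // temporal_stride groups.
    grouped_frames = frames // temporal_stride
    return uniform_frame_tokens(grouped_frames, height, width, patch)


# Example: a 16-frame clip with a 32 x 32 latent and patch size 2.
print(uniform_frame_tokens(16, 32, 32, 2))         # 4096
print(compression_frame_tokens(16, 32, 32, 2, 2))  # 2048
```

Under these assumptions, (b) reduces the sequence length by a factor of `temporal_stride`, so the output side of the model would have to restore the original number of frames.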

paulchhuang commented 3 months ago

Thanks for the prompt reply and pointers.

Vchitect / Latte, issue #52