Open ali-vosoughi opened 6 months ago
Hi Ali, the models are trained to work only with a fixed (maximum) number of steps, so a recipe with fewer steps is still fine: the generation is just padded with blank images. We found that most recipes in the ground-truth data had fewer steps than that maximum. Is that what you mean by freeform? I'm not sure which einops functions you're referring to, but the main method (stacking and unstacking) is implemented with einops and should work for any sequence length.
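To make the padding idea concrete, here is a minimal sketch, not the repository's actual code. It assumes a maximum of 6 steps (the number mentioned in this thread) and a `(steps, channels, height, width)` image layout. The names `MAX_STEPS`, `pad_steps`, `stack_steps`, and `unstack_steps` are hypothetical. It uses plain NumPy reshapes so it runs without extra dependencies; the comments note the equivalent einops `rearrange` patterns.

```python
import numpy as np

MAX_STEPS = 6  # assumed fixed maximum; hypothetical constant for this sketch


def pad_steps(step_images, max_steps=MAX_STEPS):
    """Pad a (num_steps, C, H, W) array with blank images up to max_steps.

    Returns the padded array and a boolean mask marking the real steps.
    """
    num_steps = step_images.shape[0]
    if num_steps > max_steps:
        raise ValueError(f"recipe has {num_steps} steps, maximum is {max_steps}")
    blank = np.zeros(
        (max_steps - num_steps, *step_images.shape[1:]),
        dtype=step_images.dtype,
    )
    padded = np.concatenate([step_images, blank], axis=0)
    mask = np.arange(max_steps) < num_steps
    return padded, mask


def stack_steps(batch):
    """Fold the step axis into the batch axis: (B, S, C, H, W) -> (B*S, C, H, W).

    einops equivalent: rearrange(batch, "b s c h w -> (b s) c h w")
    """
    b, s, c, h, w = batch.shape
    return batch.reshape(b * s, c, h, w)


def unstack_steps(flat, num_steps):
    """Inverse of stack_steps: (B*S, C, H, W) -> (B, S, C, H, W).

    einops equivalent: rearrange(flat, "(b s) c h w -> b s c h w", s=num_steps)
    """
    bs, c, h, w = flat.shape
    return flat.reshape(bs // num_steps, num_steps, c, h, w)


# A 4-step recipe padded to the 6-step maximum, then stacked and unstacked.
recipe = np.ones((4, 3, 8, 8), dtype=np.float32)
padded, mask = pad_steps(recipe)
batch = padded[None]                      # add a batch axis: (1, 6, 3, 8, 8)
flat = stack_steps(batch)                 # (6, 3, 8, 8)
restored = unstack_steps(flat, MAX_STEPS)  # (1, 6, 3, 8, 8)
```

The mask lets downstream code drop the blank images again, so a shorter recipe only costs some unused padding. Neither reshape depends on the specific value of the step count, which is why the stacking itself is not tied to a particular sequence length.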
Hi Sachit,
I noticed you've used some functions in einops.py that seem confusing and might not be suitable for all types of recipes. Ideally, recipes should consist of only 6 steps, and it appears that an input-text length of 7 is somewhat hardcoded. I see these are marked as TODOs, but it would be great if you could clarify them and advise on how we can use the code for free-form recipes.
Thanks!