HL-hanlin / Ctrl-Adapter

Official implementation of Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
https://ctrl-adapter.github.io/
Apache License 2.0

Question about this method's practical application #15

Open wly-ai-bj opened 1 month ago

wly-ai-bj commented 1 month ago

I looked at a few of the examples and have a question. Taking canny as an example:

You first need the canny results for every frame, and only then can you use them as the condition to generate a new video. In practice, however, this condition is not available in advance.

Moreover, it looks like these canny results are themselves extracted from other videos. If that is the case, what is the practical value of this method in real applications?

HL-hanlin commented 1 month ago

Thanks for raising this question!

The scope in which our Ctrl-Adapter can be applied is similar to how ControlNet is used for controllable image generation. The input condition types (e.g., canny, scribbles) fall into two categories.

For control conditions that are hard for users to hand-draw directly (e.g., canny edges), the condition images/frames are usually extracted from existing images/videos. They are then combined with newly provided prompts and used as control conditions for image/video style-transfer tasks.
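To make the first category concrete, here is a minimal sketch of extracting per-frame edge conditions from an existing video. It uses a simple gradient-threshold edge detector as a stand-in for a full Canny detector (in practice one would use something like `cv2.Canny` on each frame); the frame data, function name, and threshold below are illustrative assumptions, not part of the Ctrl-Adapter codebase.

```python
import numpy as np

def extract_edge_maps(frames, threshold=0.2):
    """Per-frame edge maps via a simple gradient threshold.

    A stand-in for a real Canny detector: in the workflow described
    above, edge maps like these are extracted from an existing video
    and then fed, together with a new prompt, to the model as
    per-frame control conditions.

    frames: list of (H, W) float arrays with values in [0, 1].
    Returns: list of (H, W) uint8 arrays (0 or 255).
    """
    edge_maps = []
    for frame in frames:
        gy, gx = np.gradient(frame)        # finite-difference gradients
        magnitude = np.hypot(gx, gy)       # gradient magnitude per pixel
        edge_maps.append((magnitude > threshold).astype(np.uint8) * 255)
    return edge_maps

# Toy "existing video": 4 frames of a bright square drifting downward.
frames = []
for t in range(4):
    img = np.zeros((64, 64))
    img[16 + 4 * t : 32 + 4 * t, 16:32] = 1.0
    frames.append(img)

conditions = extract_edge_maps(frames)  # one edge map per frame
```

The key point is that the conditions come from a source video, so this path supports style transfer of existing footage rather than generation from scratch.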

On the other hand, for control conditions that can be provided by users directly without extraction from existing images/videos (e.g., user scribbles, keypoints, line art, MLSD), we can utilize them directly for controllable image/video generation.
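For the second category, the condition needs no source video at all; a user can author it directly. The sketch below builds a toy scribble-style condition sequence (a one-pixel stroke that drifts across frames) purely in code; the function name and stroke pattern are illustrative assumptions standing in for an actual hand-drawn input.

```python
import numpy as np

def scribble_frames(num_frames=4, size=64):
    """User-authored scribble conditions, built with no source video.

    Each frame holds a diagonal one-pixel stroke that shifts to the
    right over time, mimicking how a user might sketch intended
    motion directly and feed it in as a per-frame control condition.
    Returns: list of (size, size) uint8 arrays (0 or 255).
    """
    frames = []
    for t in range(num_frames):
        canvas = np.zeros((size, size), dtype=np.uint8)
        for i in range(size - 8):
            canvas[i, min(i + 2 * t, size - 1)] = 255  # draw the stroke
        frames.append(canvas)
    return frames

conditions = scribble_frames()  # ready to use directly as conditions
```

This is the path where no extraction step exists: the scribbles, keypoints, or line art are the user's input, so the "where does the condition come from" concern does not arise.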

I hope this answer helps!
