ljleb / prompt-fusion-extension

auto1111 webui extension for all sorts of prompt interpolations!
MIT License

using multiple successive interpolations hangs the sampler #20

Open ljleb opened 1 year ago

ljleb commented 1 year ago

Because we interpolate a prompt through a multidimensional tensor of embeddings built from every possible combination of control points across all interpolation expressions, using more than 10 interpolations takes a long time to schedule.
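To make the growth concrete, here is a minimal sketch that counts and enumerates the leaves of such a tensor. It is not the extension's code: the function names and the example prompt fragments are hypothetical, and it assumes the tensor holds one leaf per element of the Cartesian product of the control points.

```python
from itertools import product
from math import prod

def count_control_point_combinations(points_per_interpolation):
    """Number of leaf prompts when every combination of control points is expanded."""
    return prod(points_per_interpolation)

def enumerate_combinations(interpolations):
    """Yield one tuple of control points per leaf of the (assumed) prompt tensor."""
    return product(*interpolations)

# Hypothetical prompt with three interpolations of 2, 2 and 3 control points.
interpolations = [["cat", "dog"], ["red", "blue"], ["day", "dusk", "night"]]
leaves = list(enumerate_combinations(interpolations))
print(len(leaves))  # 12

# Ten interpolations with 2 control points each, then with 4 each.
print(count_control_point_combinations([2] * 10))  # 1024
print(count_control_point_combinations([4] * 10))  # 1048576
```

Each leaf is a full prompt that has to be parsed and embedded, so the count above is a lower bound on the scheduling work.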

I'm not sure how to fix this. One option is to pass only the control points needed for sampling to modules.prompt_parser. However, with n interpolations, this would at best bring the time complexity down to O(2^n), which doesn't solve the combinatorial issue.
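A rough sketch of where the 2^n comes from, assuming every axis is a linear interpolation so that only the two control points bracketing the current step matter on each axis. The helper names and step values are made up for illustration.

```python
from bisect import bisect_right
from itertools import product

def active_segment(steps, t):
    """Indices of the two control points bracketing step t on one axis."""
    i = bisect_right(steps, t) - 1
    i = max(0, min(i, len(steps) - 2))  # clamp to a valid segment
    return i, i + 1

def active_corners(axes_steps, t):
    """All index tuples into the prompt tensor that contribute at step t."""
    return list(product(*(active_segment(steps, t) for steps in axes_steps)))

# Hypothetical prompt: three linear interpolations with 3, 3 and 2 control points.
axes_steps = [[0, 10, 20], [0, 5, 20], [0, 20]]
corners = active_corners(axes_steps, t=7)
print(len(corners))  # 8, i.e. 2 ** 3, out of 3 * 3 * 2 = 18 leaves
```

The number of corners no longer depends on how many control points each axis has, but it still doubles with every extra interpolation.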

PladsElsker commented 1 year ago

To fix this, we HAVE to use another method to find the embedding at step t (so no regression on some sort of "nested" curved hyperplane).

One thing we can try to do is to formalize what we're doing and try to see if it translates into a simpler form.

Something encouraging to realize is that the nested planes are resolvable linearly (nested planes only need to be resolved once, and in order of leaf -> root), so the only real problem is the exponential complexity of each individual plane.

ljleb commented 1 year ago

At the moment, we are effectively using the leaves of the prompt tensor as control points for an n-dimensional Bézier surface. Maybe knowing this can help us look up and experiment with other representations of n-dimensional surfaces.
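For reference, a minimal sketch of evaluating such a surface with de Casteljau's algorithm, one axis at a time. It assumes a tensor-product Bézier surface and uses plain lists of floats as stand-ins for embeddings; `eval_surface` and the toy tensor are hypothetical, not taken from the extension.

```python
def lerp(a, b, t):
    """Linear blend of two embeddings (plain lists of floats here)."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

def de_casteljau(points, t):
    """Evaluate a Bezier curve whose control points are embeddings."""
    points = list(points)
    while len(points) > 1:
        points = [lerp(p, q, t) for p, q in zip(points, points[1:])]
    return points[0]

def eval_surface(tensor, params):
    """Evaluate a tensor-product Bezier surface.

    `tensor` is a nested list with one nesting level per interpolation;
    its innermost items are embeddings. `params` holds one t per axis.
    """
    if not params:
        return tensor  # a leaf: a single embedding
    t, rest = params[0], params[1:]
    # Resolve every sub-tensor first, then blend along the current axis.
    return de_casteljau([eval_surface(sub, rest) for sub in tensor], t)

# Two axes: 2 control points on the first, 3 on the second, 1-D embeddings.
tensor = [[[0.0], [1.0], [2.0]],
          [[10.0], [11.0], [12.0]]]
print(eval_surface(tensor, [0.5, 0.5]))  # [6.0]
```

Note that this evaluation still visits every leaf of the tensor, which is exactly the cost a different surface representation would need to avoid.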

I know we discussed this irl, I just want to leave a note about it.

ljleb commented 1 year ago

Sometimes I use interpolations that share step numbers and curve type. IIUC, all axes that have the same shape (which means they share step numbers and interpolation function) can be simplified together into 1 axis without deforming the resulting interpolation curve.

In this case, variables can also be used to rewrite the interpolations (that have the same shape) as a single interpolation. It is a corner case but optimizing it should allow more prompts to be written without hanging the runtime.

We could also go further and check:

The time complexity of the optimization is O(n), with n = number of consecutive pairs of step numbers in linear interpolations + number of consecutive groups of 4 step numbers in catmull interpolations + number of bezier interpolations in the prompt. To achieve this, we can use a map from step number spans to tensor locations, or something similar. Then, we build a new prompt tensor from this information.
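A minimal sketch of the map idea, under simplifying assumptions: each interpolation is described by a hypothetical `(steps, function)` pair, and the map is keyed on the whole shape rather than on individual step number spans. It only finds which axes could be merged; how the control points of the merged axis are built is left open.

```python
from collections import defaultdict

def group_axes_by_shape(interpolations):
    """Map each (steps, function) shape to the tensor axes that share it."""
    groups = defaultdict(list)
    for axis, (steps, function) in enumerate(interpolations):
        groups[(tuple(steps), function)].append(axis)
    return dict(groups)

# Hypothetical prompt with five interpolations, three of which share a shape.
interpolations = [
    ([0, 20], "linear"),
    ([0, 20], "linear"),
    ([0, 5, 10, 20], "catmull"),
    ([0, 20], "linear"),
    ([0, 10], "linear"),
]
groups = group_axes_by_shape(interpolations)
print(groups[((0, 20), "linear")])  # [0, 1, 3]
print(len(interpolations), "axes ->", len(groups), "axes after merging")
```

The grouping is a single pass over the interpolations, so it stays linear in the size of the prompt; the saving comes from the new prompt tensor having one axis per group instead of one per interpolation.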