Open IvanFei opened 9 months ago
We take the negative prompt into account because many finetunes recommend specific negative prompts to use with them. The idea is that when the router encounters a similar prompt and negative prompt, it will route to that specific model's layer. We still have to run some ablation tests to see how much including these negative prompts affects the final SegMoE.
Thank you for the kind reply.
Why not use only the hidden states of the positive prompt? e.g. hidden_states = intermediate[key][0][0]
Here, when using classifier-free guidance, the positive and negative prompts form a single batch for inference.
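To make the question concrete, here is a minimal sketch of the two options, not SegMoE's actual code. All names (build_cfg_batch, mean_pool, router_features) are hypothetical, the hidden states are plain nested lists instead of tensors, and the negative-first batch order is an assumption; which index holds the positive prompt depends on how the pipeline concatenates the two.

```python
from statistics import fmean


def build_cfg_batch(negative, positive):
    """Stack both conditionings into one batch (assumed order: negative first)."""
    return [negative, positive]


def mean_pool(hidden):
    """Average a [tokens][dim] nested list over the token axis."""
    return [fmean(column) for column in zip(*hidden)]


def router_features(batch, use_negative=True):
    """Build a hypothetical router feature vector from a CFG batch."""
    negative, positive = batch
    pooled_positive = mean_pool(positive)
    if not use_negative:
        # Option asked about in this thread: positive hidden states only.
        return pooled_positive
    # Option described in the reply: keep the negative prompt's signal too.
    return pooled_positive + mean_pool(negative)  # list concatenation


# Toy hidden states: 2 tokens, 2 dimensions each.
negative = [[0.0, 2.0], [2.0, 4.0]]
positive = [[1.0, 1.0], [3.0, 5.0]]

batch = build_cfg_batch(negative, positive)
positive_only = router_features(batch, use_negative=False)
both = router_features(batch)
```

With the positive-only option the feature vector has half the length, so any gate weights derived from it would need to be built the same way at both construction and inference time.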
Here's another question I'd like to ask:
Hi,
ref: https://github.com/segmind/segmoe/blob/5fce95320f932aeb0991c9c0c31a3be72dbf7ce8/segmoe/main.py#L1300C13-L1300C26