Now that diffusers#3212 is complete (it was actually implemented in transformers, but whatever), it should be easier to implement clip skip as a feature in SHARK.
However, clip skip is intended to be a "dynamic" variable. In the compilation-based SHARK, that may mean a re-compile every time the value changes, which is not ideal. If it can be done without a re-compile on every change, the implementation itself should be dead easy in the model wrapper, and would look something like this:
Note here that the clip_skip arg is fed by the user, and would likely come from args.clip_skip, where we store all the other user-defined args from the UI.
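For reference, a minimal sketch of what clip skip amounts to on the text-encoder side. It assumes a Hugging Face style CLIP text encoder that returns `hidden_states` when called with `output_hidden_states=True` and exposes `text_model.final_layer_norm`. The function name `encode_with_clip_skip` is made up for illustration, and how this would hook into SHARK's model wrapper or its compile step is not shown. The convention assumed here is that `clip_skip=0` means no skipping; some UIs count from 1 instead.

```python
def encode_with_clip_skip(text_encoder, input_ids, clip_skip=0):
    """Return prompt embeddings taken `clip_skip` layers before the last one.

    Assumption: `text_encoder` behaves like transformers' CLIPTextModel,
    i.e. its output has a `hidden_states` tuple (embeddings first, then one
    entry per encoder layer) and it exposes `text_model.final_layer_norm`.
    """
    if clip_skip < 0:
        raise ValueError("clip_skip must be >= 0")

    outputs = text_encoder(input_ids, output_hidden_states=True)

    # hidden_states[-1] is the last layer's output, [-2] the one before, etc.
    hidden = outputs.hidden_states[-(clip_skip + 1)]

    # The final layer norm is normally applied only to the last layer's
    # output, so apply it by hand to whichever layer was picked.
    return text_encoder.text_model.final_layer_norm(hidden)
```

In the UI this would presumably be called with `clip_skip=args.clip_skip`.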
If my knowledge of this pipeline & SHARK were any better, I'd do this myself and just submit a PR, but here we are.
Dynamic inputs are possible (see the Chatbot/LLM implementations), but the implementation varies among models. I think this is worth tracking.
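If a truly dynamic input turns out to be awkward for this model, one fallback would be to compile once per distinct clip skip value and cache the result, so that switching back to a value already used costs nothing. A rough sketch of that idea in plain Python follows. `compile_text_encoder` is a hypothetical placeholder and not a SHARK function; the real compile call would go in its place.

```python
from functools import lru_cache

COMPILE_CALLS = []  # records which variants were actually built


def compile_text_encoder(clip_skip):
    """Hypothetical stand-in for the expensive compile step (not a SHARK API)."""
    COMPILE_CALLS.append(clip_skip)
    # Return a dummy callable in place of a compiled module.
    return lambda prompt: f"embeds(prompt={prompt!r}, clip_skip={clip_skip})"


@lru_cache(maxsize=4)  # keep a handful of variants around at most
def get_text_encoder(clip_skip):
    """Compile on first use of a given clip_skip, reuse the result afterwards."""
    return compile_text_encoder(clip_skip)
```

The trade-off is memory and disk space for each cached variant, which is why the cache size is capped here.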
@gpetters94 do you have any thoughts on this?