Closed by monorimet 5 months ago
should we adapt to just use the turbine llm runner?
Sounds OK to me, but it will need some work to integrate with the UI. It might help with maintenance, since Turbine seems to be the favorite for new features/dev workflows; we just need an option to run with SRT and its bleeding-edge flags, etc.
Issue filed for the edge case preventing us from running the API test on a small model with externalized weights: https://github.com/openxla/iree/issues/16138
Is there a way we can specify the vmfb and safetensor path?
Would be good to add an option to set `self.vmfb_path` instead of renaming the target file. Will update in a few.
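For illustration, the path override could look roughly like this. This is a minimal sketch, not the actual runner code; the class name `LlmRunner`, the flag names `--vmfb_path` / `--external_weight_path`, and the default file name `model.vmfb` are all assumptions made up for the example.

```python
import argparse
from pathlib import Path


class LlmRunner:
    """Hypothetical runner sketch; names here are assumptions, not the real API."""

    def __init__(self, vmfb_path=None, external_weight_path=None):
        # Fall back to the default artifact name only when no override is given,
        # so callers no longer have to rename the compiled .vmfb on disk.
        self.vmfb_path = Path(vmfb_path) if vmfb_path else Path("model.vmfb")
        self.external_weight_path = (
            Path(external_weight_path) if external_weight_path else None
        )


def parse_args(argv=None):
    # Hypothetical CLI surface for a follow-up patch; flag names are illustrative.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--vmfb_path",
        default=None,
        help="Use an existing .vmfb instead of the default artifact name.",
    )
    parser.add_argument(
        "--external_weight_path",
        default=None,
        help="Path to externalized weights (e.g. a .safetensors file).",
    )
    return parser.parse_args(argv)
```

Exposing these as constructor arguments first keeps the change small; wiring them into a real CLI or UI option can then come in the follow-up mentioned below.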
Specifying the vmfb path etc. will take a genuine CLI option interface or an option in the UI, both of which are significant changes and should come in a follow-up.
@dan-garvey I would prefer not to touch the finely balanced prompt handling until SDXL is finished, and we need this patch to clear the turbine CI.
Please file an issue for prompt handling; this is how we got where we were in 1.0.