Closed: geekbeast closed this 9 months ago
In discussing this today, it seemed to me we should close this PR. @shschaefer was making the point that adding the kserve option overloads the graph encoding value with two decisions: the user not only decides which model they are providing, they are now also telling the implementation which backend to use. As @shschaefer and @geekbeast considered whether a separate "choose which backend" API might be necessary for users who want that kind of control, it occurred to me that we can design that in the future; we can still close this PR now. In fact, I think @geekbeast mentioned that autodetect is a better option for what he's trying to do.
What do you all think: close this? We should still make the Wasmtime implementation less naive by giving hosts a "choose the backend" function they can plug in (on the host side, at configuration time; not user visible) to choose which backend serves each graph encoding.
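To make the host-side idea a bit more concrete, here is a rough Rust sketch of what such a hook could look like. It is not Wasmtime's or wasi-nn's actual API: `GraphEncoding` is an illustrative subset, and `BackendKind`, `BackendSelector`, `HostConfig`, and `table_selector` are hypothetical names made up for this example. The point is only that the embedder supplies the policy at configuration time and the guest never sees it.

```rust
use std::collections::HashMap;

/// Illustrative subset of graph encodings (hypothetical mirror, not the real wasi-nn type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum GraphEncoding {
    Openvino,
    Onnx,
    Tensorflow,
    Autodetect,
}

/// Hypothetical identifiers for the backends a host has compiled in.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BackendKind {
    OpenvinoLocal,
    OnnxRuntime,
    Kserve,
}

/// Host-supplied policy: given an encoding, pick a backend or decline (None).
pub type BackendSelector =
    Box<dyn Fn(GraphEncoding) -> Option<BackendKind> + Send + Sync>;

/// Hypothetical host configuration holding the selector; set once by the embedder.
pub struct HostConfig {
    selector: BackendSelector,
}

impl HostConfig {
    pub fn new(selector: BackendSelector) -> Self {
        Self { selector }
    }

    /// Resolve which backend should serve a graph with the given encoding.
    pub fn backend_for(&self, encoding: GraphEncoding) -> Option<BackendKind> {
        (self.selector)(encoding)
    }
}

/// One possible policy: a plain lookup table from encoding to backend.
/// Encodings missing from the table are declined.
pub fn table_selector(table: HashMap<GraphEncoding, BackendKind>) -> BackendSelector {
    Box::new(move |encoding| table.get(&encoding).copied())
}
```

With something like this, a host that wants remote serving could map `Onnx` and `Autodetect` to `Kserve` in its table, while the encoding value the user passes keeps a single meaning: which model format is being provided.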
cc: @mingqiusun, @shschaefer, @geekbeast, @VestigeJ
Due to the changes in bytecodealliance/wasmtime#6893, we should have an explicit enum entry for the kserve backend, because the graph encoding is now used to identify the backend and that would conflict with an autodetect feature.