Closed: daanelson closed this 1 year ago
One thing that's coming out of this, at least IMO, is that a lot of the config we're setting really lives at the engine level; that's especially true if we have special requirements for cog.yaml per engine, etc.
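For illustration, a minimal sketch of what "config lives at the engine level" could look like: each engine class carries its own declared requirements, which a build step could then use to produce per-engine cog.yaml settings. All names here (`EngineConfig`, `VLLMEngine`, `ExLlamaEngine`) are hypothetical and not from this PR.

```python
# Hypothetical sketch: per-engine config declared on the engine class itself,
# rather than in one shared cog.yaml. Names are illustrative only.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EngineConfig:
    """Settings that might otherwise be hard-coded in a shared cog.yaml."""
    python_packages: List[str] = field(default_factory=list)
    cuda_version: Optional[str] = None


class Engine:
    """Base class: each engine subclass declares the config it needs."""
    config = EngineConfig()


class VLLMEngine(Engine):
    config = EngineConfig(python_packages=["vllm"], cuda_version="11.8")


class ExLlamaEngine(Engine):
    config = EngineConfig(python_packages=["exllama"], cuda_version="11.7")


if __name__ == "__main__":
    # e.g. a build step could read engine.config to emit a per-engine cog.yaml
    for engine in (VLLMEngine, ExLlamaEngine):
        print(engine.__name__, engine.config)
```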
Are we still going to merge this?
Replaced by #47.
WIP: Refactor of llama inference into separate engines which can serve predictions with different underlying inference code.
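Roughly, that refactor suggests a shared interface along the lines of the sketch below; the `InferenceEngine` name and the `load`/`predict` signatures are assumptions for illustration, not the PR's actual API.

```python
# Hedged sketch of an engine abstraction that lets predictions be served by
# different underlying inference code. Names and signatures are assumptions.
from abc import ABC, abstractmethod
from typing import Iterator


class InferenceEngine(ABC):
    """Shared interface so the predictor can use any backend interchangeably."""

    @abstractmethod
    def load(self, weights_path: str) -> None:
        """Load model weights for this backend."""

    @abstractmethod
    def predict(self, prompt: str, max_tokens: int = 128) -> Iterator[str]:
        """Yield generated tokens for the given prompt."""


class EchoEngine(InferenceEngine):
    """Trivial stand-in backend, just to show the interface being exercised."""

    def load(self, weights_path: str) -> None:
        self.weights_path = weights_path

    def predict(self, prompt: str, max_tokens: int = 128) -> Iterator[str]:
        for token in prompt.split()[:max_tokens]:
            yield token
```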
This has:

- engine
- modules

This does not yet have: