If possible, running the entire suite from a Docker container would make it highly portable. This currently seems infeasible, however, because the individual components target disparate GPU/CPU architectures.
A key element may be Cosmopolitan, a library for building universal binaries. See llamafile for an example of this already done with llama.cpp.