The purpose of this container is to build istio-sidecar.deb. This debian package is then installed within a virtual machine which I will call the gpu service. One or more gpu service virtual machines are started on a host and then connect to https://api.computelify.com. The service macine also includes vllm to provide inference. Our other packages, such as cloud-hypervisor and virtiofsd provide the infrastructure to run the gpu services.
Once connected, the gpu service will be authz/authn to a specific namespace or namespaces. The GPU service will then provide an openai API on api.computelify.com. Its possible to add custom domains easily.
Old school models of this approach involve writing custom https and http man in the middle code which is fragile and complex. Instead, we wrap that complexity in a robust implementation of a dataplane provided by envoy and a control plane provided by Istio.
The purpose of this container is to build istio-sidecar.deb. This debian package is then installed within a virtual machine which I will call the gpu service. One or more gpu service virtual machines are started on a host and then connect to https://api.computelify.com. The service macine also includes vllm to provide inference. Our other packages, such as cloud-hypervisor and virtiofsd provide the infrastructure to run the gpu services.
Once connected, the gpu service will be authz/authn to a specific namespace or namespaces. The GPU service will then provide an openai API on api.computelify.com. Its possible to add custom domains easily.
Old school models of this approach involve writing custom https and http man in the middle code which is fragile and complex. Instead, we wrap that complexity in a robust implementation of a dataplane provided by envoy and a control plane provided by Istio.
Istio offers these benefits:
Next up, inference via vllm within a gpu service virtual machine conneced to our ingress.