Open ecyht2 opened 2 weeks ago
Add support for the RPC backend for GGML. This would allow distributed inference across multiple GPUs over the network. It's not so important right now, as the current models are still fairly small, but it would be an interesting feature to have.
First we need partial/split offloading (but yeah, RPC would be one of the usable backends).