triton-inference-server / onnxruntime_backend

The Triton backend for the ONNX Runtime.
BSD 3-Clause "New" or "Revised" License

Built-in support for (custom?) decryption of model weights #279

Open vadimkantorov opened 1 month ago

vadimkantorov commented 1 month ago

Sometimes it's useful to let the user decrypt the model weights prior to loading, or to provide a custom user hook for this purpose. This is useful for basic protection of models in some on-premises setups.

ORT supports something like this in:

Could this also be supported in ORT backend for Triton?

vadimkantorov commented 1 week ago

Here's a demonstration of adding decryption of the ONNX model weights at loading time:

But maybe a better way would be to let the user specify a path to a custom .so file in the Triton model config. Alternatively, the backend code could call stub I/O hooks which the user overrides with an LD_PRELOAD'ed custom implementation. These hooks could then load model weights from some S3 / custom FS path, do custom decryption, or something else.

Of course, this approach becomes more complicated if the model weights are accessed by mmap-ing the weight file.