-
@csukuangfj
The code for this client is in
https://github.com/k2-fsa/sherpa/tree/master/triton/client/client.py
In my tests, if you do not use multi-process mode, that is, send data to the …
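For context, a minimal sketch of what a multi-process client test could look like: several worker processes, each with its own connection to the Triton server, sending requests concurrently. The server URL, model name, and `WAV` input tensor here are hypothetical placeholders, not taken from client.py.
```python
import multiprocessing as mp

import numpy as np
import tritonclient.http as httpclient

SERVER_URL = "localhost:8000"  # placeholder address
MODEL_NAME = "transducer"      # placeholder model name

def send_requests(num_requests: int) -> None:
    # Each process opens its own connection to the server.
    client = httpclient.InferenceServerClient(url=SERVER_URL)
    for _ in range(num_requests):
        # Dummy 1-second, 16 kHz audio payload just to exercise the server.
        audio = np.random.randn(1, 16000).astype(np.float32)
        inp = httpclient.InferInput("WAV", list(audio.shape), "FP32")
        inp.set_data_from_numpy(audio)
        client.infer(MODEL_NAME, inputs=[inp])

if __name__ == "__main__":
    # Multi-process mode: N processes sending requests concurrently,
    # rather than one process sending them sequentially.
    procs = [mp.Process(target=send_requests, args=(10,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```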
-
This is not identical to the other post with the same title.
I get this error when running any Stable Diffusion XL model, and a similar (but shorter) error when I run any 1.0 model. It throws the e…
-
# KServe: A Robust and Extensible Cloud Native Model Server
## Related Issues
* #21
## Article Source
* [KServe: A Robust and Extensible Cloud Native Model Server](https://thenewstack.io/kser…
-
**Description**
When cycling through the `load model` -> `infer` -> `unload model` scenario, we observe a GPU memory leak.
This only happens when models are in TorchScript format. There is no leak…
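For reference, a minimal sketch of the cycle described above, using the Triton Python HTTP client's explicit model-control calls. It assumes the server was started with `--model-control-mode=explicit`; the model name and the `INPUT__0` tensor (the PyTorch backend's default naming convention) are placeholders for the actual deployment.
```python
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

for i in range(100):
    # Load the TorchScript model, run one inference, then unload it.
    client.load_model("my_torchscript_model")  # placeholder name

    data = np.random.randn(1, 3, 224, 224).astype(np.float32)
    inp = httpclient.InferInput("INPUT__0", list(data.shape), "FP32")
    inp.set_data_from_numpy(data)
    client.infer("my_torchscript_model", inputs=[inp])

    client.unload_model("my_torchscript_model")
    # GPU memory usage reportedly grows across iterations of this loop.
```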
-
At the moment the rocm sdk builder stack provides the base for integrating and using many nice ML projects, but does not itself include them.
Some of the projects, like OpenAI Whisper, are…
-
```
#include <onnxruntime_cxx_api.h>
#include <string>

int main()
{
    for (size_t i = 0; i < 10; i++)
    {
        std::string modelPath = std::string("./model/model.onnx");
        Ort::Env env;
        Ort::Session session = O…
-
One very large Triton kernel cannot be loaded correctly through the L0 (Level Zero) API.
We get the error code `0x78000011` from the L0 API `zeKernelCreate`.
```
ZE_RESULT_ERROR_INVALID_KERNEL_NAME = 0x78000011, ///< [Va…
-
It seems that the OpenAI server no longer provides the pre-built LLVM package previously used on CentOS. As a result, the current CI complains with the following:
```
# Build Triton Wheel
downloading and extracting https://githu…
-
### System Info
I've converted Llama 3 using TensorRT-LLM's convert_checkpoint script, and am serving it with the inflight_batcher_llm template. I'm trying to get diverse samples for a fixed input,…
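For context, getting diverse samples from the inflight_batcher_llm template generally comes down to varying the sampling inputs per request. Below is a rough sketch; the tensor names (`text_input`, `max_tokens`, `temperature`, `top_p`, `random_seed`, `text_output`) are assumptions based on the template's ensemble config, and the model name and URL are placeholders.
```python
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

def build_inputs(prompt: str, seed: int):
    # Assumed tensor names from the inflight_batcher_llm ensemble config.
    inputs = []

    def add(name, arr, dtype):
        t = httpclient.InferInput(name, list(arr.shape), dtype)
        t.set_data_from_numpy(arr)
        inputs.append(t)

    add("text_input", np.array([[prompt.encode()]], dtype=object), "BYTES")
    add("max_tokens", np.array([[64]], dtype=np.int32), "INT32")
    add("temperature", np.array([[0.9]], dtype=np.float32), "FP32")
    add("top_p", np.array([[0.95]], dtype=np.float32), "FP32")
    # Varying the seed per request should yield different samples
    # for the same fixed input.
    add("random_seed", np.array([[seed]], dtype=np.uint64), "UINT64")
    return inputs

for seed in range(4):
    result = client.infer("ensemble", inputs=build_inputs("Hello", seed))
    print(result.as_numpy("text_output"))
```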
-
### Issue type
Bug
### Have you reproduced the bug with TensorFlow Nightly?
Yes
### Source
binary
### TensorFlow version
tf 2.16.2
### Custom code
No
### OS platform and …