-
**Describe the bug**
Inference hangs when using an A770
**Logs**
Server logs:
```
[2024-02-23 15:54:44.147][2184239][serving][error][modelinstance.cpp:1193] Async caught an exception Internal infer…
```
-
ERROR:
```
λ localhost /work/Serving/build-server-npu {v0.9.0} make TARGET=ARMV8 -j16
[  3%] Built target extern_gflags
[  9%] Built target extern_snappy
[  9%] Built target extern_zlib
[ 13%] Perfo…
```
-
Hello.
I am writing to inquire about the PyTorch version used in the Triton Inference Server 24.01 release.
Upon reviewing the documentation, I noticed that Triton 24.01 includes PyTorch version…
-
**Describe the feature you'd like**
Let's say I have a custom model hosted on some remote server with an inference API endpoint. The inference API endpoint takes an input in a particular JSON form…
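The exact JSON shape the remote endpoint expects is not specified in the issue, but the idea can be sketched as a thin client wrapper. Everything here is hypothetical: `build_payload`, the `"instances"` key, and `remote_infer` are illustrative names, not part of any existing API.

```python
import json
from urllib import request


def build_payload(inputs):
    """Wrap raw inputs in the JSON structure a remote inference
    API might expect (hypothetical shape)."""
    return {"instances": [{"data": x} for x in inputs]}


def remote_infer(endpoint_url, inputs, timeout=10):
    """POST the JSON payload to the remote inference endpoint
    and return the parsed JSON response."""
    body = json.dumps(build_payload(inputs)).encode("utf-8")
    req = request.Request(
        endpoint_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())
```

A serving framework supporting this feature would essentially let the user plug in `build_payload` (request mapping) and a response mapping, then proxy calls to the configured endpoint.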
-
Hello,
I have trained a model in mmsegmentation (PointRend).
I can run inference with this model using JIT inference, but when I send an inference request to the Triton Inference Server, I get an error.
…
-
I got it to start on Windows and detect other devices; however, the Windows PC itself is not shown on other devices running exo. It gets detected as nothing (`[]`, according to the debug output). On the windows…
-
**Describe the bug**
When running as a non-root user within a container, sagemaker-inference fails to start the multi-model-server. This works when all packages are installed as root, and the entry…
-
It would be nice if we could configure the base URL; then people could use offline models via [ollama](https://ollama.com/) or similar tools.
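A minimal sketch of what such a setting could look like, assuming the usual precedence of explicit argument over environment variable over default. The names `DEFAULT_BASE_URL`, `resolve_base_url`, and the `LLM_BASE_URL` variable are illustrative; ollama does expose an OpenAI-compatible endpoint at `http://localhost:11434/v1`, which is why overriding the base URL alone is enough to switch to a local model.

```python
import os

# Illustrative default; a real integration would use whatever
# hosted endpoint the library currently hardcodes.
DEFAULT_BASE_URL = "https://api.openai.com/v1"


def resolve_base_url(cli_value=None, env_var="LLM_BASE_URL"):
    """Pick the base URL: explicit argument > environment variable > default."""
    return cli_value or os.environ.get(env_var) or DEFAULT_BASE_URL
```

With this in place, pointing the client at `http://localhost:11434/v1` would route all requests to a local ollama instance with no other code changes.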
-
Since Jetson supports Triton Inference Server, I am considering adopting it.
So I have a few questions:
1. In an environment where multiple AI models are run in Jetson, is there any advantage to …
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a…