-
### What happened?
When using `llama-server`, the output in the UI can't be easily selected or copied until after text generation stops. This may be because the script replaces all the DOM nodes of…
-
In a Plotly Dash dashboard, we are not able to add an RTSP stream as a video input.
127.0.0.1 - - [25/Sep/2024 05:49:56] "GET /status HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2024 05:49:56] "GET /resources HTTP/…
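Not from the original report, but a rough sketch of one way an RTSP feed can be surfaced in a Dash app, assuming OpenCV for capture; the RTSP URL and component ids below are made up. Frames are pulled server-side and pushed into an `html.Img` through a `dcc.Interval` callback:
```python
# Hypothetical sketch: show an RTSP feed in a Dash layout by polling frames.
# The RTSP URL and component ids are placeholders, not taken from the report.
import base64

import cv2
from dash import Dash, Input, Output, dcc, html, no_update

RTSP_URL = "rtsp://user:pass@camera-host:554/stream"  # placeholder URL
cap = cv2.VideoCapture(RTSP_URL)

app = Dash(__name__)
app.layout = html.Div([
    html.Img(id="frame"),                   # shows the most recent frame
    dcc.Interval(id="tick", interval=100),  # poll roughly 10 times per second
])

@app.callback(Output("frame", "src"), Input("tick", "n_intervals"))
def update_frame(_):
    ok, frame = cap.read()
    if not ok:
        return no_update                    # keep the last good frame on read errors
    _, jpeg = cv2.imencode(".jpg", frame)
    return "data:image/jpeg;base64," + base64.b64encode(jpeg.tobytes()).decode("ascii")

if __name__ == "__main__":
    app.run(debug=False)
```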
-
@dusty-nv thanks for NanoLLM for CUDA=12.6 - works well!!
However, when I invoke it with:
```
sudo jetson-containers run $(autotag nano_llm) \
python3 -m nano_llm.agents.video_query --api=…
-
Hello, when I use the test inference code for a single image, I put the code into a FastAPI server.
The model starts normally, but when it receives the POST request, the whole process crashes with Segmentation fault (core dumped…
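For reference, a minimal sketch of how single-image inference is usually placed behind a FastAPI POST endpoint, with the model loaded once at startup rather than per request. The model, preprocessing, and route below are hypothetical, not the code from the report:
```python
# Hedged sketch: single-image inference behind a FastAPI POST endpoint.
# Model, preprocessing, and route name are placeholders.
import io

import torch
from fastapi import FastAPI, File, UploadFile
from PIL import Image
from torchvision import models, transforms

app = FastAPI()

# Load the model once at import time so every request reuses the same weights.
model = models.resnet18(weights="IMAGENET1K_V1").eval()
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

@app.post("/predict")
async def predict(file: UploadFile = File(...)):
    image = Image.open(io.BytesIO(await file.read())).convert("RGB")
    batch = preprocess(image).unsqueeze(0)
    with torch.inference_mode():
        logits = model(batch)
    return {"class_id": int(logits.argmax(dim=1).item())}
```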
-
As a model developer, I want to be able to create a Docker image that is capable of running my model, so that my model can be executed independently by third parties.
### Acceptance criteria
- the co…
-
## User story
As a customer,
I want to launch an app implementing Triton Inference Server
In order to deploy my models in production with optimisation and high availability.
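As a rough illustration of the client interaction such an app would need to support, here is a sketch using the `tritonclient` HTTP API against a server on its default port; the model name, tensor names, and shapes are invented for the example:
```python
# Hypothetical sketch of a client call against a running Triton Inference Server.
# Model name, input/output tensor names, and shapes are placeholders.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build the request: one FP32 input tensor and one requested output.
data = np.random.rand(1, 3, 224, 224).astype(np.float32)
infer_input = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
infer_input.set_data_from_numpy(data)
requested = httpclient.InferRequestedOutput("OUTPUT0")

result = client.infer(model_name="my_model", inputs=[infer_input], outputs=[requested])
print(result.as_numpy("OUTPUT0").shape)
```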
## Acceptance …
-
### System Info
- Ubuntu 20.04
- NVIDIA A100
### Who can help?
@kaiyux
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] An officially supported …
-
Hi,
I am interested in implementing MLflow in my project, where I have built several speech models and NLP-based machine translation (MT) models. I am looking to incorporate continuous training and…
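For what it's worth, a minimal sketch of MLflow tracking wrapped around a training run; the experiment name, parameters, and metric values are purely illustrative:
```python
# Illustrative sketch of MLflow experiment tracking for a translation model.
# Experiment name, parameters, and metric values are made up for the example.
import mlflow

mlflow.set_experiment("mt-en-de")  # hypothetical experiment name

with mlflow.start_run():
    mlflow.log_param("model_type", "transformer")
    mlflow.log_param("learning_rate", 3e-4)
    for epoch, bleu in enumerate([18.2, 22.5, 24.1]):   # dummy metric values
        mlflow.log_metric("bleu", bleu, step=epoch)
    # mlflow.log_artifact("checkpoints/best.pt")        # attach model files as artifacts
```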
-
I tried to follow the Austria example but failed.
The error message is `RuntimeError: DataLoader worker (pid 4071827) is killed by signal: Bus error. It is possible that dataloader's workers are out …
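The message points at the workers exhausting shared memory. Below is a small sketch of the usual mitigations (a toy `TensorDataset` stands in for the real data): switch the tensor sharing strategy to the filesystem, or lower `num_workers`. When running inside Docker, enlarging `/dev/shm` with `--shm-size` is the other common fix.
```python
# Sketch of common mitigations for DataLoader workers dying from shared-memory
# exhaustion; the TensorDataset below is a stand-in for the real dataset.
import torch
import torch.multiprocessing as mp
from torch.utils.data import DataLoader, TensorDataset

# Option 1: share tensors via the filesystem instead of /dev/shm.
mp.set_sharing_strategy("file_system")

# Option 2: use fewer (or zero) worker processes so less shared memory is needed.
dataset = TensorDataset(torch.randn(64, 3, 64, 64))
loader = DataLoader(dataset, batch_size=8, num_workers=0)

for (batch,) in loader:
    pass  # training / evaluation step would go here
```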
-
### Proposal to improve performance
Hi, thank you for the great project! I would like to use vllm to run inference to test models on datasets. For example, say evaluating whether a prompt is good or…
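For context, a minimal sketch of offline batch inference with vLLM over a handful of prompts; the model name and prompts are placeholders for the real dataset:
```python
# Hedged sketch of offline batch inference with vLLM over a small prompt set.
# The model name and prompts are placeholders for whatever dataset is being tested.
from vllm import LLM, SamplingParams

prompts = [
    "Summarize: the quick brown fox jumps over the lazy dog.",
    "Translate to French: good morning.",
]
sampling = SamplingParams(temperature=0.0, max_tokens=64)

llm = LLM(model="facebook/opt-125m")  # small model chosen only for illustration
for output in llm.generate(prompts, sampling):
    print(output.prompt, "->", output.outputs[0].text)
```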