triton-server Search Results

1000+ results for triton-server

kserve/kserve #1529

gpu metrics collect

**Environment:** - Istio Version: 1.3.6 - Knative Version: 0.15.0 - KFServing Version: 0.4 I followed the [metrics installation](https://github.com/kubeflow/kfserving/tree/v0.4.0/docs/samples/…

edenbuaa updated 2 years ago (3 comments)

ELS-RD/transformer-deploy #60

Out of memory error for batch size more than 1 for T5 model…

hey, first of all, thanks for creating this amazing library! I'm following your T5 implementation with trt, https://github.com/ELS-RD/transformer-deploy/blob/b52850dce004212225edcaa7b80fccc311398…

Ki6an updated 2 years ago (15 comments)

tritonmc/Triton #132

Velocity support

Support for Velocity proxy.

leonardo-dgs updated 2 years ago (2 comments)

triton-inference-server/tensorrtllm_backend #256

How to make the server call tensorrt_llm/examples/run.py?

I've followed the instruction https://github.com/triton-inference-server/tensorrtllm_backend/blob/main/docs/baichuan.md to run Baichuan2-7b-Chat. But for exactly the same engine, the outputs are …

shil3754 updated 6 months ago (4 comments)

abseil/abseil-cpp #1769

[Bug]: CMakeLists.txt forces full rebuild on every install

### Describe the issue In its current state, the CMakeLists.txt of abseil unconditionally bypasses CMake's target longevity rules, and rewrites the file `options.h` every time CMake's consider the …

nicolasnoble updated 6 days ago (2 comments)

vllm-project/vllm #9172

[Bug]: Port binding keep failing due to unnecessary code

### Your current environment The output of `python collect_env.py` ```text PyTorch version: 2.4.0+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N…

James4Ever0 updated 1 month ago (2 comments)

k2-fsa/sherpa #409

streaming_pruned_transducer_statelessX removed

I'm currently using `sherpa/bin/streaming_pruned_transducer_statelessX/streaming_server.py` with all the underlying c++ code for modified_beam_search (`RnntConformerModel`, `StreamingModifiedBeamSearc…

kolyaflash updated 1 year ago (1 comment)

NVIDIA-Merlin/systems #207

[BUG] Robust 2-stage recommender system pipeline

### Bug description The unit test of the 2-stage recommender system pipeline is shaky due to multiple reasons: - user_id sent to triton inference server does not exist in FEAST storage - FIASS cann…

bschifferer updated 1 year ago (3 comments)

triton-inference-server/pytriton #44

Support Mac installation

It would be great to be able to install pytriton on Macs for ease-of-development. Even with the lack of CUDA support for Macs, being able to develop using only the CPU would be a real time saver. A…

zbloss updated 4 months ago (16 comments)

nvidia-riva/nemo2riva #36

Conformer CTC converted with nemo2riva 2.13.1 deployed on Ri…

I have a conformer CTC model built with the NeMo framework (https://github.com/NVIDIA/NeMo), which can be normally converted and deployed with Riva 2.11.0. However, if I convert the same NeMo file to …

itzsimpl updated 9 months ago (1 comment)
