-
Hi @haileyschoelkopf, thank you for your awesome open-source work. We have been evaluating models with `lm-eval` and noticed that, when using `accelerate` for data-parallel inference, the number of GPUs utili…
-
This issue will serve as a tracking mechanism for all the work being done to get FMS models working with torch.export().
The initial list of work we have identified is the following:
- We need a…
-
# ❓ Questions & Help
## Details
Hi, I have been experimenting with an existing TF2 model using the merlin-tensorflow image. This has allowed me to leverage the SOK toolkit for the SparseEmbeddin…
-
### Describe the issue
In a scenario where multiple GPU devices are available, selecting the TensorrtExecutionProvider with `device_id = 0` makes the model infer perfectly. However, when usi…
dat58 updated
2 months ago
-
Hi,
I am trying to run the example script provided for the Llama model, for inference only. Since the repository is going through a migration and a lot of changes, I went back and installed the stable `v0.2…
-
### System Info
infinity 0.0.53
OS version: linux
Model being used: dunzhang/stella_en_1.5B_v5
Hardware used: NVIDIA A100
### Information
- [ ] Docker
- [X] The CLI directly via pip
##…
-
### Describe the issue
We have a Flask-based API for running computer vision models (YOLO and classifiers) using ONNX Runtime. The models, originally trained in PyTorch, are converted to ONNX forma…
-
Hi,
Is it possible to run PyTorch model inference (as a server) with Go as well? Are there any projects you know of?
Thanks
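There is no official Go runtime, but one common route (an assumption here, not an official recommendation) is to save the model as TorchScript and load it from libtorch through community Go bindings; the Python export side looks like:

```python
import torch

# script a toy model; the saved archive is loadable from C++ (and Go wrappers around libtorch)
model = torch.jit.script(torch.nn.Linear(4, 2).eval())
model.save("model_scripted.pt")

# sanity check: reload and run in-process
reloaded = torch.jit.load("model_scripted.pt")
print(reloaded(torch.ones(1, 4)).shape)  # torch.Size([1, 2])
```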
-
### 🐛 Describe the bug
Hello,
I would like to ask for your help.
I am using KServe and would like to deploy a PyTorch model with it.
My problem is that I am getting "models missing" error messages…
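For context, a minimal KServe `InferenceService` manifest for a PyTorch model looks roughly like the following (the name and storage URI are placeholders; the storage location must contain a TorchServe-style layout with `config/` and `model-store/` directories, and a mismatch there is a common cause of model-not-found errors):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: torch-demo                            # placeholder name
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch
      storageUri: gs://my-models/torchserve   # placeholder; expects config/ and model-store/ inside
```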
-
Since the coqui docs recommend the use of `deepspeed` to speed up their XTTS model I wanted to give this a try.
To make it work I did the following:
- I had to rebuild pytorch with `USE_NCCL=1` be…
hslr4 updated
2 months ago
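After such a rebuild, a quick sanity check is to confirm the installed torch actually has NCCL compiled in before involving `deepspeed`:

```python
import torch
import torch.distributed as dist

if dist.is_available():
    # True only if this torch build was compiled with NCCL (USE_NCCL=1)
    print("NCCL available:", dist.is_nccl_available())
else:
    print("torch.distributed is not available in this build")
```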