inference-engines Search Results

1000+ results
for inference-engines

allegroai/clearml-serving #17

ClearML serving design v2

### ClearML serving design document v2.0 **Goal: Create a simple interface to serve multiple models with scalable serving engines on top of Kubernetes** Design Diagram (edit [here](https://excalid…

bmartinn updated 2 years ago
8
janhq/jan #3690

epic: Jan's path to cortex.cpp?

## Goal - Jan should be able to seamlessly move from Nitro to cortex.cpp - What is the scope of change? - Different inference extensions? (e.g. `nitro-extension`, and `cortex-extension`?) -…

dan-homebrew updated 3 days ago
8
NVIDIA/TensorRT-LLM #802

Failing to inference multi-GPU Llama engine

**Env:** - Container: nvcr.io/nvidia/tritonserver:23.12-trtllm-python-py3 - TensorRT-LLM release: 0.7.1 - TRT-LLM backend repo tag: v0.7.1 - Model: Llama-2-70b - tritonserver deployed on 2 A10…

manarshehadeh updated 8 months ago
1
NVIDIA/TensorRT #3868

Reuse engine for multiple consequent runs

## Description Build engines for SDXL. Then init pipeline. And do several runs. At the first run I get good picture, but the second run gives all grey image. I've added controlnet and ip-adapte…

KyriaAnnwyn updated 3 months ago
7
janhq/jan #3735

architecture: Jan Mobile

## Goal - Jan has a mobile client that runs local models

dan-homebrew updated 1 week ago
1
onnx/tensorflow-onnx #2183

Support for `LayerNormalization` from ONNX opset 17

When converting a `tensorflow.keras.layers.LayerNormalization` layer to ONNX, `tf2onnx` currently decomposes layer normalizations into rather complex subgraphs with batch norms and more basic building…

pwuertz updated 1 year ago
3
opensearch-project/ml-commons #2891

[RFC] Asynchronous Offline Batch Inference and Ingestion to …

### Problem Statement Nowadays remote model servers like AWS SageMaker, BedRock, or OpenAI, Cohere, etc all support batch predict APIs, which allow users to send large amount of synchronous request…

Zhangxunmt updated 2 days ago
1
triton-inference-server/server #7374

Prebuilt Triton Server 24.05-trtllm-python-py3 does not have…

**Description** According to the Framework matrix (https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html#framework-matrix-2024), 24.05 is supposed to support TensorRT 10.0.6.1. Th…

CarterYancey updated 2 months ago
9
PaddlePaddle/Paddle #67774

Compiling paddlepaddle on Phytium 2000+, Kunlunxin R200, Kylin V10 fails with error: unsupported relocat…

### Issue Description Compiling paddlepaddle in a Phytium 2000+, Kunlunxin R200, Kylin V10 environment fails with the following error: /usr/bin/ld: /usr/lib64/libcrypto.a(sha1-armv8.o): relocation R_AARCH64_PREL64 against symbol `OPENSSL_armcap_P' which ma…

czp97 updated 1 week ago
7
sqlc-dev/sqlc #2742

feat(compiler): Improve type inference for parameters compar…

### What do you want to change? We would like to improve the type inference of parameters compared to constants only. Currently, it is absolutely inferred that `interface{}`. https://play.sqlc.…

orisano updated 1 year ago
2

