-
- It should automatically detect the best device to run on.
- We should require zero manual configuration from the user; llama.cpp, for example, requires specifying the device by default (see the sketch below).
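As a minimal sketch of what zero-configuration device selection could look like (the `pick_device` helper and the CUDA > MPS > CPU priority order are assumptions, not existing project code):

```python
import torch

def pick_device() -> torch.device:
    """Return the best available device without any user configuration (assumed priority order)."""
    if torch.cuda.is_available():           # discrete NVIDIA GPU
        return torch.device("cuda")
    if torch.backends.mps.is_available():   # Apple Silicon GPU
        return torch.device("mps")
    return torch.device("cpu")              # safe fallback

device = pick_device()
model = torch.nn.Linear(16, 16).to(device)  # caller never names a device explicitly
```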
-
### Your current environment
irrelevant
### How would you like to use vllm
What arguments would maximize overall throughput for large-batch offline inference? More specifically, I…
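For context, a minimal offline-batch sketch of the knobs usually discussed for throughput; the model name and the specific values here are illustrative assumptions, not recommended settings:

```python
from vllm import LLM, SamplingParams

# Illustrative values only; tune per model and GPU.
llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # assumed example model
    gpu_memory_utilization=0.90,   # fraction of GPU memory given to weights + KV cache
    max_num_seqs=256,              # upper bound on sequences scheduled per step
)
params = SamplingParams(temperature=0.0, max_tokens=256)

prompts = ["Summarize the following text: ..."] * 1024
outputs = llm.generate(prompts, params)  # vLLM batches and schedules internally
```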
-
We're planning on submitting a paper describing the [Y0 Causal Inference Engine](https://github.com/y0-causal-inference/y0) and have two clarifying questions:
1. Does JOSS have a notion of "senior"…
-
I've noticed that the logs currently record the sampling parameters alongside the prompt. What I really need is the ability to log a trace_id for each request. My use case involves scena…
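A generic way to get this today, as a sketch independent of any particular engine (the `trace_id_var` name and where it gets set are assumptions about the surrounding request-handling code):

```python
import contextvars
import logging

# Hypothetical per-request context variable; set it wherever a request enters the system.
trace_id_var = contextvars.ContextVar("trace_id", default="-")

class TraceIdFilter(logging.Filter):
    """Inject the current trace_id into every log record."""
    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = trace_id_var.get()
        return True

handler = logging.StreamHandler()
handler.addFilter(TraceIdFilter())
handler.setFormatter(logging.Formatter("%(asctime)s %(trace_id)s %(message)s"))
logging.getLogger().addHandler(handler)

# At request entry:
trace_id_var.set("req-1234")
logging.getLogger().warning("received prompt")  # log line now carries req-1234
```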
-
ERROR: [Torch-TensorRT] - Unsupported operator: aten::to.dtype_layout(Tensor(a) self, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, bool non_blocking=Fals…
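A common workaround when a single op such as `aten::to.dtype_layout` is unsupported is to perform the dtype/device cast eagerly, outside the region handed to the converter, so the traced graph never contains it. A minimal sketch of that restructuring (the wrapper module and shapes are made up for illustration):

```python
import torch

class Wrapped(torch.nn.Module):
    """Keep dtype/device casts outside the traced region so the converter never sees aten::to."""
    def __init__(self, inner: torch.nn.Module):
        super().__init__()
        self.inner = inner

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.inner(x)  # no .to(...) inside the compiled graph

model = Wrapped(torch.nn.Linear(8, 8)).eval().cuda()
x = torch.randn(1, 8).to(device="cuda", dtype=torch.float32)  # cast done eagerly, before tracing
scripted = torch.jit.trace(model, (x,))  # graph passed to Torch-TensorRT contains no aten::to
```

Torch-TensorRT also offers partial compilation, where unsupported ops fall back to PyTorch; whether that path applies here depends on the version and frontend in use.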
-
await self.inference_engine.infer_tensor(request_id, shard, tensor, inference_state=inference_state)
ValueError: Shapes (1,8,4,60,119) and (60,60) cannot be broadcast
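For reference, NumPy/PyTorch-style broadcasting aligns shapes from the trailing dimension, so (1, 8, 4, 60, 119) and (60, 60) clash on the last axis (119 vs. 60). A tiny reproduction with the shapes copied from the error (data is just zeros):

```python
import numpy as np

a = np.zeros((1, 8, 4, 60, 119))
b = np.zeros((60, 60))

try:
    a + b  # trailing dims 119 vs 60 differ and neither is 1 -> broadcast fails
except ValueError as e:
    print(e)

a + np.zeros((60, 119))  # ok: trailing dims match
a + np.zeros((60, 1))    # ok: size-1 axis broadcasts across 119
```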
-
Unable to successfully perform inference on a Google Pixel 4 device; the error message is as follows:
```log
17:04:57.271 Remote...onImpl W requestCursorAnchorInfo on inactive InputConnect…
```
-
### System Info
- TensorRT-LLM main branch
### Who can help?
@kaiyux
### Information
- [x] The official example scripts
- [ ] My own modified scripts
### Tasks
- [x] An officially supported ta…
-
python -m awq.entry --model_path awq_cache/llama3-8b-w4-g128.pt \
--w_bit 4 --q_group_size 128 \
--run_awq --dump_awq awq_cache/llama3-8b-w4-g128.pt
Traceback (most recent call last):
F…
-
Hi! I know that PyTorch can use MPS to accelerate inference on Apple computers, but ONNX also provides the ability to use the ANE (Apple Neural Engine).
Have you tried converting BS-Roformer or Demucs to…
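For context, a minimal sketch of running an already-exported ONNX model through ONNX Runtime's Core ML execution provider, which is the path that can schedule work onto the ANE; the model path, input name, and shape are placeholders, and whether the ANE is actually used depends on the operators in the exported graph:

```python
import numpy as np
import onnxruntime as ort

# "model.onnx" is a placeholder for an exported BS-Roformer/Demucs graph.
session = ort.InferenceSession(
    "model.onnx",
    providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],  # CPU as fallback
)

# Input name/shape are illustrative; query session.get_inputs() for the real ones.
input_name = session.get_inputs()[0].name
audio = np.random.randn(1, 2, 44100).astype(np.float32)
outputs = session.run(None, {input_name: audio})
```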