-
### Problem Statement
I can see why Gab was confused here: #1329
@vansangpfiev
Can we use the groupings @dan-homebrew / I originally suggested? 🙏
Current
- Not super accurate because `chat` shou…
-
# Ideological Inference Engines
[https://paulbricman.com/hypothesis-subspace/?stackedPages=%2Fideological-inference-engines](https://paulbricman.com/hypothesis-subspace/?stackedPages=%2Fideological-inference-engines)
-
Right now we support inference engines like vLLM for inference. What if people want to call OpenAI-style APIs such as ChatGPT? That would be easy to integrate.
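For illustration, here is a minimal sketch of what calling an OpenAI-compatible endpoint could look like; the base URL, environment variables, and model name are assumptions for the example, not an existing integration:

```python
# Minimal sketch: calling an OpenAI-compatible chat completions endpoint.
# BASE_URL, the API key variable, and the model name are illustrative only.
import os
import requests

BASE_URL = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
API_KEY = os.environ.get("OPENAI_API_KEY", "")

def chat_completion(messages, model="gpt-4o-mini"):
    """POST a chat request and return the assistant's reply text."""
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": model, "messages": messages},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

print(chat_completion([{"role": "user", "content": "Hello!"}]))
```

Because the wire format is the same, the same client code would work against any engine that exposes the OpenAI chat completions schema.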
-
Hello everyone, I have a problem and would like to ask for help. After I compile and run the inference code `run.py`, if I set `max_output_len` to a small value, the output will be truncated before it is …
-
- 3,000 and 1 bugs...
- Using Clang is impossible because it immediately turns up a seemingly infinite number of bugs. Which version is this supposed to work with without bugs? I ask because I'm using 18.1.3 on Ubuntu 2…
-
## Goal
- Cortex has a clear CLI and API to select active hardware (see the sketch after this list)
- Cortex can list all available hardware
- Cortex can activate specific hardware (e.g. CPU-only, or specific GPU)
- Cortex can dete…
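A hedged sketch of how such an API could be exercised from a client; the `/v1/hardware` endpoints, the port, and the payload shape below are assumptions for illustration, not a documented Cortex API:

```python
# Hypothetical client for the hardware selection API described above.
# The endpoints, port, and JSON shapes are assumptions, not Cortex's
# actual interface.
import requests

CORTEX = "http://127.0.0.1:39281"  # assumed local server address

# List all hardware the server can detect (CPUs, GPUs, ...).
print(requests.get(f"{CORTEX}/v1/hardware", timeout=10).json())

# Activate a specific subset, e.g. GPU index 0 only (CPU-only would be
# an empty list under this assumed schema).
resp = requests.post(
    f"{CORTEX}/v1/hardware/activate",
    json={"gpus": [0]},
    timeout=10,
)
print(resp.status_code, resp.json())
```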
-
## Goal
- We need a Python runtime to run TTS libraries (see the sketch after this tasklist)
- There is an ever-increasing focus on Python for inference - how do we align?
- Blocks #1247
## Tasklist
- [ ] Architecture question: how do…
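One possible shape for that runtime, sketched under the assumption of a long-lived worker process that the host spawns and talks to over stdin/stdout with JSON lines; the request fields and the TTS call itself are placeholders, not a chosen design:

```python
# Sketch of a Python TTS worker: reads one JSON request per stdin line,
# writes one JSON response per stdout line. The actual TTS call is a stub.
import json
import sys

def synthesize(text: str, out_path: str) -> str:
    # Placeholder: a real implementation would call the chosen TTS library
    # here to render `text` to audio at `out_path`.
    with open(out_path, "wb") as f:
        f.write(b"")  # stub: real audio bytes would be written here
    return out_path

def main() -> None:
    # Assumed protocol: {"text": "...", "out": "/tmp/x.wav"} -> {"ok": true, "path": "..."}
    for line in sys.stdin:
        req = json.loads(line)
        path = synthesize(req["text"], req["out"])
        print(json.dumps({"ok": True, "path": path}), flush=True)

if __name__ == "__main__":
    main()
```

A subprocess boundary like this keeps the Python dependency isolated from the host process, which is one way to answer the architecture question above.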
-
Problem
Currently, it is overly complicated to handle SSE output from custom remote inference engines that do not follow the OpenAI spec.
Success Criteria
Ability to transform the response similar to …
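For illustration, a sketch of the kind of transform this asks for, assuming a hypothetical engine that streams `{"token": ..., "done": ...}` payloads; those input field names are invented for the example, while the output follows the real OpenAI `chat.completion.chunk` shape:

```python
# Sketch: map one SSE `data:` payload from a custom engine into an
# OpenAI-spec streaming chunk. Input field names are assumptions.
import json
import time

def to_openai_chunk(raw_event: str, model: str = "custom-engine") -> str:
    """Convert one custom SSE data payload into an OpenAI-style chunk."""
    payload = json.loads(raw_event)
    chunk = {
        "id": "chatcmpl-transformed",
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
        "choices": [{
            "index": 0,
            "delta": {"content": payload.get("token", "")},
            "finish_reason": "stop" if payload.get("done") else None,
        }],
    }
    return f"data: {json.dumps(chunk)}\n\n"

print(to_openai_chunk('{"token": "Hello", "done": false}'))
```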
-
/kind feature
**Describe the solution you'd like**
Hoping to add [https://github.com/xorbitsai/inference](https://github.com/xorbitsai/inference) as a KServe Hugging Face LLM serving runtime.
Xor…
-
Trying to run offline retinanet in a container with one Nvidia GPU:
`cm run script --tags=run-mlperf,inference,_find-performance,_full,_r4.1-dev --model=retinanet --implementation=nvidia` …