-
The basic idea is to create something low-level, like LLVM Bitcode or WebAssembly, so that HDL compilers emit code in this format, which is then fed to the routers/synthesizers. This will …
-
I want to run `benchmarks/gptManagerBenchmark`; it seems a file is used to generate the input_ids.
Is there an example of this?
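The excerpt does not show the file itself, but here is a hypothetical sketch of producing one, assuming a JSON list of records each holding `input_ids` and an output length; the field names and layout are assumptions, and the actual schema is defined by the dataset-preparation script that ships in the repo's benchmarks directory, which should be treated as authoritative:

```python
import json

from transformers import AutoTokenizer

# Tokenize a few prompts into token-id lists. The record schema below
# ("input_ids", "output_len") is a hypothetical placeholder, not the
# confirmed gptManagerBenchmark format.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
prompts = [
    "Summarize the plot of Hamlet.",
    "Explain KV caching in one sentence.",
]

records = [
    {"input_ids": tokenizer.encode(p), "output_len": 128}
    for p in prompts
]

with open("benchmark_input.json", "w") as f:
    json.dump(records, f)
```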
-
I am testing the trtllm backend v0.6.0 for llama2-7b with the setup below. The code snippet is as follows. If I send one request with a length of about 1000, it takes about 2-3 seconds to finish. And if I send…
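For reference, a minimal sketch of timing a single request against Triton's HTTP `generate` endpoint (available since Triton 23.10); the model name `ensemble` and the `text_input`/`max_tokens` fields assume the default trtllm ensemble configuration, and some ensembles require additional fields such as stop/bad words:

```python
import time

import requests

payload = {
    "text_input": "Once upon a time",  # prompt text
    "max_tokens": 1000,                # generation length under test
}

start = time.perf_counter()
resp = requests.post(
    "http://localhost:8000/v2/models/ensemble/generate",
    json=payload,
)
elapsed = time.perf_counter() - start
print(f"status={resp.status_code} latency={elapsed:.2f}s")
```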
-
If the client closes the connection while the server is still generating, the server crashes with a segmentation fault. 100% reproducible.
-
I see your Node.js binding uses Neon. But have you considered WebAssembly? There are tools that make compiling Rust code to it easy, so you would get browser compatibility and Node v13 support with a low impact on …
-
How joins are currently handled:
We refer to a Parent Triples Map, but in fact we use the Subject Map of that Triples Map.
```
rml:logicalSource ;
rml:subjectMap ;
rml:predica…
```
-
### Summary
The `media_type` is not inferred correctly when rendering a Jinja template with a file extension other than `.html`, such as `.jinja`, `.jinja2`, or `.j2`.
So instead of passing a `media_ty…
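Until the inference is fixed, a minimal sketch of the usual workaround is to pass the media type explicitly; this assumes a Starlette-style `Jinja2Templates` setup, since the excerpt does not name the framework:

```python
from starlette.applications import Starlette
from starlette.routing import Route
from starlette.templating import Jinja2Templates

templates = Jinja2Templates(directory="templates")

async def homepage(request):
    # The ".jinja" extension would otherwise be guessed as a non-HTML
    # media type, so pass media_type explicitly.
    return templates.TemplateResponse(
        "index.jinja", {"request": request}, media_type="text/html"
    )

app = Starlette(routes=[Route("/", homepage)])
```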
-
**Description**
I used Triton Inference Server with the trt-llm backend to deploy Baichuan2, but got errors when sending requests.
**Triton Information**
23.10-trtllm-python-py3
Are you using the …
-
I need batch inference, so I set different max_batch_size values, like 1, 64, and 128. I then found that GPU memory use during the inference phase was 27 GB, 49 GB, and 72 GB respectively, so I need at least 72 GB of GPU memory to inference wh…
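This scaling is expected: with in-flight batching, the KV cache is typically pre-allocated for `max_batch_size` sequences, so memory grows roughly linearly with it on top of the fixed weight footprint. A back-of-the-envelope sketch, assuming llama2-7b-like dimensions and fp16; the actual allocation depends on the configured maximum token counts and paging settings, so these numbers are an upper-bound estimate, not the observed values:

```python
# Rough KV-cache size: 2 (K and V) * layers * hidden * bytes per value
# gives bytes per token per sequence; multiply by tokens per sequence
# and by batch size.
num_layers = 32       # llama2-7b
hidden_size = 4096    # num_heads * head_dim = 32 * 128
bytes_per_val = 2     # fp16
max_seq_len = 2048    # assumed maximum sequence length

per_token = 2 * num_layers * hidden_size * bytes_per_val  # ~0.5 MiB
per_seq = per_token * max_seq_len                         # ~1 GiB

for batch in (1, 64, 128):
    print(f"batch={batch}: ~{per_seq * batch / 2**30:.0f} GiB of KV cache")
```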
-
## Tasks
- [x] Eng Spec @dan-jan
- [ ] Local Models
- [ ] Remote Models (KIV)
- [x] janhq/internal#66
- Functionality
- [x] Each Recommended Model should have default values defin…