-
Tried TensorFlow and Torch with tinygrad;
still getting this error with Llama 3.1 8B and Llama 8B as well.
Apparently this is an OpenCL compile error for the bfloat16 data type.
Sorry, I am not a kernel …
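For anyone else hitting this, here is a minimal sketch (mine, not from the issue) that isolates the bfloat16 compile step in tinygrad on the OpenCL backend; the `"GPU"` device name and the float16 cast workaround are assumptions:
```python
# Minimal repro sketch: build a bfloat16 tensor on the OpenCL ("GPU") device
# and force a kernel compile via realize(). The cast below is a guessed
# workaround, not a confirmed fix.
from tinygrad import Tensor, dtypes

x = Tensor([1.0, 2.0, 3.0], dtype=dtypes.bfloat16, device="GPU")
try:
    (x + 1).realize()  # triggers OpenCL codegen for bfloat16
except Exception as e:
    print("bfloat16 kernel failed to compile:", e)
    # possible workaround: keep the tensor in float16 on OpenCL
    y = (x.cast(dtypes.float16) + 1).realize()
    print(y.numpy())
```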
-
### Is there an existing integration?
- [x] I have searched the existing integrations.
### Use Case
This feature would allow users to seamlessly integrate Modal's infrastructure for both inference …
-
This only happens with `BEAM=1`; `BEAM=0`, `BEAM=2`, and `BEAM=3` all work fine.
This happens because exo runs tinygrad inference on another thread.
Example command to reproduce: `DEBUG=6 BEAM=1 python3 …
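For reference, a minimal sketch of the pattern that seems to trip `BEAM=1` (this is my guess at the shape of exo's code path, not its actual code): realizing a tinygrad tensor from a worker thread with beam search enabled.
```python
# Sketch of the suspected failure mode: BEAM=1 kernel search running on a
# non-main thread, similar to exo's inference thread. The workload below is
# arbitrary; only the threading pattern matters.
import os
import threading

os.environ["BEAM"] = "1"  # set before tinygrad is imported

from tinygrad import Tensor

def worker():
    # realize on a secondary thread, like exo does for inference
    out = (Tensor.rand(256, 256) @ Tensor.rand(256, 256)).realize()
    print("ok", out.shape)

t = threading.Thread(target=worker)
t.start()
t.join()
```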
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### YOLOv8 Component
_No response_
### Bug
…
-
### Your current environment
```text
The output of `python collect_env.py`
```
### How would you like to use vllm
I am currently running Qwen2.5-72B-Instruct on a DGX PCIe server with vLLM as t…
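For context, a minimal sketch of how I'm invoking it offline; `tensor_parallel_size=8` and the dtype are assumptions about the DGX box, not details from the original question:
```python
# Offline inference sketch for a 72B model sharded across one node's GPUs.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-72B-Instruct",
    tensor_parallel_size=8,   # assumes 8 GPUs on the DGX node
    dtype="bfloat16",
)
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain tensor parallelism in one paragraph."], params)
print(outputs[0].outputs[0].text)
```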
-
### System Info
GPU: Nvidia H100
Model: Llama3 8B
### Who can help?
@kaiyux
### Information
- [x] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] An officially suppo…
-
Hi,
I have a multi-node setup with multiple GPUs. I was able to get the cluster up, but I don't see the remaining GPUs from each node. How do I do that? I also observed the error below while using llama…
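If the cluster is Ray-based (an assumption on my part, e.g. vLLM's multi-node backend), a quick sanity check is whether Ray itself sees the other nodes' GPUs:
```python
# Hypothetical check, assuming a Ray cluster: cluster_resources() should report
# the GPU count across all nodes, not just the head node.
import ray

ray.init(address="auto")  # attach to the existing cluster
resources = ray.cluster_resources()
print("GPUs visible to the cluster:", resources.get("GPU", 0))
for node in ray.nodes():
    print(node["NodeManagerAddress"], node["Resources"].get("GPU", 0))
```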
-
I am instantiating an `LLM` class for local inference. I noticed that when an OOM error happens in `vllm.LLM.llm_engine.step()` and I catch it, previous requests are not aborted and would mess up with…
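A sketch of the workaround I have in mind; `abort_request()` and `has_unfinished_requests()` on the engine are assumptions that may differ across vLLM versions, and the model name is just a placeholder:
```python
# On OOM, explicitly abort everything still in flight before reusing the engine;
# otherwise stale requests linger in the scheduler and affect later runs.
import torch
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # placeholder model
engine = llm.llm_engine
engine.add_request("req-0", "Hello, world", SamplingParams(max_tokens=64))

in_flight = ["req-0"]
try:
    while engine.has_unfinished_requests():
        engine.step()
except torch.cuda.OutOfMemoryError:
    for rid in in_flight:
        engine.abort_request(rid)  # assumed engine API; version-dependent
    torch.cuda.empty_cache()
```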
-
I built the engine and had two separate LoRA layers with the base llama3.1 model. The output from the build is rank0.engine, config.json, and then a lora folder with the following structure:
lora
|
|…
-
Since [ApproxBayes.jl](https://github.com/marcjwilliams1/ApproxBayes.jl) was integrated here, there are a few more ABC engines available that are more fully featured and likely better supported at pre…