-
### System Info
Hi,
I generated a TensorRT-LLM engine for a LLaMA-based model and see that its performance is much worse than vLLM's.
I did the following:
- compile model with tensorrt llm c…
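A generic way to make such a comparison concrete is a small timing harness that is run identically against both backends. The sketch below is a minimal illustration with a trivial stand-in callable (hypothetical) in place of a real engine call:

```python
import time

def benchmark(run_fn, n_warmup=2, n_iters=5):
    """Time an inference callable and return the mean latency in seconds.

    Warmup iterations are discarded so one-time costs (engine load,
    CUDA context creation, JIT compilation) don't skew the measurement.
    """
    for _ in range(n_warmup):
        run_fn()
    start = time.perf_counter()
    for _ in range(n_iters):
        run_fn()
    return (time.perf_counter() - start) / n_iters

# Hypothetical stand-in for a real engine call:
mean_s = benchmark(lambda: sum(range(10_000)))
print(f"mean latency: {mean_s * 1e3:.3f} ms")
```

For a fair comparison, both engines should see the same batch size, sequence lengths, and sampling settings.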
-
**Is your feature request related to a problem? Please describe.**
NAN
**Describe the solution you'd like**
NAN
**Describe alternatives you've considered**
NAN
**Additional context**
NAN
…
-
Hardware:
Jetson AGX Orin Developer Kit
Software:
JetPack 5.0.1 DP
What works:
Inference works well with FP32
Issue:
Inference does not work with INT8. The following output log can be see…
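For background on what INT8 mode does numerically, here is a minimal pure-Python sketch of symmetric per-tensor INT8 quantization — an illustration of the general scheme that calibration-based INT8 inference relies on, not JetPack- or TensorRT-specific code:

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization: q = clamp(round(x / scale)).

    The scale maps the largest absolute value onto 127, so outliers
    directly determine the resolution available to small values.
    """
    amax = max(abs(v) for v in values)
    scale = amax / 127.0 if amax else 1.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Map quantized integers back to approximate real values."""
    return [v * scale for v in q]

q, s = quantize_int8([2.0, -1.0, 0.5])
approx = dequantize(q, s)
```

If the FP32 path works but INT8 does not, the calibration step (which chooses these scales per tensor) is the usual place to look first.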
-
# Background:
In the performance doc ([performance.md](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/performance.md)) it is mentioned:
LLaMA-7B, FP16, batch size: 256, input_len: 128, output_len: 128…
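For configurations like this, the throughput arithmetic is straightforward; the sketch below computes generated tokens per second from the batch size, output length, and an assumed (hypothetical) end-to-end batch latency:

```python
def generation_throughput(batch_size, output_len, latency_s):
    """Generated tokens per second for one fully-decoded batch.

    Total generated tokens = batch_size * output_len; dividing by the
    wall-clock latency of the whole batch gives tokens/s.
    """
    return batch_size * output_len / latency_s

# Hypothetical latency of 10 s for the whole batch of 256 requests:
tps = generation_throughput(256, 128, 10.0)
print(f"{tps:.0f} tokens/s")
```

Published numbers are only comparable when batch size, input_len, and output_len all match, since each one scales the token count linearly.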
-
Hi,
I've been experimenting with tinygrad a bit.
I started it with (as far as I recall)
```
AMD=1 ROCM=1 exo --inference-engine=tinygrad
```
I tried to find a workaround using ZLUDA (using LD_LIBRA…
-
### Your current environment
My device: CUDA 11.8
vLLM version: 0.5.5
torch is compatible with my CUDA and vLLM versions
Python 3.10
### 🐛 Describe the bug
My env is ready; **only** it wo…
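When an environment "is ready" but still fails, a quick sanity check is to compare version strings programmatically. The sketch below is a generic helper in plain Python; the example values stand in for what `torch.version.cuda` or `pip show vllm` would report (the threshold versions are hypothetical):

```python
def parse_version(v):
    """Split a dotted version string into an integer tuple: '11.8' -> (11, 8)."""
    return tuple(int(p) for p in v.split(".") if p.isdigit())

def cuda_at_least(installed, required):
    """True when the installed CUDA version meets the required minimum."""
    return parse_version(installed) >= parse_version(required)

# Hypothetical values standing in for the real environment report:
assert cuda_at_least("11.8", "11.7")
assert not cuda_at_least("11.8", "12.1")
```

Comparing tuples rather than raw strings avoids the classic trap where "11.10" sorts before "11.8" lexicographically.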
-
I cloned the source code from this link: https://github.com/mlperf/inference_results_v0.5/tree/master/closed/Intel/code/ssd-small/openvino-windows
There are many LNK2019 unresolved external symbol errors on th…
-
**Is your feature request related to a problem? Please describe.**
The current examples for DeepSpeed inference use the `deepspeed` command line, which internally uses DeepSpeed's launcher modules to initial…
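As a rough illustration (not DeepSpeed's actual implementation), a launcher of this kind essentially sets per-rank environment variables and spawns one worker process per rank. The variable names below follow the common `torch.distributed` convention; the address and port are placeholder values:

```python
import os
import subprocess
import sys

def worker_env(rank, world_size):
    """Environment variables a distributed launcher typically sets per worker."""
    env = dict(os.environ)
    env.update(RANK=str(rank), LOCAL_RANK=str(rank),
               WORLD_SIZE=str(world_size),
               MASTER_ADDR="127.0.0.1", MASTER_PORT="29500")
    return env

def launch_local(script, num_procs=1):
    """Spawn one Python worker per rank and wait for all of them."""
    procs = [subprocess.Popen([sys.executable, script],
                              env=worker_env(rank, num_procs))
             for rank in range(num_procs)]
    return [p.wait() for p in procs]
```

Launching a script directly with `python` skips this step, which is why code that reads `LOCAL_RANK` or similar variables fails outside the launcher.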
-
![image](https://github.com/NVIDIA-AI-IOT/Lidar_AI_Solution/assets/49723499/b00064c0-70f6-449a-8884-66b7cbdfc842)
1. Hello, the picture shows inference using GPU+DLA, but I do not find where DLA is used?
…
-
**Describe the bug**
In DeepSpeed-Chat step 3, a runtime error ("The size of tensor a (4) must match the size of tensor b (8) at non-singleton dimension 0") will be thrown when inferenc…
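For context, this error message comes from the standard broadcasting rule shared by PyTorch and NumPy: trailing dimensions are compared pairwise, and each pair must either match or contain a 1. A minimal pure-Python check of that rule (illustrative, not DeepSpeed code):

```python
def broadcastable(shape_a, shape_b):
    """True when two shapes satisfy the PyTorch/NumPy broadcasting rule.

    Dimensions are compared from the trailing end; each pair must be
    equal or contain a 1. Missing leading dimensions are treated as 1.
    """
    for a, b in zip(reversed(shape_a), reversed(shape_b)):
        if a != b and a != 1 and b != 1:
            return False
    return True

# The failure in the report: size 4 vs size 8 at dimension 0 cannot broadcast.
assert not broadcastable((4,), (8,))
assert broadcastable((4, 1), (4, 8))
```

A size-4 versus size-8 mismatch like this usually means two tensors were built with different batch (or beam) sizes, e.g. a cache populated at one batch size being reused at another.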