-
Hello, I want to know whether it accelerates inference. Recently, I have been trying to speed up inference for SiamRPN by using FP16 instead of FP32. It is said that FP16 is twice as fast as FP32.
It…
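For a concrete baseline, here is a minimal FP16-vs-FP32 timing sketch in PyTorch. The small conv stack is only a stand-in for the SiamRPN network, and all names and sizes are placeholders:

```python
# Minimal FP16 vs. FP32 timing sketch; the conv stack below is a placeholder
# for the real SiamRPN model. Substitute your actual network and input shape.
import time
import torch
import torch.nn as nn

@torch.no_grad()
def ms_per_call(model, x, iters=100):
    for _ in range(10):            # warm-up: the first calls pay one-time setup costs
        model(x)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()       # flush queued CUDA work before stopping the clock
    return (time.perf_counter() - start) / iters * 1000.0

net = nn.Sequential(nn.Conv2d(3, 64, 3), nn.ReLU(), nn.Conv2d(64, 64, 3)).cuda().eval()
x = torch.randn(1, 3, 255, 255, device="cuda")

fp32 = ms_per_call(net, x)
fp16 = ms_per_call(net.half(), x.half())  # cast weights and input to FP16
print(f"FP32 {fp32:.2f} ms, FP16 {fp16:.2f} ms")
```

Whether this approaches the claimed 2x depends on the GPU: FP16 is roughly twice as fast mainly on cards with dedicated FP16/Tensor Core paths, while on older hardware it can be no faster at all.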
-
Setup
Python version 3.11
Windows Machine
pip install ragchecker
python -m spacy download en_core_web_sm
It seems like there is trouble connecting to Azure OpenAI or using it. I used the…
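For reference, here is a hypothetical sketch of how Azure OpenAI credentials are typically supplied when a tool routes its LLM calls through litellm, which I believe ragchecker does. The deployment name, endpoint, API version, and the exact RAGChecker constructor arguments are assumptions to verify against your installed version:

```python
import os

# litellm-style Azure OpenAI configuration (all values are placeholders).
os.environ["AZURE_API_KEY"] = "<your-azure-openai-key>"
os.environ["AZURE_API_BASE"] = "https://<your-resource>.openai.azure.com"
os.environ["AZURE_API_VERSION"] = "2024-02-15-preview"

from ragchecker import RAGChecker  # import path assumed from the ragchecker README

# In litellm, Azure deployments are addressed as "azure/<deployment-name>".
evaluator = RAGChecker(
    extractor_name="azure/<your-deployment>",  # hypothetical deployment name
    checker_name="azure/<your-deployment>",
)
```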
-
## Description
I have two different modules that I converted to TRT. When I run them serially, the cost of inference only is:
```
// 10 times
do_infer >> cost 400.60 msec. // warm-up
do_infer >> cost 42.22 …
```
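The ~400 ms first call is the usual one-time warm-up cost and should be excluded from per-inference numbers. Since the two engines currently run back to back, one option is to enqueue them on separate CUDA streams so their kernels can overlap. This is a hypothetical sketch against the TensorRT 8.x Python API (`execute_async_v2` has been superseded in newer versions); engine paths, buffer sizes, and binding layouts are placeholders:

```python
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401  (initializes a CUDA context)

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

def load_engine(path):
    # Deserialize a prebuilt engine file (paths below are placeholders).
    with open(path, "rb") as f:
        return runtime.deserialize_cuda_engine(f.read())

engine_a = load_engine("module_a.engine")
engine_b = load_engine("module_b.engine")
ctx_a = engine_a.create_execution_context()
ctx_b = engine_b.create_execution_context()
stream_a, stream_b = cuda.Stream(), cuda.Stream()

# Device buffers; sizes must match each engine's bindings (placeholder sizes).
bind_a = [cuda.mem_alloc(1 << 20), cuda.mem_alloc(1 << 20)]
bind_b = [cuda.mem_alloc(1 << 20), cuda.mem_alloc(1 << 20)]

# Enqueue both inferences before synchronizing; with spare SMs the two
# engines can execute concurrently instead of back to back.
ctx_a.execute_async_v2([int(b) for b in bind_a], stream_a.handle)
ctx_b.execute_async_v2([int(b) for b in bind_b], stream_b.handle)
stream_a.synchronize()
stream_b.synchronize()
```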
-
I use vLLM to accelerate Qwen large models, mainly Qwen-7B and Qwen-14B. I found two issues while testing them.
1) Compared to using vLLM to accelerate Qwen-7B/Qwen-14B, the …
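For reproducing such comparisons, a minimal vLLM sketch for Qwen-7B might look like the following; the model ID and sampling settings are assumptions, and greedy decoding makes outputs comparable across runs:

```python
from vllm import LLM, SamplingParams

# Model ID is an assumption; trust_remote_code is needed for Qwen's custom code.
llm = LLM(model="Qwen/Qwen-7B-Chat", trust_remote_code=True)
params = SamplingParams(temperature=0.0, max_tokens=128)  # greedy, fixed length
outputs = llm.generate(["Give a one-sentence summary of PagedAttention."], params)
print(outputs[0].outputs[0].text)
```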
-
Dear Vitis AI Team,
I am writing to express my appreciation for the comprehensive suite of tools and resources that Vitis AI provides. The integration of optimized IP, tools, libraries, and models …
-
Hi there, it's unclear whether Yggdrasil supports GPU or TPU acceleration. It seems that if you do fine-tuning in JAX it may be possible once the model is converted to a JAX function? But it's not clear i…
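For what it's worth, recent ydf (the Yggdrasil Decision Forests Python API) releases appear to expose a JAX conversion. The sketch below assumes `to_jax_function` and its input layout, both of which should be checked against your installed version:

```python
import jax
import jax.numpy as jnp
import numpy as np
import pandas as pd
import ydf

# Tiny synthetic training set; column names are placeholders.
n = 200
train = pd.DataFrame({"f1": np.random.rand(n)})
train["label"] = (train["f1"] > 0.5).astype(int)

model = ydf.GradientBoostedTreesLearner(label="label").train(train)

jax_model = model.to_jax_function()   # assumed conversion entry point
predict = jax.jit(jax_model.predict)  # once it is a JAX function, jit can target GPU/TPU
preds = predict({"f1": jnp.array([0.2, 0.8])})  # input layout is an assumption
```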
-
### The bug
I was just looking at my logs because of an issue I am having with facial recognition. These errors are unrelated, as they happened during the night, but I wanted to draw some attention to …
-
### Describe the issue
Hello,
I use the float16 tool to convert FP32 models to FP16 and use ONNXRuntime-GPU 1.13.1 for inference.
I found that many models cannot obtain inference acce…
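For concreteness, the conversion path described above usually goes through onnxconverter-common's float16 tool; this sketch assumes that tool and uses placeholder file names:

```python
import onnx
import onnxruntime as ort
from onnxconverter_common import float16

model = onnx.load("model_fp32.onnx")  # placeholder path
# keep_io_types leaves graph inputs/outputs in FP32 and inserts boundary Casts.
model_fp16 = float16.convert_float_to_float16(model, keep_io_types=True)
onnx.save(model_fp16, "model_fp16.onnx")

sess = ort.InferenceSession("model_fp16.onnx", providers=["CUDAExecutionProvider"])
```

One common reason converted models fail to speed up is the Cast pairs inserted around ops that stay in FP32; on some graphs those casts cost more than the FP16 kernels save.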
-
I would appreciate it if anyone could help with the following problem when using the converted GGUF for inference.
I found that inference with llama-cpp generates a different result from inference …
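When chasing such differences, it helps to remove sampling noise first. Here is a minimal sketch with the llama-cpp-python bindings, using greedy decoding and placeholder paths and prompts:

```python
from llama_cpp import Llama

llm = Llama(model_path="model-q4_k_m.gguf", seed=0)  # placeholder GGUF path
out = llm("Q: What is the capital of France?\nA:", max_tokens=32, temperature=0.0)
print(out["choices"][0]["text"])
```

With temperature 0 on both sides, any remaining divergence comes from the conversion itself, e.g. quantization error or differing kernel numerics, rather than from random sampling.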
-
Feasibility:
The NNFusion project needs some flagship models to prove its usability; we chose BERT as one of the models.
Target:
1. Improve NNFusion's inference performance on Transformer/BERT;
2…