-
Has anyone tried to convert a trained .h5 model to TensorFlow Lite or TensorFlow Serving?
I ran into some difficulties due to custom Python functions such as 'yolo_loss' and 'yolo_head'.
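For reference, the kind of conversion I have in mind: a minimal sketch assuming 'yolo_loss' is only needed at training time (the file names below are placeholders).
```python
import tensorflow as tf

# Load the trained Keras model; compile=False skips deserializing
# training-only custom objects like 'yolo_loss'. In many YOLOv3 repos,
# 'yolo_head' is post-processing applied outside the saved graph.
model = tf.keras.models.load_model("trained_weights_final.h5", compile=False)

# Convert the inference graph to TensorFlow Lite.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

with open("yolo.tflite", "wb") as f:
    f.write(tflite_model)
```
Loading with compile=False sidesteps the training loss, but if 'yolo_loss' is baked into the graph as a Lambda layer this still fails, which is exactly where I am stuck.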
-
I am trying to use Llama-2-70b-chat-hf as a zero-shot text classifier for my datasets. Here is my setup.
1. vLLM + Llama-2-70b-chat-hf
I used vLLM as my inference engine and ran it with:
```
pyt…
```
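For reference, a sketch of the zero-shot classification I am testing with vLLM's offline Python API; the prompt template, labels, and tensor_parallel_size are assumptions on my side.
```python
from vllm import LLM, SamplingParams

# Llama-2-70b needs multiple GPUs; tensor_parallel_size=8 is an assumption.
llm = LLM(model="meta-llama/Llama-2-70b-chat-hf", tensor_parallel_size=8)

# Greedy decoding so the predicted label is deterministic.
params = SamplingParams(temperature=0.0, max_tokens=5)

# Hypothetical zero-shot prompt; adapt the labels to your dataset.
prompt = (
    "Classify the sentiment of the review as 'positive' or 'negative'.\n"
    "Review: The battery lasts all day and the screen is gorgeous.\n"
    "Label:"
)
out = llm.generate([prompt], params)
print(out[0].outputs[0].text.strip())
```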
-
### Your current environment
```text
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC ve…
```
-
### Your current environment
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: **** Group Enterprise Linux Server 7.2 (Pala…
-
Hi! I am trying to convert HFNet to ONNX and then convert it to TensorRT. I found that when running
```
python frozen2onnx.py
```
it shows:
```
Tensorflow op [pred/descriptor_sampling/resampler/Resampl…
```
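If the conversion goes through tf2onnx, one workaround I am considering is passing the unsupported Resampler op through as a custom op, assuming a TensorRT plugin handles it downstream (the graph file and input/output tensor names below are placeholders):
```
python -m tf2onnx.convert \
    --graphdef hfnet_frozen.pb \
    --output hfnet.onnx \
    --inputs image:0 \
    --outputs pred/descriptor:0 \
    --custom-ops Resampler
```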
-
Sometimes the error mentioned in the title occurs, after which the server stops processing images and keeps giving this error.
If the server is manually restarted, it starts working correctly, conti…
-
Reference: https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/fastchat_quickstart.md
While testing FastChat with the steps below, I got a connection error, shown in the attached pic…
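To narrow down where the connection fails, a quick probe I run against the OpenAI-compatible endpoint; the host and port 8000 are assumptions based on the quickstart defaults.
```python
import requests

# Probe the FastChat OpenAI-compatible API server; adjust host/port if
# the openai_api_server was launched with different settings.
try:
    resp = requests.get("http://localhost:8000/v1/models", timeout=10)
    print(resp.status_code, resp.json())
except requests.ConnectionError as e:
    print("Server not reachable:", e)
```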
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
```
-
I start serving with this script:
```bash
export OLLAMA_NUM_GPU=999
export no_proxy=localhost,127.0.0.1
export ZES_ENABLE_SYSMAN=1
source /opt/intel/oneapi/setvars.sh
./ollama serve
```
…
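Once the server is up, a sanity check I use against Ollama's REST API; the model name is a placeholder and port 11434 is the default.
```python
import requests

# Ask the local Ollama server for a short completion; 'llama2' stands in
# for whatever model has actually been pulled.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama2", "prompt": "Say hello.", "stream": False},
    timeout=120,
)
print(resp.json()["response"])
```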
-
In the benchmark comparison results, could we add a comparison with vLLM to see the acceleration effects?
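While waiting, a rough sketch of how I would time vLLM offline throughput for the comparison; the model, prompts, and batch size are placeholders, so the numbers are only indicative.
```python
import time
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # placeholder model
prompts = ["Summarize: vLLM is a fast inference engine."] * 64
params = SamplingParams(max_tokens=128)

start = time.perf_counter()
outputs = llm.generate(prompts, params)
elapsed = time.perf_counter() - start

# Count only generated tokens, not prompt tokens.
generated = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"{generated / elapsed:.1f} generated tokens/s")
```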