-
Hi. I ran the command:
```bash
export HF_PATH="mistralai/Mixtral-8x7B-Instruct-v0.1"
scripts/huggingface_example.sh --type llama --model $HF_PATH --quant int4_awq --tp 4
```
On a node with 8 …
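As a hedged aside (the head count below is an assumption taken from the public Mixtral-8x7B config, not from the original report): `--tp 4` requires the attention head count to divide evenly across the tensor-parallel ranks, which a quick sanity check can confirm:

```python
# Assumed values from the public Mixtral-8x7B-Instruct-v0.1 config (not verified here)
num_attention_heads = 32
tp = 4  # tensor-parallel size passed via --tp

# Each tensor-parallel rank must receive an equal slice of the heads
assert num_attention_heads % tp == 0
heads_per_rank = num_attention_heads // tp
print(heads_per_rank)  # 8
```

If the assertion fails for a given model/`--tp` pair, the sharding step in most tensor-parallel runtimes will reject the configuration before loading weights.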
-
I'm running the Inspect log viewer locally with `inspect view`; it works well on `localhost:7575`, but I see strange failures when I expose it with ngrok (`ngrok http 7575`):
```
GET https:…
-
Question regarding the QMM, QMV, and QVM kernels. When using the Metal Debugger on them, I noticed that no matter the data type used for the operation itself (e.g. float16), all the simdgroup_matrix operati…
-
### Type of problem
Model inference and deployment
### Operating system
Linux
### Detailed description of the problem
Using the latest code from the main branch; deployed with Docker and installed the dependencies.
Running inference with Qwen-1…
-
I'm running a RunPod serverless vLLM template with Llama 3 70B on a 40 GB GPU. One of the requests failed and I'm not completely sure what happened, but the message asked me to open a GitHub issue, so I'll …
-
### Software environment
```Markdown
- paddlepaddle: 0.0.0
- paddlepaddle-gpu: 0.0.0.post118
- paddlenlp: 2.6.1
```
### Duplicate issues
- [X] I have searched the existing issues
### Error description
```Markdown
PredictorArgument(mo…
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
-
### Your current environment
Collecting environment information...
WARNING 04-02 01:12:23 ray_utils.py:70] Failed to import Ray with ModuleNotFoundError("No module named 'ray'"). For distributed inf…
-
Running Qwen-7B-Chat with ipex-llm + DeepSpeed fails with the following error:
[0] RuntimeError: shape '[1, 1024, 16, 128]' is invalid for input of size 4194304
accelerate 0.29.2
mpi4py …
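As a hedged aside (not from the original report): the element counts alone explain the message. The target shape accounts for exactly half of the input's elements, which usually points at a head-count or head-dimension mismatch in an attention reshape:

```python
# The reshape target from the RuntimeError message
target_shape = (1, 1024, 16, 128)

# Number of elements that target shape can hold
target_numel = 1
for d in target_shape:
    target_numel *= d

actual_numel = 4194304  # input size reported in the RuntimeError

print(target_numel)                  # 2097152
print(actual_numel // target_numel)  # 2: the input is exactly twice as large
```

A factor of exactly 2 often means the code assumed `head_dim = 128` where the tensor was laid out with 256, or assumed 16 heads where there were 32; checking the model config against the reshape constants is a reasonable first step.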
-
## 🐛 Bug
When exporting a model with `torch.onnx.export`, I receive the following error:
```
File "/venv/lib/python3.8/site-packages/torch/onnx/utils.py", line 709, in _export
proto, export_ma…