-
When I run build_win.bat, I finally get a DeepSpeed wheel (.whl) file, so that part seems to be resolved. However, when I ran the program, the following issue occurred:
File "D:\anaconda3\envs\llm\li…
-
Implement offline inference with FastGen, using `offline_inference_example.py` from https://github.com/llm-jp/llm-jp-eval/pull/115 as a reference.
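For reference, a minimal sketch of what the FastGen (DeepSpeed-MII) offline pipeline API looks like; the model name and generation parameters below are placeholders, and the actual implementation should follow `offline_inference_example.py` from the PR above.

```python
import mii

# Build an in-process FastGen pipeline (no persistent server needed).
# The model identifier is a placeholder; swap in the target model.
pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")

# Run batched offline generation over a list of prompts.
responses = pipe(
    ["DeepSpeed is", "Offline inference means"],
    max_new_tokens=128,
)

for r in responses:
    print(r.generated_text)
```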
-
> closed, confirmed that it was fixed in 0.11.0.dev2024060400
Hi @hijkzzz, I am hitting the same problem with fp8 quantization of MoE models (both 8x22B and 8x7B) on H20, even after upgrading to 0…
-
### Issue Description
While building from source, running the command `time cmake .. -DPY_VERSION=3.10 -DWITH_GPU=ON -DWITH_TESTING=ON` hits the following problem:
```bash
-- commit: f8a40a7d3e
-- branch: develop
/home/sun/anaconda3/envs/paddle-d…
```
-
It seems that the fp6_llm repo only includes the kernel `weight_matrix_dequant_fp_eXmY_cpu`, which dequantizes fp6 data to fp16, but it lacks a kernel to quantize fp16 data to fp6. Could you …
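For illustration, here is a rough NumPy sketch of what the missing direction could look like: round-to-nearest fake quantization of fp16 values onto the fp6 e3m2 grid. The bias-3, subnormal-inclusive, no-inf/NaN layout (max magnitude 28) is an assumption based on the usual e3m2 definition; this simulates only the rounding, not the bit packing the repo's kernels use.

```python
import numpy as np

def fp6_e3m2_grid():
    # Enumerate all non-negative values representable in fp6 e3m2
    # (1 sign, 3 exponent, 2 mantissa bits). Assumed: exponent bias 3,
    # subnormals included, no inf/NaN, so the largest magnitude is 28.
    vals = set()
    for e in range(8):
        for m in range(4):
            if e == 0:  # subnormal: 0.m * 2^(1 - bias)
                vals.add((m / 4.0) * 2.0 ** (1 - 3))
            else:       # normal: 1.m * 2^(e - bias)
                vals.add((1.0 + m / 4.0) * 2.0 ** (e - 3))
    return np.array(sorted(vals), dtype=np.float32)

def quantize_fp16_to_fp6(x):
    """Round each fp16 value to the nearest representable fp6 e3m2 value."""
    grid = fp6_e3m2_grid()
    x = np.asarray(x, dtype=np.float32)
    sign = np.sign(x)
    mag = np.clip(np.abs(x), 0.0, grid[-1])  # saturate at the fp6 max
    idx = np.argmin(np.abs(mag[..., None] - grid), axis=-1)
    return sign * grid[idx]

w = np.random.randn(4, 4).astype(np.float16)
print(quantize_fp16_to_fp6(w))
```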
-
### System Info
CPU Architecture: x86_64
CPU/Host memory size: 1024Gi (1.0Ti)
GPU properties:
GPU name: NVIDIA GeForce RTX 4090
GPU mem size: 24Gb…
-
The errors are as follows:
(.venv) (base) pengxiong@PENGMacPro PDF-Extract-Kit % python pdf_extract.py --pdf demo/demo1.pdf
[2024-07-19 20:17:51,713] [ ERROR] check_version.py:39 - Error fetching …
-
### Your current environment
```text
Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A
OS: Ubuntu …
```
-
```
(TinyChatEngine) zhef@zhef:~/TinyChatEngine/llm$ make chat -j
CUDA is available!
src/Generate.cc src/LLaMATokenizer.cc src/OPTGenerate.cc src/OPTTokenizer.cc src/utils.cc src/nn_modules/Fp32OPT…
```
-
### Your current environment
```text
Versions of relevant libraries:
[pip3] flashinfer==0.0.9+cu121torch2.3
[pip3] numpy==1.26.4
[pip3] nvidia-nccl-cu12==2.20.5
[pip3] sentence-transformers==3.0…
```