Closed: xiezhipeng-git closed this issue 7 hours ago
@DarkLight1337
We don't officially support Windows. Can you run this with a Linux file path?
model='D:\Users\Admin\.cache\modelscope\hub\qwen\Qwen-QwQ-32B-Preview-GGUF\QwQ-32B-Preview-IQ2_XS.gguf'
If you are running vLLM on WSL, to access a model on your Windows file system you should use /mnt/d/..., so I guess the model path should be '/mnt/d/Users/Admin/.cache/modelscope/hub/qwen/Qwen-QwQ-32B-Preview-GGUF/QwQ-32B-Preview-IQ2_XS.gguf'.
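For illustration, a minimal offline-inference sketch using that WSL path with vLLM's `LLM` class (this assumes the guessed /mnt/d mount and cache location are correct for your setup; adjust as needed):

```python
from vllm import LLM, SamplingParams

# WSL exposes Windows drives under /mnt/<drive letter>, so the Windows path
# D:\Users\Admin\... maps to /mnt/d/Users/Admin/... (assumed mount point;
# verify with `ls /mnt/d` inside WSL before running).
llm = LLM(
    model="/mnt/d/Users/Admin/.cache/modelscope/hub/qwen/"
          "Qwen-QwQ-32B-Preview-GGUF/QwQ-32B-Preview-IQ2_XS.gguf"
)

# Quick smoke test: generate a few tokens to confirm the model loads.
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```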
You are right. I forgot that. I'm sorry.
Your current environment
The output of `python collect_env.py`
```text
aimo2/collect_env.py
Collecting environment information...
PyTorch version: 2.5.1+cu124
Is debug build: False
CUDA used to build PyTorch: 12.4
ROCM used to build PyTorch: N/A

OS: Ubuntu 24.04.1 LTS (x86_64)
GCC version: (Ubuntu 13.2.0-23ubuntu4) 13.2.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.39

Python version: 3.12.7 | packaged by Anaconda, Inc. | (main, Oct 4 2024, 13:27:36) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.39
Is CUDA available: True
CUDA runtime version: 12.6.77
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4090
Nvidia driver version: 560.94
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.9.5.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv.so.9.5.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn.so.9.5.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_precompiled.so.9.5.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_runtime_compiled.so.9.5.1
/usr/lib/x86_64-linux-gnu/libcudnn_graph.so.9.5.1
/usr/lib/x86_64-linux-gnu/libcudnn_heuristic.so.9.5.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops.so.9.5.1
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Address sizes:       46 bits physical, 48 bits virtual
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Vendor ID:           GenuineIntel
Model name:          13th Gen Intel(R) Core(TM) i9-13900KS
CPU family:          6
Model:               183
Thread(s) per core:  2
Core(s) per socket:  16
Socket(s):           1
Stepping:            1
BogoMIPS:            6374.40
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves avx_vnni umip waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize flush_l1d arch_capabilities
Virtualization:      VT-x
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           768 KiB (16 instances)
L1i cache:           512 KiB (16 instances)
L2 cache:            32 MiB (16 instances)
L3 cache:            36 MiB (1 instance)
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit:        Not affected
Vulnerability L1tf:                 Not affected
Vulnerability Mds:                  Not affected
Vulnerability Meltdown:             Not affected
Vulnerability Mmio stale data:      Not affected
Vulnerability Retbleed:             Mitigation; Enhanced IBRS
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:           Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:           Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Vulnerability Srbds:                Not affected
Vulnerability Tsx async abort:      Not affected

Versions of relevant libraries:
[pip3] flake8==7.0.0
[pip3] mypy==1.11.2
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.4
[pip3] numpydoc==1.7.0
[pip3] nvidia-cublas-cu12==12.4.5.8
[pip3] nvidia-cuda-cupti-cu12==12.4.127
[pip3] nvidia-cuda-nvrtc-cu12==12.4.127
[pip3] nvidia-cuda-runtime-cu12==12.4.127
[pip3] nvidia-cudnn-cu12==9.1.0.70
[pip3] nvidia-cufft-cu12==11.2.1.3
[pip3] nvidia-curand-cu12==10.3.5.147
[pip3] nvidia-cusolver-cu12==11.6.1.9
[pip3] nvidia-cusparse-cu12==12.3.1.170
[pip3] nvidia-ml-py==12.560.30
[pip3] nvidia-nccl-cu12==2.21.5
[pip3] nvidia-nvjitlink-cu12==12.4.127
[pip3] nvidia-nvtx-cu12==12.4.127
[pip3] pyzmq==25.1.2
[pip3] torch==2.5.1
[pip3] torchaudio==2.4.0
[pip3] torchvision==0.20.1
[pip3] transformers==4.46.1
[pip3] triton==3.1.0
[conda] _anaconda_depends 2024.10 py312_mkl_0
[conda] blas 1.0 mkl
[conda] cuda-cudart 12.4.127 0 nvidia
[conda] cuda-cupti 12.4.127 0 nvidia
[conda] cuda-libraries 12.4.1 0 nvidia
[conda] cuda-nvrtc 12.4.127 0 nvidia
[conda] cuda-nvtx 12.4.127 0 nvidia
[conda] cuda-opencl 12.6.77 0 nvidia
[conda] cuda-runtime 12.4.1 0 nvidia
[conda] cuda-version 12.6 3 nvidia
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] libcublas 12.4.5.8 0 nvidia
[conda] libcufft 11.2.1.3 0 nvidia
[conda] libcufile 1.11.1.6 0 nvidia
[conda] libcurand 10.3.7.77 0 nvidia
[conda] libcusolver 11.6.1.9 0 nvidia
[conda] libcusparse 12.3.1.170 0 nvidia
[conda] libjpeg-turbo 2.0.0 h9bf148f_0 pytorch
[conda] libnpp 12.2.5.30 0 nvidia
[conda] libnvfatbin 12.6.77 0 nvidia
[conda] libnvjitlink 12.4.127 0 nvidia
[conda] libnvjpeg 12.3.1.117 0 nvidia
[conda] mkl 2023.1.0 h213fc3f_46344
[conda] mkl-service 2.4.0 py312h5eee18b_1
[conda] mkl_fft 1.3.10 py312h5eee18b_0
[conda] mkl_random 1.2.7 py312h526ad5a_0
[conda] numpy 1.26.4 py312hc5e2394_0
[conda] numpy-base 1.26.4 py312h0da6c21_0
[conda] numpydoc 1.7.0 py312h06a4308_0
[conda] nvidia-cublas-cu12 12.4.5.8 pypi_0 pypi
[conda] nvidia-cuda-cupti-cu12 12.4.127 pypi_0 pypi
[conda] nvidia-cuda-nvrtc-cu12 12.4.127 pypi_0 pypi
[conda] nvidia-cuda-runtime-cu12 12.4.127 pypi_0 pypi
[conda] nvidia-cudnn-cu12 9.1.0.70 pypi_0 pypi
[conda] nvidia-cufft-cu12 11.2.1.3 pypi_0 pypi
[conda] nvidia-curand-cu12 10.3.5.147 pypi_0 pypi
[conda] nvidia-cusolver-cu12 11.6.1.9 pypi_0 pypi
[conda] nvidia-cusparse-cu12 12.3.1.170 pypi_0 pypi
[conda] nvidia-ml-py 12.560.30 pypi_0 pypi
[conda] nvidia-nccl-cu12 2.21.5 pypi_0 pypi
[conda] nvidia-nvjitlink-cu12 12.4.127 pypi_0 pypi
[conda] nvidia-nvtx-cu12 12.4.127 pypi_0 pypi
[conda] pytorch-cuda 12.4 hc786d27_7 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] pyzmq 25.1.2 py312h6a678d5_0
[conda] torch 2.5.1 pypi_0 pypi
[conda] torchaudio 2.4.0 py312_cu124 pytorch
[conda] torchvision 0.20.1 pypi_0 pypi
[conda] transformers 4.46.1 pypi_0 pypi
[conda] triton 3.0.0 pypi_0 pypi
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: 0.6.4.post1
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
      GPU0  CPU Affinity  NUMA Affinity  GPU NUMA ID
GPU0   X                                 N/A

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks

LD_LIBRARY_PATH=/root/anaconda3/lib/python3.12/site-packages/cv2/../../lib64:/usr/local/cuda-12.6/lib64:/usr/local/cuda-12.6/lib64
MKL_THREADING_LAYER=GNU
MKL_SERVICE_FORCE_INTEL=1
CUDA_MODULE_LOADING=LAZY
```

Model Input Dumps
🐛 Describe the bug
With Qwen-QwQ-32B-Preview on vLLM 0.6.4.post1, this issue occurs via the OpenAI API with the GGUF model, and with the AWQ model on Kaggle, whether served through the API or run locally. The full-size model fails with a different error: https://www.modelscope.cn/models/Qwen/QwQ-32B-Preview/feedback/issueDetail/18860. On Kaggle, the AWQ model ran successfully under vLLM 0.6.3.post1 (not 0.6.4.post1).
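For context, a sketch of the kind of OpenAI-compatible call involved, assuming a vLLM server was started beforehand (e.g. with `vllm serve <path-to-gguf>`); the host, port, and placeholder API key below are vLLM's defaults, and the model path is the one from this report:

```python
from openai import OpenAI

# Assumes a vLLM OpenAI-compatible server is already running on the default
# port; vLLM ignores the api_key unless --api-key was passed at startup.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="/mnt/d/Users/Admin/.cache/modelscope/hub/qwen/"
          "Qwen-QwQ-32B-Preview-GGUF/QwQ-32B-Preview-IQ2_XS.gguf",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=16,
)
print(response.choices[0].message.content)
```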
Before submitting a new issue...