vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Installation]: ERROR: Could not find a version that satisfies the requirement pyzmq (from versions: none) #6459

Closed: daidaiershidi closed this issue 3 months ago

daidaiershidi commented 3 months ago

Your current environment

Collecting environment information...
WARNING 07-16 03:53:58 _custom_ops.py:14] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")
/disk3/renyanfu/liukaiyuan/project/opencompass/vllm/vllm/usage/usage_lib.py:19: RuntimeWarning: Failed to read commit hash:
No module named 'vllm.commit_id'
  from vllm.version import __version__ as VLLM_VERSION
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.6 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: version 3.30.0
Libc version: glibc-2.31

Python version: 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-3.10.0-1127.19.1.el7.x86_64-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: 12.1.105
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: 
GPU 0: NVIDIA A100-SXM4-40GB
GPU 1: NVIDIA A100-SXM4-40GB
GPU 2: NVIDIA A100-SXM4-40GB
GPU 3: NVIDIA A100-SXM4-40GB

Nvidia driver version: 525.60.13
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.0
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn.so.8
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn_adv_train.so.8
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn_ops_train.so.8
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   46 bits physical, 48 bits virtual
CPU(s):                          52
On-line CPU(s) list:             0-51
Thread(s) per core:              2
Core(s) per socket:              26
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           85
Model name:                      Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz
Stepping:                        7
CPU MHz:                         2499.996
BogoMIPS:                        4999.99
Hypervisor vendor:               KVM
Virtualization type:             full
L1d cache:                       832 KiB
L1i cache:                       832 KiB
L2 cache:                        26 MiB
L3 cache:                        35.8 MiB
NUMA node0 CPU(s):               0-51
Vulnerability Itlb multihit:     Processor vulnerable
Vulnerability L1tf:              Mitigation; PTE Inversion
Vulnerability Mds:               Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1:        Mitigation; Load fences, usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Vulnerable: Retpoline without IBPB
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc eagerfpu pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat avx512_vnni

Versions of relevant libraries:
[pip3] numpy==1.26.0
[pip3] nvidia-nccl-cu12==2.20.5
[pip3] sentence-transformers==2.2.2
[pip3] torch==2.3.1
[pip3] torchaudio==2.1.0
[pip3] torchelastic==0.2.2
[pip3] torchvision==0.18.1
[pip3] transformers==4.42.4
[pip3] triton==2.3.1
[conda] blas                      1.0                         mkl  
[conda] ffmpeg                    4.3                  hf484d3e_0    pytorch
[conda] libjpeg-turbo             2.0.0                h9bf148f_0    pytorch
[conda] mkl                       2023.1.0         h213fc3f_46343  
[conda] mkl-service               2.4.0           py310h5eee18b_1  
[conda] mkl_fft                   1.3.8           py310h5eee18b_0  
[conda] mkl_random                1.2.4           py310hdb19cb5_0  
[conda] numpy                     1.26.0          py310h5f9d8c6_0  
[conda] numpy-base                1.26.0          py310hb5e798b_0  
[conda] nvidia-nccl-cu12          2.20.5                   pypi_0    pypi
[conda] pytorch-cuda              12.1                 ha16c6d3_5    pytorch
[conda] pytorch-mutex             1.0                        cuda    pytorch
[conda] sentence-transformers     2.2.2                    pypi_0    pypi
[conda] torch                     2.3.1                    pypi_0    pypi
[conda] torchaudio                2.1.0               py310_cu121    pytorch
[conda] torchelastic              0.2.2                    pypi_0    pypi
[conda] torchvision               0.18.1                   pypi_0    pypi
[conda] transformers              4.42.4                   pypi_0    pypi
[conda] triton                    2.3.1                    pypi_0    pypi
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: 0.5.2
vLLM Build Flags:
CUDA Archs: ; ROCm: Disabled; Neuron: Disabled
GPU Topology:
GPU0    GPU1    GPU2    GPU3    CPU Affinity    NUMA Affinity
GPU0     X      NV12    NV12    NV12    0-51            N/A
GPU1    NV12     X      NV12    NV12    0-51            N/A
GPU2    NV12    NV12     X      NV12    0-51            N/A
GPU3    NV12    NV12    NV12     X      0-51            N/A

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks

How you are installing vllm

docker build -f Dockerfile -t vllm/pytorch:3.0.x-cuda12.4 .
youkaichao commented 3 months ago

Which commit did you use to build?

daidaiershidi commented 3 months ago
docker build -f Dockerfile -t vllm/pytorch:3.0.x-cuda12.4 .

It turned out to be a problem with our network. I added a pip mirror (-i <index-url>) to each pip command.
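
(For anyone hitting the same network issue: instead of appending -i to every pip command, the mirror can also be set once for the whole image via pip's environment variable. A small untested sketch, using the same mirror URL as later in this thread:

ENV PIP_INDEX_URL=https://pypi.tuna.tsinghua.edu.cn/simple

All subsequent pip invocations in the build then pick up the mirror automatically.)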

Is there a way to skip installing requirements-mamba.txt? I won't use Mamba, and the build gets stuck on the mamba-ssm install. A rough sketch of what skipping it could look like is below.
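
An untested sketch, with stage and file names taken from the Dockerfile quoted later in this thread:

# 1. Delete the whole mamba-builder stage
#    (FROM dev as mamba-builder ... pip --verbose wheel -r requirements-mamba.txt ...)
# 2. In the vllm-base stage, delete the step that installs its wheels:
#    RUN --mount=type=bind,from=mamba-builder,src=/usr/src/mamba,target=/usr/src/mamba \
#        --mount=type=cache,target=/root/.cache/pip \
#        python3 -m pip install /usr/src/mamba/*.whl --no-cache-dir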

youkaichao commented 3 months ago

I cannot provide help on the Docker build; it is a complicated process. You can either use the official Docker image https://hub.docker.com/r/vllm/vllm-openai/tags , or build from source directly, following https://docs.vllm.ai/en/latest/getting_started/installation.html .
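
For reference, running the official image typically looks something like this (the model name and cache path here are placeholders, not from this thread):

docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    -p 8000:8000 \
    vllm/vllm-openai:latest \
    --model <your-model>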

daidaiershidi commented 3 months ago

> I cannot provide help on the Docker build; it is a complicated process. You can either use the official Docker image https://hub.docker.com/r/vllm/vllm-openai/tags , or build from source directly, following https://docs.vllm.ai/en/latest/getting_started/installation.html .

Hi, this is my command:

docker build -f NewDockerfile -t vllm/pytorch:3.0.x-cuda12.4 .

And my NewDockerfile is:

# The vLLM Dockerfile is used to construct vLLM image that can be directly used
# to run the OpenAI compatible server.

# Please update any changes made here to
# docs/source/dev/dockerfile/dockerfile.rst and
# docs/source/assets/dev/dockerfile-stages-dependency.png

# run
# docker build -f NewDockerfile -t vllm/pytorch:3.0.x-cuda12.4 .
# run

ARG CUDA_VERSION=12.4.1
#################### BASE BUILD IMAGE ####################
# prepare basic build environment
FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu22.04 AS base

ARG CUDA_VERSION=12.4.1
ARG PYTHON_VERSION=3

ENV DEBIAN_FRONTEND=noninteractive

RUN echo 'tzdata tzdata/Areas select America' | debconf-set-selections \
    && echo 'tzdata tzdata/Zones/America select Los_Angeles' | debconf-set-selections \
    && apt-get update -y \
    && apt-get install -y ccache software-properties-common \
    && add-apt-repository ppa:deadsnakes/ppa \
    && apt-get update -y \
    && apt-get install -y python${PYTHON_VERSION} python${PYTHON_VERSION}-dev python${PYTHON_VERSION}-venv python3-pip \
    && if [ "${PYTHON_VERSION}" != "3" ]; then update-alternatives --install /usr/bin/python3 python3 /usr/bin/python${PYTHON_VERSION} 1; fi \
    && python3 --version \
    && python3 -m pip --version

RUN apt-get update -y \
    && apt-get install -y python3-pip git curl sudo

# Workaround for https://github.com/openai/triton/issues/2507 and
# https://github.com/pytorch/pytorch/issues/107960 -- hopefully
# this won't be needed for future versions of this docker image
# or future versions of triton.
RUN ldconfig /usr/local/cuda-$(echo $CUDA_VERSION | cut -d. -f1,2)/compat/

WORKDIR /workspace

# install build and runtime dependencies
COPY requirements-common.txt requirements-common.txt
COPY requirements-cuda.txt requirements-cuda.txt
RUN --mount=type=cache,target=/root/.cache/pip \
    python3 -m pip install -r requirements-cuda.txt -i https://pypi.tuna.tsinghua.edu.cn/simple  

# 
COPY requirements-mamba.txt requirements-mamba.txt
# RUN pip install --upgrade pip -i https://pypi.tuna.tsinghua.edu.cn/simple  
# RUN python3 -m pip install packaging -i https://pypi.tuna.tsinghua.edu.cn/simple 
# RUN pip install causal-conv1d>=1.2.0   
# RUN pip install mamba-ssm>=1.2.2  
# RUN python3 -m pip install -r requirements-mamba.txt -i https://pypi.tuna.tsinghua.edu.cn/simple 
# pip install -r requirements-mamba.txt is problematic

#################### MODIFIED SECTION ####################
COPY requirements-mamba.txt requirements-mamba.txt
RUN pip install --upgrade pip setuptools wheel -i https://pypi.tuna.tsinghua.edu.cn/simple  
RUN apt update && apt-get install -y --no-install-recommends \
    build-essential \
    vim \
    git \
    wget \
    tmux \
    python3-dev \
    libatlas-base-dev \
    && rm -rf /var/lib/apt/lists/*
RUN ln -s /usr/bin/python3 /usr/bin/python

COPY mamba_ssm-2.2.2.tar.gz mamba_ssm-2.2.2.tar.gz
# RUN pip install --no-index mamba_ssm-1.2.2.tar.gz
RUN tar -xvf mamba_ssm-2.2.2.tar.gz && cd mamba_ssm-2.2.2 && \
    python setup.py build && \
    python setup.py install
COPY causal_conv1d-1.2.1.tar.gz causal_conv1d-1.2.1.tar.gz
RUN tar -xvf causal_conv1d-1.2.1.tar.gz && cd causal_conv1d-1.2.1 && \
    python setup.py build && \
    python setup.py install
#################### MODIFIED SECTION ####################

# cuda arch list used by torch
# can be useful for both `dev` and `test`
# explicitly set the list to avoid issues with torch 2.2
# see https://github.com/pytorch/pytorch/pull/123243
ARG torch_cuda_arch_list='7.0 7.5 8.0 8.6 8.9 9.0+PTX'
ENV TORCH_CUDA_ARCH_LIST=${torch_cuda_arch_list}
#################### BASE BUILD IMAGE ####################

#################### WHEEL BUILD IMAGE ####################
FROM base AS build

ARG PYTHON_VERSION=3

# install build dependencies
COPY requirements-build.txt requirements-build.txt

RUN --mount=type=cache,target=/root/.cache/pip \
    python3 -m pip install -r requirements-build.txt -i https://pypi.tuna.tsinghua.edu.cn/simple  

# install compiler cache to speed up compilation leveraging local or remote caching
RUN apt-get update -y && apt-get install -y ccache

# files and directories related to build wheels
COPY csrc csrc
COPY setup.py setup.py
COPY cmake cmake
COPY CMakeLists.txt CMakeLists.txt
COPY requirements-common.txt requirements-common.txt
COPY requirements-cuda.txt requirements-cuda.txt
COPY pyproject.toml pyproject.toml
COPY vllm vllm

# max jobs used by Ninja to build extensions
ARG max_jobs=2
ENV MAX_JOBS=${max_jobs}
# number of threads used by nvcc
ARG nvcc_threads=8
ENV NVCC_THREADS=$nvcc_threads
# make sure punica kernels are built (for LoRA)
ENV VLLM_INSTALL_PUNICA_KERNELS=1

ARG buildkite_commit
ENV BUILDKITE_COMMIT=${buildkite_commit}

ARG USE_SCCACHE
# if USE_SCCACHE is set, use sccache to speed up compilation
RUN --mount=type=cache,target=/root/.cache/pip \
    if [ "$USE_SCCACHE" = "1" ]; then \
        echo "Installing sccache..." \
        && curl -L -o sccache.tar.gz https://github.com/mozilla/sccache/releases/download/v0.8.1/sccache-v0.8.1-x86_64-unknown-linux-musl.tar.gz \
        && tar -xzf sccache.tar.gz \
        && sudo mv sccache-v0.8.1-x86_64-unknown-linux-musl/sccache /usr/bin/sccache \
        && rm -rf sccache.tar.gz sccache-v0.8.1-x86_64-unknown-linux-musl \
        && export SCCACHE_BUCKET=vllm-build-sccache \
        && export SCCACHE_REGION=us-west-2 \
        && export CMAKE_BUILD_TYPE=Release \
        && sccache --show-stats \
        && python3 setup.py bdist_wheel --dist-dir=dist --py-limited-api=cp38 \
        && sccache --show-stats; \
    fi

ENV CCACHE_DIR=/root/.cache/ccache
RUN --mount=type=cache,target=/root/.cache/ccache \
    --mount=type=cache,target=/root/.cache/pip \
    if [ "$USE_SCCACHE" != "1" ]; then \
        python3 setup.py bdist_wheel --dist-dir=dist --py-limited-api=cp38; \
    fi

# check the size of the wheel, we cannot upload wheels larger than 100MB
COPY .buildkite/check-wheel-size.py check-wheel-size.py
RUN python3 check-wheel-size.py dist

#################### EXTENSION Build IMAGE ####################

#################### DEV IMAGE ####################
FROM base as dev

COPY requirements-lint.txt requirements-lint.txt
COPY requirements-test.txt requirements-test.txt
COPY requirements-dev.txt requirements-dev.txt
RUN --mount=type=cache,target=/root/.cache/pip \
    python3 -m pip install -r requirements-dev.txt -i https://pypi.tuna.tsinghua.edu.cn/simple 

#################### DEV IMAGE ####################
#################### MAMBA Build IMAGE ####################
FROM dev as mamba-builder
# max jobs used for build
ARG max_jobs=2
ENV MAX_JOBS=${max_jobs}

WORKDIR /usr/src/mamba

COPY requirements-mamba.txt requirements-mamba.txt

# Download the wheel or build it if a pre-compiled release doesn't exist
RUN pip --verbose wheel -r requirements-mamba.txt \
    --no-build-isolation --no-deps --no-cache-dir -i https://pypi.tuna.tsinghua.edu.cn/simple

#################### MAMBA Build IMAGE ####################

#################### vLLM installation IMAGE ####################
# image with vLLM installed
FROM nvidia/cuda:${CUDA_VERSION}-base-ubuntu22.04 AS vllm-base
ARG CUDA_VERSION=12.4.1
WORKDIR /vllm-workspace

RUN apt-get update -y \
    && apt-get install -y python3-pip git vim

# Workaround for https://github.com/openai/triton/issues/2507 and
# https://github.com/pytorch/pytorch/issues/107960 -- hopefully
# this won't be needed for future versions of this docker image
# or future versions of triton.
RUN ldconfig /usr/local/cuda-$(echo $CUDA_VERSION | cut -d. -f1,2)/compat/

# install vllm wheel first, so that torch etc will be installed
RUN --mount=type=bind,from=build,src=/workspace/dist,target=/vllm-workspace/dist \
    --mount=type=cache,target=/root/.cache/pip \
    python3 -m pip install dist/*.whl --verbose

RUN --mount=type=bind,from=mamba-builder,src=/usr/src/mamba,target=/usr/src/mamba \
    --mount=type=cache,target=/root/.cache/pip \
    python3 -m pip install /usr/src/mamba/*.whl --no-cache-dir

RUN --mount=type=cache,target=/root/.cache/pip \
    python3 -m pip install https://github.com/flashinfer-ai/flashinfer/releases/download/v0.0.9/flashinfer-0.0.9+cu121torch2.3-cp310-cp310-linux_x86_64.whl
#################### vLLM installation IMAGE ####################

#################### TEST IMAGE ####################
# image to run unit testing suite
# note that this uses vllm installed by `pip`
FROM vllm-base AS test

ADD . /vllm-workspace/

# install development dependencies (for testing)
RUN --mount=type=cache,target=/root/.cache/pip \
    python3 -m pip install -r requirements-dev.txt -i https://pypi.tuna.tsinghua.edu.cn/simple 

# doc requires source code
# we hide them inside `test_docs/` , so that this source code
# will not be imported by other tests
RUN mkdir test_docs
RUN mv docs test_docs/
RUN mv vllm test_docs/

#################### TEST IMAGE ####################

#################### OPENAI API SERVER ####################
# openai api server alternative
FROM vllm-base AS vllm-openai

# install additional dependencies for openai api server
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install accelerate hf_transfer 'modelscope!=1.15.0' -i https://pypi.tuna.tsinghua.edu.cn/simple

ENV VLLM_USAGE_SOURCE production-docker-image

ENTRYPOINT ["python3", "-m", "vllm.entrypoints.openai.api_server"]
#################### OPENAI API SERVER ####################

Output:

[+] Building 4.7s (24/52)                                                                                                                                                                               docker:default
 => [internal] load build definition from NewDockerfile                                                                                                                                                           0.0s
 => => transferring dockerfile: 9.47kB                                                                                                                                                                            0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:12.4.1-base-ubuntu22.04                                                                                                                                    0.8s
 => [internal] load metadata for docker.io/nvidia/cuda:12.4.1-devel-ubuntu22.04                                                                                                                                   0.8s
 => [internal] load .dockerignore                                                                                                                                                                                 0.0s
 => => transferring context: 50B                                                                                                                                                                                  0.0s
 => [internal] load build context                                                                                                                                                                                 0.0s
 => => transferring context: 40.51kB                                                                                                                                                                              0.0s
 => [vllm-base 1/7] FROM docker.io/nvidia/cuda:12.4.1-base-ubuntu22.04@sha256:0f6bfcbf267e65123bcc2287e2153dedfc0f24772fb5ce84afe16ac4b2fada95                                                                    0.0s
 => [base  1/17] FROM docker.io/nvidia/cuda:12.4.1-devel-ubuntu22.04@sha256:da6791294b0b04d7e65d87b7451d6f2390b4d36225ab0701ee7dfec5769829f5                                                                      0.0s
 => CACHED [vllm-base 2/7] WORKDIR /vllm-workspace                                                                                                                                                                0.0s
 => CACHED [vllm-base 3/7] RUN apt-get update -y     && apt-get install -y python3-pip git vim                                                                                                                    0.0s
 => CACHED [vllm-base 4/7] RUN ldconfig /usr/local/cuda-$(echo 12.4.1 | cut -d. -f1,2)/compat/                                                                                                                    0.0s
 => CACHED [base  2/17] RUN echo 'tzdata tzdata/Areas select America' | debconf-set-selections     && echo 'tzdata tzdata/Zones/America select Los_Angeles' | debconf-set-selections     && apt-get update -y     0.0s
 => CACHED [base  3/17] RUN apt-get update -y     && apt-get install -y python3-pip git curl sudo                                                                                                                 0.0s
 => CACHED [base  4/17] RUN ldconfig /usr/local/cuda-$(echo 12.4.1 | cut -d. -f1,2)/compat/                                                                                                                       0.0s
 => CACHED [base  5/17] WORKDIR /workspace                                                                                                                                                                        0.0s
 => CACHED [base  6/17] COPY requirements-common.txt requirements-common.txt                                                                                                                                      0.0s
 => CACHED [base  7/17] COPY requirements-cuda.txt requirements-cuda.txt                                                                                                                                          0.0s
 => CACHED [base  8/17] RUN --mount=type=cache,target=/root/.cache/pip     python3 -m pip install -r requirements-cuda.txt -i https://pypi.tuna.tsinghua.edu.cn/simple                                            0.0s
 => CACHED [base  9/17] COPY requirements-mamba.txt requirements-mamba.txt                                                                                                                                        0.0s
 => CACHED [base 10/17] COPY requirements-mamba.txt requirements-mamba.txt                                                                                                                                        0.0s
 => CACHED [base 11/17] RUN pip install --upgrade pip setuptools wheel -i https://pypi.tuna.tsinghua.edu.cn/simple                                                                                                0.0s
 => CACHED [base 12/17] RUN apt update && apt-get install -y --no-install-recommends     build-essential     vim     git     wget     tmux     python3-dev     libatlas-base-dev     && rm -rf /var/lib/apt/list  0.0s
 => CACHED [base 13/17] RUN ln -s /usr/bin/python3 /usr/bin/python                                                                                                                                                0.0s
 => CACHED [base 14/17] COPY mamba_ssm-2.2.2.tar.gz mamba_ssm-2.2.2.tar.gz                                                                                                                                        0.0s
 => ERROR [base 15/17] RUN tar -xvf mamba_ssm-2.2.2.tar.gz && cd mamba_ssm-2.2.2 &&     python setup.py build &&     python setup.py install                                                                      3.8s
------
 > [base 15/17] RUN tar -xvf mamba_ssm-2.2.2.tar.gz && cd mamba_ssm-2.2.2 &&     python setup.py build &&     python setup.py install:
0.162 mamba_ssm-2.2.2/
0.163 mamba_ssm-2.2.2/AUTHORS
0.163 mamba_ssm-2.2.2/LICENSE
0.164 mamba_ssm-2.2.2/PKG-INFO
0.164 mamba_ssm-2.2.2/README.md
0.164 mamba_ssm-2.2.2/mamba_ssm/
0.165 mamba_ssm-2.2.2/mamba_ssm/__init__.py
0.165 mamba_ssm-2.2.2/mamba_ssm/distributed/
0.165 mamba_ssm-2.2.2/mamba_ssm/distributed/__init__.py
0.165 mamba_ssm-2.2.2/mamba_ssm/distributed/distributed_utils.py
0.165 mamba_ssm-2.2.2/mamba_ssm/distributed/tensor_parallel.py
0.165 mamba_ssm-2.2.2/mamba_ssm/models/
0.166 mamba_ssm-2.2.2/mamba_ssm/models/__init__.py
0.166 mamba_ssm-2.2.2/mamba_ssm/models/config_mamba.py
0.166 mamba_ssm-2.2.2/mamba_ssm/models/mixer_seq_simple.py
0.166 mamba_ssm-2.2.2/mamba_ssm/modules/
0.167 mamba_ssm-2.2.2/mamba_ssm/modules/__init__.py
0.167 mamba_ssm-2.2.2/mamba_ssm/modules/block.py
0.167 mamba_ssm-2.2.2/mamba_ssm/modules/mamba2.py
0.167 mamba_ssm-2.2.2/mamba_ssm/modules/mamba2_simple.py
0.167 mamba_ssm-2.2.2/mamba_ssm/modules/mamba_simple.py
0.167 mamba_ssm-2.2.2/mamba_ssm/modules/mha.py
0.167 mamba_ssm-2.2.2/mamba_ssm/modules/mlp.py
0.167 mamba_ssm-2.2.2/mamba_ssm/modules/ssd_minimal.py
0.167 mamba_ssm-2.2.2/mamba_ssm/ops/
0.167 mamba_ssm-2.2.2/mamba_ssm/ops/__init__.py
0.167 mamba_ssm-2.2.2/mamba_ssm/ops/selective_scan_interface.py
0.167 mamba_ssm-2.2.2/mamba_ssm/ops/triton/
0.167 mamba_ssm-2.2.2/mamba_ssm/ops/triton/__init__.py
0.167 mamba_ssm-2.2.2/mamba_ssm/ops/triton/k_activations.py
0.168 mamba_ssm-2.2.2/mamba_ssm/ops/triton/layer_norm.py
0.168 mamba_ssm-2.2.2/mamba_ssm/ops/triton/layernorm_gated.py
0.168 mamba_ssm-2.2.2/mamba_ssm/ops/triton/selective_state_update.py
0.168 mamba_ssm-2.2.2/mamba_ssm/ops/triton/softplus.py
0.168 mamba_ssm-2.2.2/mamba_ssm/ops/triton/ssd_bmm.py
0.168 mamba_ssm-2.2.2/mamba_ssm/ops/triton/ssd_chunk_scan.py
0.168 mamba_ssm-2.2.2/mamba_ssm/ops/triton/ssd_chunk_state.py
0.168 mamba_ssm-2.2.2/mamba_ssm/ops/triton/ssd_combined.py
0.169 mamba_ssm-2.2.2/mamba_ssm/ops/triton/ssd_state_passing.py
0.169 mamba_ssm-2.2.2/mamba_ssm/utils/
0.170 mamba_ssm-2.2.2/mamba_ssm/utils/__init__.py
0.170 mamba_ssm-2.2.2/mamba_ssm/utils/generation.py
0.170 mamba_ssm-2.2.2/mamba_ssm/utils/hf.py
0.170 mamba_ssm-2.2.2/mamba_ssm.egg-info/
0.171 mamba_ssm-2.2.2/mamba_ssm.egg-info/PKG-INFO
0.171 mamba_ssm-2.2.2/mamba_ssm.egg-info/SOURCES.txt
0.171 mamba_ssm-2.2.2/mamba_ssm.egg-info/dependency_links.txt
0.171 mamba_ssm-2.2.2/mamba_ssm.egg-info/requires.txt
0.171 mamba_ssm-2.2.2/mamba_ssm.egg-info/top_level.txt
0.171 mamba_ssm-2.2.2/setup.cfg
0.171 mamba_ssm-2.2.2/setup.py
3.249 No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
3.291 
3.291 
3.291 torch.__version__  = 2.3.1+cu121
3.291 
3.291 
3.291 running build
3.291 running build_py
3.298 creating build
3.298 creating build/lib.linux-x86_64-cpython-310
3.298 creating build/lib.linux-x86_64-cpython-310/mamba_ssm
3.298 copying mamba_ssm/__init__.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm
3.299 creating build/lib.linux-x86_64-cpython-310/mamba_ssm/distributed
3.299 copying mamba_ssm/distributed/__init__.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/distributed
3.299 copying mamba_ssm/distributed/distributed_utils.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/distributed
3.299 copying mamba_ssm/distributed/tensor_parallel.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/distributed
3.300 creating build/lib.linux-x86_64-cpython-310/mamba_ssm/models
3.300 copying mamba_ssm/models/__init__.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/models
3.300 copying mamba_ssm/models/config_mamba.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/models
3.300 copying mamba_ssm/models/mixer_seq_simple.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/models
3.301 creating build/lib.linux-x86_64-cpython-310/mamba_ssm/modules
3.301 copying mamba_ssm/modules/__init__.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/modules
3.301 copying mamba_ssm/modules/block.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/modules
3.301 copying mamba_ssm/modules/mamba2.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/modules
3.301 copying mamba_ssm/modules/mamba2_simple.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/modules
3.302 copying mamba_ssm/modules/mamba_simple.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/modules
3.302 copying mamba_ssm/modules/mha.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/modules
3.302 copying mamba_ssm/modules/mlp.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/modules
3.302 copying mamba_ssm/modules/ssd_minimal.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/modules
3.303 creating build/lib.linux-x86_64-cpython-310/mamba_ssm/ops
3.303 copying mamba_ssm/ops/__init__.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/ops
3.303 copying mamba_ssm/ops/selective_scan_interface.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/ops
3.303 creating build/lib.linux-x86_64-cpython-310/mamba_ssm/utils
3.303 copying mamba_ssm/utils/__init__.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/utils
3.304 copying mamba_ssm/utils/generation.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/utils
3.304 copying mamba_ssm/utils/hf.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/utils
3.304 creating build/lib.linux-x86_64-cpython-310/mamba_ssm/ops/triton
3.304 copying mamba_ssm/ops/triton/__init__.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/ops/triton
3.304 copying mamba_ssm/ops/triton/k_activations.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/ops/triton
3.305 copying mamba_ssm/ops/triton/layer_norm.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/ops/triton
3.305 copying mamba_ssm/ops/triton/layernorm_gated.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/ops/triton
3.305 copying mamba_ssm/ops/triton/selective_state_update.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/ops/triton
3.305 copying mamba_ssm/ops/triton/softplus.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/ops/triton
3.305 copying mamba_ssm/ops/triton/ssd_bmm.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/ops/triton
3.306 copying mamba_ssm/ops/triton/ssd_chunk_scan.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/ops/triton
3.306 copying mamba_ssm/ops/triton/ssd_chunk_state.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/ops/triton
3.306 copying mamba_ssm/ops/triton/ssd_combined.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/ops/triton
3.306 copying mamba_ssm/ops/triton/ssd_state_passing.py -> build/lib.linux-x86_64-cpython-310/mamba_ssm/ops/triton
3.308 running build_ext
3.350 /usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py:418: UserWarning: The detected CUDA version (12.4) has a minor version mismatch with the version that was used to compile PyTorch (12.1). Most likely this shouldn't be a problem.
3.350   warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
3.350 /usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py:428: UserWarning: There are no x86_64-linux-gnu-g++ version bounds defined for CUDA version 12.4
3.350   warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
3.350 building 'selective_scan_cuda' extension
3.351 creating /workspace/mamba_ssm-2.2.2/build/temp.linux-x86_64-cpython-310
3.351 creating /workspace/mamba_ssm-2.2.2/build/temp.linux-x86_64-cpython-310/csrc
3.351 creating /workspace/mamba_ssm-2.2.2/build/temp.linux-x86_64-cpython-310/csrc/selective_scan
3.384 Emitting ninja build file /workspace/mamba_ssm-2.2.2/build/temp.linux-x86_64-cpython-310/build.ninja...
3.384 Compiling objects...
3.384 Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
3.411 ninja: error: '/workspace/mamba_ssm-2.2.2/csrc/selective_scan/selective_scan.cpp', needed by '/workspace/mamba_ssm-2.2.2/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/selective_scan.o', missing and no known rule to make it
3.414 Traceback (most recent call last):
3.414   File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 2107, in _run_ninja_build
3.414     subprocess.run(
3.414   File "/usr/lib/python3.10/subprocess.py", line 526, in run
3.415     raise CalledProcessError(retcode, process.args,
3.415 subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
3.415 
3.415 The above exception was the direct cause of the following exception:
3.415 
3.415 Traceback (most recent call last):
3.415   File "/workspace/mamba_ssm-2.2.2/setup.py", line 337, in <module>
3.415     setup(
3.415   File "/usr/local/lib/python3.10/dist-packages/setuptools/__init__.py", line 103, in setup
3.415     return distutils.core.setup(**attrs)
3.415   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/core.py", line 184, in setup
3.415     return run_commands(dist)
3.415   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/core.py", line 200, in run_commands
3.415     dist.run_commands()
3.415   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 970, in run_commands
3.415     self.run_command(cmd)
3.415   File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 974, in run_command
3.415     super().run_command(command)
3.415   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 989, in run_command
3.415     cmd_obj.run()
3.415   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build.py", line 135, in run
3.415     self.run_command(cmd_name)
3.415   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py", line 316, in run_command
3.415     self.distribution.run_command(command)
3.415   File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 974, in run_command
3.416     super().run_command(command)
3.416   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 989, in run_command
3.416     cmd_obj.run()
3.416   File "/usr/local/lib/python3.10/dist-packages/setuptools/command/build_ext.py", line 93, in run
3.416     _build_ext.run(self)
3.416   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
3.416     self.build_extensions()
3.416   File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 870, in build_extensions
3.416     build_ext.build_extensions(self)
3.416   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py", line 479, in build_extensions
3.416     self._build_extensions_serial()
3.416   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py", line 505, in _build_extensions_serial
3.416     self.build_extension(ext)
3.416   File "/usr/local/lib/python3.10/dist-packages/setuptools/command/build_ext.py", line 254, in build_extension
3.416     _build_ext.build_extension(self, ext)
3.416   File "/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/command/build_ext.py", line 560, in build_extension
3.416     objects = self.compiler.compile(
3.416   File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 683, in unix_wrap_ninja_compile
3.417     _write_ninja_file_and_compile_objects(
3.417   File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1783, in _write_ninja_file_and_compile_objects
3.417     _run_ninja_build(
3.417   File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 2123, in _run_ninja_build
3.417     raise RuntimeError(message) from e
3.417 RuntimeError: Error compiling objects for extension
------
NewDockerfile:76
--------------------
  75 |     # RUN pip install --no-index mamba_ssm-1.2.2.tar.gz
  76 | >>> RUN tar -xvf mamba_ssm-2.2.2.tar.gz && cd mamba_ssm-2.2.2 && \
  77 | >>>     python setup.py build && \
  78 | >>>     python setup.py install
  79 |     COPY causal_conv1d-1.2.1.tar.gz causal_conv1d-1.2.1.tar.gz
--------------------
ERROR: failed to solve: process "/bin/sh -c tar -xvf mamba_ssm-2.2.2.tar.gz && cd mamba_ssm-2.2.2 &&     python setup.py build &&     python setup.py install" did not complete successfully: exit code: 1

I'm guessing that the latest torch's torch/utils/cpp_extension.py no longer works correctly?
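
(For what it's worth, the ninja error above complains that csrc/selective_scan/selective_scan.cpp is missing, and the tar listing earlier in the log shows no csrc/ directory in the sdist at all, so the CUDA sources were never in the tarball. An untested workaround sketch is to build from the git repository instead, which does ship csrc/; the tag name here is assumed to match the release:

RUN git clone https://github.com/state-spaces/mamba.git && \
    cd mamba && git checkout v2.2.2 && \
    python3 -m pip install . --no-build-isolation)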