vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Installation]: ERROR: Failed to build installable wheels for some pyproject.toml based projects (vllm, xformers, vllm-nccl-cu12) #7135

Open chaoskklt opened 2 months ago

chaoskklt commented 2 months ago

Your current environment

Python version: 3.12.3
PyTorch version: 2.3.1+cu121

How you are installing vllm

pip install vllm

Building wheels for collected packages: vllm, xformers, vllm-nccl-cu12
Building wheel for vllm (pyproject.toml) ... error
error: subprocess-exited-with-error

× Building wheel for vllm (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [339 lines of output]
  /mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/dist.py:447: SetuptoolsDeprecationWarning: Invalid dash-separated options
  !!

          ********************************************************************************
          Usage of dash-separated 'index-url' will not be supported in future
          versions. Please use the underscore name 'index_url' instead.

          By 2024-Sep-26, you need to update your project and remove deprecated calls
          or your builds will no longer be supported.

          See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
          ********************************************************************************

  !!
    opt = self.warn_dash_deprecation(opt, section)
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-cpython-312
  creating build/lib.linux-x86_64-cpython-312/vllm
  copying vllm/logger.py -> build/lib.linux-x86_64-cpython-312/vllm
  copying vllm/config.py -> build/lib.linux-x86_64-cpython-312/vllm
  copying vllm/utils.py -> build/lib.linux-x86_64-cpython-312/vllm
  copying vllm/sampling_params.py -> build/lib.linux-x86_64-cpython-312/vllm
  copying vllm/sequence.py -> build/lib.linux-x86_64-cpython-312/vllm
  copying vllm/envs.py -> build/lib.linux-x86_64-cpython-312/vllm
  copying vllm/outputs.py -> build/lib.linux-x86_64-cpython-312/vllm
  copying vllm/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm
  copying vllm/block.py -> build/lib.linux-x86_64-cpython-312/vllm
  copying vllm/test_utils.py -> build/lib.linux-x86_64-cpython-312/vllm
  copying vllm/_custom_ops.py -> build/lib.linux-x86_64-cpython-312/vllm
  creating build/lib.linux-x86_64-cpython-312/vllm/lora
  copying vllm/lora/utils.py -> build/lib.linux-x86_64-cpython-312/vllm/lora
  copying vllm/lora/request.py -> build/lib.linux-x86_64-cpython-312/vllm/lora
  copying vllm/lora/layers.py -> build/lib.linux-x86_64-cpython-312/vllm/lora
  copying vllm/lora/lora.py -> build/lib.linux-x86_64-cpython-312/vllm/lora
  copying vllm/lora/punica.py -> build/lib.linux-x86_64-cpython-312/vllm/lora
  copying vllm/lora/models.py -> build/lib.linux-x86_64-cpython-312/vllm/lora
  copying vllm/lora/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/lora
  copying vllm/lora/worker_manager.py -> build/lib.linux-x86_64-cpython-312/vllm/lora
  copying vllm/lora/fully_sharded_layers.py -> build/lib.linux-x86_64-cpython-312/vllm/lora
  creating build/lib.linux-x86_64-cpython-312/vllm/transformers_utils
  copying vllm/transformers_utils/config.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils
  copying vllm/transformers_utils/tokenizer.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils
  copying vllm/transformers_utils/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils
  copying vllm/transformers_utils/detokenizer.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils
  creating build/lib.linux-x86_64-cpython-312/vllm/engine
  copying vllm/engine/llm_engine.py -> build/lib.linux-x86_64-cpython-312/vllm/engine
  copying vllm/engine/metrics.py -> build/lib.linux-x86_64-cpython-312/vllm/engine
  copying vllm/engine/arg_utils.py -> build/lib.linux-x86_64-cpython-312/vllm/engine
  copying vllm/engine/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/engine
  copying vllm/engine/async_llm_engine.py -> build/lib.linux-x86_64-cpython-312/vllm/engine
  creating build/lib.linux-x86_64-cpython-312/vllm/executor
  copying vllm/executor/multiproc_worker_utils.py -> build/lib.linux-x86_64-cpython-312/vllm/executor
  copying vllm/executor/distributed_gpu_executor.py -> build/lib.linux-x86_64-cpython-312/vllm/executor
  copying vllm/executor/gpu_executor.py -> build/lib.linux-x86_64-cpython-312/vllm/executor
  copying vllm/executor/ray_gpu_executor.py -> build/lib.linux-x86_64-cpython-312/vllm/executor
  copying vllm/executor/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/executor
  copying vllm/executor/executor_base.py -> build/lib.linux-x86_64-cpython-312/vllm/executor
  copying vllm/executor/cpu_executor.py -> build/lib.linux-x86_64-cpython-312/vllm/executor
  copying vllm/executor/ray_utils.py -> build/lib.linux-x86_64-cpython-312/vllm/executor
  copying vllm/executor/neuron_executor.py -> build/lib.linux-x86_64-cpython-312/vllm/executor
  creating build/lib.linux-x86_64-cpython-312/vllm/worker
  copying vllm/worker/cache_engine.py -> build/lib.linux-x86_64-cpython-312/vllm/worker
  copying vllm/worker/worker_base.py -> build/lib.linux-x86_64-cpython-312/vllm/worker
  copying vllm/worker/neuron_worker.py -> build/lib.linux-x86_64-cpython-312/vllm/worker
  copying vllm/worker/cpu_worker.py -> build/lib.linux-x86_64-cpython-312/vllm/worker
  copying vllm/worker/neuron_model_runner.py -> build/lib.linux-x86_64-cpython-312/vllm/worker
  copying vllm/worker/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/worker
  copying vllm/worker/cpu_model_runner.py -> build/lib.linux-x86_64-cpython-312/vllm/worker
  copying vllm/worker/model_runner.py -> build/lib.linux-x86_64-cpython-312/vllm/worker
  copying vllm/worker/worker.py -> build/lib.linux-x86_64-cpython-312/vllm/worker
  creating build/lib.linux-x86_64-cpython-312/vllm/model_executor
  copying vllm/model_executor/utils.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor
  copying vllm/model_executor/sampling_metadata.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor
  copying vllm/model_executor/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor
  creating build/lib.linux-x86_64-cpython-312/vllm/entrypoints
  copying vllm/entrypoints/llm.py -> build/lib.linux-x86_64-cpython-312/vllm/entrypoints
  copying vllm/entrypoints/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/entrypoints
  copying vllm/entrypoints/api_server.py -> build/lib.linux-x86_64-cpython-312/vllm/entrypoints
  creating build/lib.linux-x86_64-cpython-312/vllm/logging
  copying vllm/logging/formatter.py -> build/lib.linux-x86_64-cpython-312/vllm/logging
  copying vllm/logging/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/logging
  creating build/lib.linux-x86_64-cpython-312/vllm/core
  copying vllm/core/interfaces.py -> build/lib.linux-x86_64-cpython-312/vllm/core
  copying vllm/core/scheduler.py -> build/lib.linux-x86_64-cpython-312/vllm/core
  copying vllm/core/evictor_v2.py -> build/lib.linux-x86_64-cpython-312/vllm/core
  copying vllm/core/policy.py -> build/lib.linux-x86_64-cpython-312/vllm/core
  copying vllm/core/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/core
  copying vllm/core/block_manager_v2.py -> build/lib.linux-x86_64-cpython-312/vllm/core
  copying vllm/core/evictor_v1.py -> build/lib.linux-x86_64-cpython-312/vllm/core
  copying vllm/core/block_manager_v1.py -> build/lib.linux-x86_64-cpython-312/vllm/core
  creating build/lib.linux-x86_64-cpython-312/vllm/usage
  copying vllm/usage/usage_lib.py -> build/lib.linux-x86_64-cpython-312/vllm/usage
  copying vllm/usage/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/usage
  creating build/lib.linux-x86_64-cpython-312/vllm/spec_decode
  copying vllm/spec_decode/interfaces.py -> build/lib.linux-x86_64-cpython-312/vllm/spec_decode
  copying vllm/spec_decode/util.py -> build/lib.linux-x86_64-cpython-312/vllm/spec_decode
  copying vllm/spec_decode/top1_proposer.py -> build/lib.linux-x86_64-cpython-312/vllm/spec_decode
  copying vllm/spec_decode/spec_decode_worker.py -> build/lib.linux-x86_64-cpython-312/vllm/spec_decode
  copying vllm/spec_decode/metrics.py -> build/lib.linux-x86_64-cpython-312/vllm/spec_decode
  copying vllm/spec_decode/multi_step_worker.py -> build/lib.linux-x86_64-cpython-312/vllm/spec_decode
  copying vllm/spec_decode/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/spec_decode
  copying vllm/spec_decode/ngram_worker.py -> build/lib.linux-x86_64-cpython-312/vllm/spec_decode
  copying vllm/spec_decode/batch_expansion.py -> build/lib.linux-x86_64-cpython-312/vllm/spec_decode
  creating build/lib.linux-x86_64-cpython-312/vllm/attention
  copying vllm/attention/selector.py -> build/lib.linux-x86_64-cpython-312/vllm/attention
  copying vllm/attention/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/attention
  copying vllm/attention/layer.py -> build/lib.linux-x86_64-cpython-312/vllm/attention
  creating build/lib.linux-x86_64-cpython-312/vllm/distributed
  copying vllm/distributed/utils.py -> build/lib.linux-x86_64-cpython-312/vllm/distributed
  copying vllm/distributed/communication_op.py -> build/lib.linux-x86_64-cpython-312/vllm/distributed
  copying vllm/distributed/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/distributed
  copying vllm/distributed/parallel_state.py -> build/lib.linux-x86_64-cpython-312/vllm/distributed
  creating build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/tokenizer_group
  copying vllm/transformers_utils/tokenizer_group/base_tokenizer_group.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/tokenizer_group
  copying vllm/transformers_utils/tokenizer_group/tokenizer_group.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/tokenizer_group
  copying vllm/transformers_utils/tokenizer_group/ray_tokenizer_group.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/tokenizer_group
  copying vllm/transformers_utils/tokenizer_group/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/tokenizer_group
  creating build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/tokenizers
  copying vllm/transformers_utils/tokenizers/baichuan.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/tokenizers
  copying vllm/transformers_utils/tokenizers/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/tokenizers
  creating build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/configs
  copying vllm/transformers_utils/configs/jais.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/configs
  copying vllm/transformers_utils/configs/chatglm.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/configs
  copying vllm/transformers_utils/configs/dbrx.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/configs
  copying vllm/transformers_utils/configs/mpt.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/configs
  copying vllm/transformers_utils/configs/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/configs
  copying vllm/transformers_utils/configs/falcon.py -> build/lib.linux-x86_64-cpython-312/vllm/transformers_utils/configs
  creating build/lib.linux-x86_64-cpython-312/vllm/engine/output_processor
  copying vllm/engine/output_processor/interfaces.py -> build/lib.linux-x86_64-cpython-312/vllm/engine/output_processor
  copying vllm/engine/output_processor/single_step.py -> build/lib.linux-x86_64-cpython-312/vllm/engine/output_processor
  copying vllm/engine/output_processor/util.py -> build/lib.linux-x86_64-cpython-312/vllm/engine/output_processor
  copying vllm/engine/output_processor/stop_checker.py -> build/lib.linux-x86_64-cpython-312/vllm/engine/output_processor
  copying vllm/engine/output_processor/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/engine/output_processor
  copying vllm/engine/output_processor/multi_step.py -> build/lib.linux-x86_64-cpython-312/vllm/engine/output_processor
  creating build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers
  copying vllm/model_executor/layers/layernorm.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers
  copying vllm/model_executor/layers/rotary_embedding.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers
  copying vllm/model_executor/layers/activation.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers
  copying vllm/model_executor/layers/sampler.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers
  copying vllm/model_executor/layers/linear.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers
  copying vllm/model_executor/layers/logits_processor.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers
  copying vllm/model_executor/layers/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers
  copying vllm/model_executor/layers/rejection_sampler.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers
  copying vllm/model_executor/layers/vocab_parallel_embedding.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers
  creating build/lib.linux-x86_64-cpython-312/vllm/model_executor/model_loader
  copying vllm/model_executor/model_loader/utils.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/model_loader
  copying vllm/model_executor/model_loader/weight_utils.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/model_loader
  copying vllm/model_executor/model_loader/tensorizer.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/model_loader
  copying vllm/model_executor/model_loader/loader.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/model_loader
  copying vllm/model_executor/model_loader/neuron.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/model_loader
  copying vllm/model_executor/model_loader/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/model_loader
  creating build/lib.linux-x86_64-cpython-312/vllm/model_executor/guided_decoding
  copying vllm/model_executor/guided_decoding/outlines_decoding.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/guided_decoding
  copying vllm/model_executor/guided_decoding/lm_format_enforcer_decoding.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/guided_decoding
  copying vllm/model_executor/guided_decoding/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/guided_decoding
  copying vllm/model_executor/guided_decoding/outlines_logits_processors.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/guided_decoding
  creating build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/internlm2.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/llama.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/minicpm.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/qwen.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/jais.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/mixtral_quant.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/baichuan.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/opt.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/olmo.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/decilm.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/bloom.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/starcoder2.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/gpt_j.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/chatglm.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/deepseek.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/gpt2.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/gemma.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/stablelm.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/llava.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/gpt_neox.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/orion.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/dbrx.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/phi.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/mpt.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/gpt_bigcode.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/qwen2_moe.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/falcon.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/commandr.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/qwen2.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/xverse.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  copying vllm/model_executor/models/mixtral.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/models
  creating build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/ops
  copying vllm/model_executor/layers/ops/rand.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/ops
  copying vllm/model_executor/layers/ops/sample.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/ops
  copying vllm/model_executor/layers/ops/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/ops
  creating build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/quantization
  copying vllm/model_executor/layers/quantization/marlin.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/quantization
  copying vllm/model_executor/layers/quantization/base_config.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/quantization
  copying vllm/model_executor/layers/quantization/aqlm.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/quantization
  copying vllm/model_executor/layers/quantization/gptq.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/quantization
  copying vllm/model_executor/layers/quantization/gptq_marlin.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/quantization
  copying vllm/model_executor/layers/quantization/fp8.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/quantization
  copying vllm/model_executor/layers/quantization/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/quantization
  copying vllm/model_executor/layers/quantization/awq.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/quantization
  copying vllm/model_executor/layers/quantization/schema.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/quantization
  copying vllm/model_executor/layers/quantization/squeezellm.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/quantization
  creating build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe
  copying vllm/model_executor/layers/fused_moe/fused_moe.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe
  copying vllm/model_executor/layers/fused_moe/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe
  creating build/lib.linux-x86_64-cpython-312/vllm/entrypoints/openai
  copying vllm/entrypoints/openai/protocol.py -> build/lib.linux-x86_64-cpython-312/vllm/entrypoints/openai
  copying vllm/entrypoints/openai/serving_engine.py -> build/lib.linux-x86_64-cpython-312/vllm/entrypoints/openai
  copying vllm/entrypoints/openai/serving_chat.py -> build/lib.linux-x86_64-cpython-312/vllm/entrypoints/openai
  copying vllm/entrypoints/openai/serving_completion.py -> build/lib.linux-x86_64-cpython-312/vllm/entrypoints/openai
  copying vllm/entrypoints/openai/cli_args.py -> build/lib.linux-x86_64-cpython-312/vllm/entrypoints/openai
  copying vllm/entrypoints/openai/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/entrypoints/openai
  copying vllm/entrypoints/openai/api_server.py -> build/lib.linux-x86_64-cpython-312/vllm/entrypoints/openai
  creating build/lib.linux-x86_64-cpython-312/vllm/core/block
  copying vllm/core/block/interfaces.py -> build/lib.linux-x86_64-cpython-312/vllm/core/block
  copying vllm/core/block/cpu_gpu_block_allocator.py -> build/lib.linux-x86_64-cpython-312/vllm/core/block
  copying vllm/core/block/naive_block.py -> build/lib.linux-x86_64-cpython-312/vllm/core/block
  copying vllm/core/block/common.py -> build/lib.linux-x86_64-cpython-312/vllm/core/block
  copying vllm/core/block/block_table.py -> build/lib.linux-x86_64-cpython-312/vllm/core/block
  copying vllm/core/block/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/core/block
  copying vllm/core/block/prefix_caching_block.py -> build/lib.linux-x86_64-cpython-312/vllm/core/block
  creating build/lib.linux-x86_64-cpython-312/vllm/attention/ops
  copying vllm/attention/ops/triton_flash_attention.py -> build/lib.linux-x86_64-cpython-312/vllm/attention/ops
  copying vllm/attention/ops/paged_attn.py -> build/lib.linux-x86_64-cpython-312/vllm/attention/ops
  copying vllm/attention/ops/prefix_prefill.py -> build/lib.linux-x86_64-cpython-312/vllm/attention/ops
  copying vllm/attention/ops/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/attention/ops
  creating build/lib.linux-x86_64-cpython-312/vllm/attention/backends
  copying vllm/attention/backends/abstract.py -> build/lib.linux-x86_64-cpython-312/vllm/attention/backends
  copying vllm/attention/backends/flash_attn.py -> build/lib.linux-x86_64-cpython-312/vllm/attention/backends
  copying vllm/attention/backends/rocm_flash_attn.py -> build/lib.linux-x86_64-cpython-312/vllm/attention/backends
  copying vllm/attention/backends/flashinfer.py -> build/lib.linux-x86_64-cpython-312/vllm/attention/backends
  copying vllm/attention/backends/xformers.py -> build/lib.linux-x86_64-cpython-312/vllm/attention/backends
  copying vllm/attention/backends/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/attention/backends
  copying vllm/attention/backends/torch_sdpa.py -> build/lib.linux-x86_64-cpython-312/vllm/attention/backends
  creating build/lib.linux-x86_64-cpython-312/vllm/distributed/device_communicators
  copying vllm/distributed/device_communicators/custom_all_reduce.py -> build/lib.linux-x86_64-cpython-312/vllm/distributed/device_communicators
  copying vllm/distributed/device_communicators/pynccl.py -> build/lib.linux-x86_64-cpython-312/vllm/distributed/device_communicators
  copying vllm/distributed/device_communicators/pynccl_utils.py -> build/lib.linux-x86_64-cpython-312/vllm/distributed/device_communicators
  copying vllm/distributed/device_communicators/__init__.py -> build/lib.linux-x86_64-cpython-312/vllm/distributed/device_communicators
  copying vllm/py.typed -> build/lib.linux-x86_64-cpython-312/vllm
  creating build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=16,N=2688,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=2048,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=16,N=2688,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=2048,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=4096,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_H100_80GB_HBM3,dtype=float8.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=4096,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_H100_80GB_HBM3,dtype=float8.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=16,N=1344,device_name=NVIDIA_A100-SXM4-40GB.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=16,N=1344,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=16,N=1344,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_A100-SXM4-40GB.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_A100-SXM4-40GB.json -> build/lib.linux-x86_64-cpython-312/vllm/model_executor/layers/fused_moe/configs
  running build_ext
  -- The CXX compiler identification is GNU 10.2.1
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Build type: RelWithDebInfo
  -- Target device: cuda
  -- Found Python: /mnt/miniconda3/bin/python (found version "3.12.3") found components: Interpreter Development.Module
  CMake Error at cmake/utils.cmake:15 (message):
    Python version (3.12) is not one of the supported versions:
    3.8;3.9;3.10;3.11.
  Call Stack (most recent call first):
    CMakeLists.txt:43 (find_python_from_executable)

  -- Configuring incomplete, errors occurred!
  Traceback (most recent call last):
    File "/mnt/miniconda3/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
      main()
    File "/mnt/miniconda3/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/miniconda3/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
      return _build_backend().build_wheel(wheel_directory, config_settings,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 415, in build_wheel
      return self._build_with_temp_dir(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 397, in _build_with_temp_dir
      self.run_setup()
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/build_meta.py", line 313, in run_setup
      exec(code, locals())
    File "<string>", line 397, in <module>
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/__init__.py", line 108, in setup
      return distutils.core.setup(**attrs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/_distutils/core.py", line 184, in setup
      return run_commands(dist)
             ^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
      dist.run_commands()
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 970, in run_commands
      self.run_command(cmd)
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/dist.py", line 945, in run_command
      super().run_command(command)
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 989, in run_command
      cmd_obj.run()
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/command/bdist_wheel.py", line 373, in run
      self.run_command("build")
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/dist.py", line 945, in run_command
      super().run_command(command)
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 989, in run_command
      cmd_obj.run()
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/_distutils/command/build.py", line 135, in run
      self.run_command(cmd_name)
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/dist.py", line 945, in run_command
      super().run_command(command)
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 989, in run_command
      cmd_obj.run()
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/command/build_ext.py", line 93, in run
      _build_ext.run(self)
    File "/mnt/sd3/pip-build-env-vtc555m0/overlay/lib/python3.12/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
      self.build_extensions()
    File "<string>", line 192, in build_extensions
    File "<string>", line 175, in configure
    File "/mnt/miniconda3/lib/python3.12/subprocess.py", line 413, in check_call
      raise CalledProcessError(retcode, cmd)
  subprocess.CalledProcessError: Command '['cmake', '/mnt/sd3/pip-install-swmiw4rm/vllm_daf0194c4f3849a1a7f646dabb866bbf', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/mnt/sd3/pip-install-swmiw4rm/vllm_daf0194c4f3849a1a7f646dabb866bbf/build/lib.linux-x86_64-cpython-312/vllm', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=build/temp.linux-x86_64-cpython-312', '-DVLLM_TARGET_DEVICE=cuda', '-DVLLM_PYTHON_EXECUTABLE=/mnt/miniconda3/bin/python', '-DNVCC_THREADS=1', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=16']' returned non-zero exit status 1.
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for vllm
Building wheel for xformers (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [52 lines of output]
  Looks like we are using CUDA 12.1 which segfaults when provided with the -generate-line-info flag. Disabling it.
  Looks like we are using CUDA 12.1 which segfaults when provided with the -generate-line-info flag. Disabling it.
  fatal: not a git repository (or any parent up to mount point /)
  Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
  /mnt/sd3/setuptools/dist.py:476: SetuptoolsDeprecationWarning: Invalid dash-separated options
  !!

          ********************************************************************************
          Usage of dash-separated 'index-url' will not be supported in future
          versions. Please use the underscore name 'index_url' instead.

          By 2024-Sep-26, you need to update your project and remove deprecated calls
          or your builds will no longer be supported.

          See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
          ********************************************************************************

  !!
    opt = self.warn_dash_deprecation(opt, section)
  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/mnt/sd3/pip-install-swmiw4rm/xformers_df52359e03934625bed1e6135371227b/setup.py", line 485, in <module>
      setuptools.setup(
    File "/mnt/sd3/setuptools/__init__.py", line 103, in setup
      return distutils.core.setup(**attrs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/setuptools/_distutils/core.py", line 171, in setup
      ok = dist.parse_command_line()
           ^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/setuptools/_distutils/dist.py", line 476, in parse_command_line
      args = self._parse_command_opts(parser, args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/setuptools/dist.py", line 870, in _parse_command_opts
      nargs = _Distribution._parse_command_opts(self, parser, args)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/setuptools/_distutils/dist.py", line 535, in _parse_command_opts
      cmd_class = self.get_command_class(command)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/setuptools/dist.py", line 715, in get_command_class
      self.cmdclass[command] = cmdclass = ep.load()
                                          ^^^^^^^^^
    File "/mnt/miniconda3/lib/python3.12/importlib/metadata/__init__.py", line 205, in load
      module = import_module(match.group('module'))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/miniconda3/lib/python3.12/importlib/__init__.py", line 90, in import_module
      return _bootstrap._gcd_import(name[level:], package, level)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
    File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
    File "<frozen importlib._bootstrap>", line 1324, in _find_and_load_unlocked
  ModuleNotFoundError: No module named 'setuptools.command.bdist_wheel'
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for xformers
Running setup.py clean for xformers
Building wheel for vllm-nccl-cu12 (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [50 lines of output]
  nccl package already exists at /root/.config/vllm/nccl/cu12/libnccl.so.2.18.1
  md5 hash of downloaded file matches expected hash
  /mnt/sd3/setuptools/dist.py:476: SetuptoolsDeprecationWarning: Invalid dash-separated options
  !!

          ********************************************************************************
          Usage of dash-separated 'index-url' will not be supported in future
          versions. Please use the underscore name 'index_url' instead.

          By 2024-Sep-26, you need to update your project and remove deprecated calls
          or your builds will no longer be supported.

          See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
          ********************************************************************************

  !!
    opt = self.warn_dash_deprecation(opt, section)
  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/mnt/sd3/pip-install-swmiw4rm/vllm-nccl-cu12_15fb9ff3eaed4e80affa43b988747706/setup.py", line 92, in <module>
      setup(
    File "/mnt/sd3/setuptools/__init__.py", line 103, in setup
      return distutils.core.setup(**attrs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/setuptools/_distutils/core.py", line 171, in setup
      ok = dist.parse_command_line()
           ^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/setuptools/_distutils/dist.py", line 476, in parse_command_line
      args = self._parse_command_opts(parser, args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/setuptools/dist.py", line 870, in _parse_command_opts
      nargs = _Distribution._parse_command_opts(self, parser, args)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/setuptools/_distutils/dist.py", line 535, in _parse_command_opts
      cmd_class = self.get_command_class(command)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/sd3/setuptools/dist.py", line 715, in get_command_class
      self.cmdclass[command] = cmdclass = ep.load()
                                          ^^^^^^^^^
    File "/mnt/miniconda3/lib/python3.12/importlib/metadata/__init__.py", line 205, in load
      module = import_module(match.group('module'))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/mnt/miniconda3/lib/python3.12/importlib/__init__.py", line 90, in import_module
      return _bootstrap._gcd_import(name[level:], package, level)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
    File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
    File "<frozen importlib._bootstrap>", line 1324, in _find_and_load_unlocked
  ModuleNotFoundError: No module named 'setuptools.command.bdist_wheel'
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for vllm-nccl-cu12
Running setup.py clean for vllm-nccl-cu12
Failed to build vllm xformers vllm-nccl-cu12
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (vllm, xformers, vllm-nccl-cu12)

linboyang commented 2 months ago

You may want to check your Python version, because I see this error:

"CMake Error at cmake/utils.cmake:15 (message): Python version (3.12) is not one of the supported versions: 3.8;3.9;3.10;3.11. Call Stack (most recent call first): CMakeLists.txt:43 (find_python_from_executable)"
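For anyone hitting the same failure, here is a minimal sketch of a pre-install check. It assumes the supported-version list is exactly the one in the CMake error quoted above (3.8 through 3.11), which may differ in other vllm releases. It also checks for `setuptools.command.bdist_wheel`, the module the xformers and vllm-nccl-cu12 tracebacks report as missing. The names `SUPPORTED_PYTHON_VERSIONS`, `is_supported_python`, and `has_module` are made up for this example and are not part of vllm or setuptools.

```python
import importlib.util
import sys

# Assumption: copied from the CMake error message quoted in this thread.
SUPPORTED_PYTHON_VERSIONS = {(3, 8), (3, 9), (3, 10), (3, 11)}


def is_supported_python(version=None):
    """Return True if (major, minor) appears in the list from the CMake error."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) in SUPPORTED_PYTHON_VERSIONS


def has_module(name):
    """Return True if `name` can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises instead of returning None.
        return False


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    print(f"Python {major}.{minor} in supported list: {is_supported_python()}")
    print("setuptools.command.bdist_wheel found:",
          has_module("setuptools.command.bdist_wheel"))
```

Run it with the same interpreter that pip uses (`/mnt/miniconda3/bin/python` in the log above). If the first check prints `False`, the vllm CMake step will likely stop with the error quoted above; if the second prints `False`, the setuptools that interpreter picks up lacks the module named in the two `ModuleNotFoundError` tracebacks.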

chaoskklt commented 2 months ago

Thank you for your reply. So do I need to downgrade Python to a version that vllm supports? Could you provide a quick installation command?