MARIO-Math-Reasoning / vllm

Apache License 2.0

Some trouble about installing vllm in MARIO-Math-Reasoning #1

Open hanselcui opened 1 month ago

hanselcui commented 1 month ago

Thanks for publishing this customized version of vllm. I tried to install it by following readme.md, but the wheel build failed. The error message is as follows:

Building wheels for collected packages: vllm
  Building wheel for vllm (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for vllm (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [242 lines of output]
      /tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
        device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
      No CUDA runtime is found, using CUDA_HOME='/home/hanc/miniconda3/envs/mario'
      running bdist_wheel
      running build
      running build_py
      copying vllm/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/block.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/utils.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/test_utils.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/outputs.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/sequence.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/sampling_params.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/config.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/logger.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying tests/core/test_block_manager.py -> build/lib.linux-x86_64-cpython-310/tests/core
      copying tests/core/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/core
      copying tests/core/utils.py -> build/lib.linux-x86_64-cpython-310/tests/core
      copying tests/core/test_scheduler.py -> build/lib.linux-x86_64-cpython-310/tests/core
      copying tests/lora/test_layers.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_llama.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_punica.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/conftest.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/utils.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_layer_variation.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_utils.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_gemma.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_lora.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_lora_manager.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_worker.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_mixtral.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/worker/test_model_runner.py -> build/lib.linux-x86_64-cpython-310/tests/worker
      copying tests/worker/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/worker
      copying tests/worker/test_swap.py -> build/lib.linux-x86_64-cpython-310/tests/worker
      copying tests/tokenization/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/tokenization
      copying tests/tokenization/test_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/tests/tokenization
      copying tests/tokenization/test_detokenize.py -> build/lib.linux-x86_64-cpython-310/tests/tokenization
      copying tests/tokenization/test_cached_tokenizer.py -> build/lib.linux-x86_64-cpython-310/tests/tokenization
      copying tests/spec_decode/test_metrics.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/test_spec_decode_worker.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/test_multi_step_worker.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/utils.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/test_utils.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/test_batch_expansion.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying vllm/core/evictor.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/core/block_manager.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/core/scheduler.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/core/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/core/policy.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/lora/request.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/lora.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/models.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/punica.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/layers.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/worker_manager.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/engine/llm_engine.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/ray_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/async_llm_engine.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/metrics.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/arg_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/worker/cache_engine.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/worker.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/neuron_worker.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/neuron_model_runner.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/model_runner.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/model_executor/neuron_model_loader.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/model_loader.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/guided_decoding.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/guided_logits_processors.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/input_metadata.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/weight_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/sampling_metadata.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/executor/gpu_executor.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/neuron_executor.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/executor_base.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/ray_gpu_executor.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/transformers_utils/tokenizer.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils
      copying vllm/transformers_utils/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils
      copying vllm/transformers_utils/config.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils
      copying vllm/entrypoints/api_server.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints
      copying vllm/entrypoints/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints
      copying vllm/entrypoints/llm.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints
      copying vllm/model_executor/parallel_utils/communication_op.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/parallel_state.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/custom_all_reduce.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/cupy_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/models/gpt_neox.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/decilm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/jais.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/qwen.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/stablelm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/llama.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/qwen2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/baichuan.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/deepseek.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/mixtral_quant.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/starcoder2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/phi.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/gemma.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/orion.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/chatglm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/gpt_j.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/internlm2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/gpt_bigcode.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/modeling_value_head.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/olmo.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/gpt2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/falcon.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/mpt.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/opt.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/bloom.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/mixtral.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/layers/activation.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/logits_processor.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/vocab_parallel_embedding.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/linear.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/rotary_embedding.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/layernorm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/sampler.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/rejection_sampler.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/fused_moe/fused_moe.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe
      copying vllm/model_executor/layers/fused_moe/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe
      copying vllm/model_executor/layers/ops/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/ops
      copying vllm/model_executor/layers/ops/rand.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/ops
      copying vllm/model_executor/layers/ops/sample.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/ops
      copying vllm/model_executor/layers/quantization/gptq.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/awq.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/marlin.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/squeezellm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/base_config.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/attention/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention
      copying vllm/model_executor/layers/attention/attention.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention
      copying vllm/model_executor/layers/attention/ops/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/ops
      copying vllm/model_executor/layers/attention/ops/paged_attn.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/ops
      copying vllm/model_executor/layers/attention/ops/prefix_prefill.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/ops
      copying vllm/model_executor/layers/attention/backends/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/backends
      copying vllm/model_executor/layers/attention/backends/xformers.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/backends
      copying vllm/model_executor/layers/attention/backends/flash_attn.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/backends
      copying vllm/transformers_utils/tokenizer_group/base_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group
      copying vllm/transformers_utils/tokenizer_group/ray_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group
      copying vllm/transformers_utils/tokenizer_group/tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group
      copying vllm/transformers_utils/tokenizer_group/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group
      copying vllm/transformers_utils/configs/jais.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/chatglm.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/falcon.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/mpt.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/tokenizers/baichuan.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizers
      copying vllm/transformers_utils/tokenizers/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizers
      copying vllm/entrypoints/openai/serving_engine.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/serving_chat.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/api_server.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/serving_completion.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/protocol.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/cli_args.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/py.typed -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      running build_ext
      CMake Error at CMakeLists.txt:3 (project):
        Running

         '/tmp/pip-build-env-nbudjmwn/overlay/bin/ninja' '--version'

        failed with:

         no such file or directory

      -- Configuring incomplete, errors occurred!
      Traceback (most recent call last):
        File "/home/hanc/miniconda3/envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/hanc/miniconda3/envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/hanc/miniconda3/envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
          return _build_backend().build_wheel(wheel_directory, config_settings,
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 410, in build_wheel
          return self._build_with_temp_dir(
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 395, in _build_with_temp_dir
          self.run_setup()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 338, in <module>
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/__init__.py", line 104, in setup
          return distutils.core.setup(**attrs)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 184, in setup
          return run_commands(dist)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
          dist.run_commands()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 368, in run
          self.run_command("build")
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
          self.distribution.run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 132, in run
          self.run_command(cmd_name)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
          self.distribution.run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 91, in run
          _build_ext.run(self)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
          self.build_extensions()
        File "<string>", line 157, in build_extensions
        File "<string>", line 140, in configure
        File "/home/hanc/miniconda3/envs/mario/lib/python3.10/subprocess.py", line 369, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['cmake', '/home/hanc/workspace/Super_MARIO/vllm', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/home/hanc/workspace/Super_MARIO/vllm/build/lib.linux-x86_64-cpython-310/vllm', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=build/temp.linux-x86_64-cpython-310', '-DVLLM_PYTHON_EXECUTABLE=/home/hanc/miniconda3/envs/mario/bin/python', '-DNVCC_THREADS=8', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=24']' returned non-zero exit status 1.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for vllm
Failed to build vllm
ERROR: Could not build wheels for vllm, which is required to install pyproject.toml-based projects

In addition, when I tried to apply the modifications manually by following the instructions in readme.md, I found that the current upstream vllm differs from the version in this repository. Which release of vllm should I base the modifications on?
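
One detail in the log above may be worth checking: CMake tries to run ninja from `/tmp/pip-build-env-nbudjmwn`, while the rest of the traceback runs inside `/tmp/pip-build-env-wcxbdxnq`. This could point to a CMake cache left over from an earlier build attempt, though that is an assumption and is not confirmed anywhere in this thread. A minimal Python sketch for spotting the mismatch follows; `build_env_ids` is a hypothetical helper written for illustration and is not part of pip or vllm.

```python
import re

# Matches the temporary build-environment directories that pip creates,
# e.g. /tmp/pip-build-env-wcxbdxnq
BUILD_ENV_RE = re.compile(r"pip-build-env-[A-Za-z0-9_]+")


def build_env_ids(log_text: str) -> list[str]:
    """Return the distinct pip build-env directory names, in order of first appearance."""
    seen: list[str] = []
    for name in BUILD_ENV_RE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# Two lines excerpted from the log above.
excerpt = """
'/tmp/pip-build-env-nbudjmwn/overlay/bin/ninja' '--version'
File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 410, in build_wheel
"""

ids = build_env_ids(excerpt)
if len(ids) > 1:
    # More than one name means part of the build refers to a different,
    # possibly already deleted, temporary environment.
    print("Log references more than one pip build environment:", ids)
```

If more than one name is reported, a stale `build/` directory in the source tree is one possible cause, so clearing it before retrying might be worth a try.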

lovecambi commented 1 month ago

As noted in the readme, the version of vllm we modified is vllm@f721096.
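
For readers who want to start from that base, here is a minimal Python sketch that builds the git commands for checking it out. It assumes `f721096` is a commit in the upstream vllm repository at `https://github.com/vllm-project/vllm`, which this thread does not confirm. The helper names and the `vllm-base` destination directory are hypothetical.

```python
import subprocess

# Assumption: "vllm@f721096" refers to a commit in the upstream vllm repository.
UPSTREAM_URL = "https://github.com/vllm-project/vllm.git"
BASE_COMMIT = "f721096"


def checkout_commands(dest: str = "vllm-base") -> list[list[str]]:
    """Build the git commands that clone upstream and check out the base commit."""
    return [
        ["git", "clone", UPSTREAM_URL, dest],
        ["git", "-C", dest, "checkout", BASE_COMMIT],
    ]


def run(commands: list[list[str]]) -> None:
    """Execute each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: print the commands; call run(checkout_commands()) to execute them.
    for cmd in checkout_commands():
        print(" ".join(cmd))
```

Printing the commands first makes it easy to confirm the commit hash and destination before anything is cloned.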

lovecambi commented 1 month ago

Thanks for publishing this customized version of vllm. According to the readme.md, I tried to install it and found some problems. The error message is as follows:

Building wheels for collected packages: vllm
  Building wheel for vllm (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for vllm (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [242 lines of output]
      /tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
        device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
      No CUDA runtime is found, using CUDA_HOME='/home/hanc/miniconda3/envs/mario'
      running bdist_wheel
      running build
      running build_py
      copying vllm/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/block.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/utils.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/test_utils.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/outputs.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/sequence.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/sampling_params.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/config.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/logger.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying tests/core/test_block_manager.py -> build/lib.linux-x86_64-cpython-310/tests/core
      copying tests/core/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/core
      copying tests/core/utils.py -> build/lib.linux-x86_64-cpython-310/tests/core
      copying tests/core/test_scheduler.py -> build/lib.linux-x86_64-cpython-310/tests/core
      copying tests/lora/test_layers.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_llama.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_punica.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/conftest.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/utils.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_layer_variation.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_utils.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_gemma.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_lora.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_lora_manager.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_worker.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_mixtral.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/worker/test_model_runner.py -> build/lib.linux-x86_64-cpython-310/tests/worker
      copying tests/worker/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/worker
      copying tests/worker/test_swap.py -> build/lib.linux-x86_64-cpython-310/tests/worker
      copying tests/tokenization/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/tokenization
      copying tests/tokenization/test_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/tests/tokenization
      copying tests/tokenization/test_detokenize.py -> build/lib.linux-x86_64-cpython-310/tests/tokenization
      copying tests/tokenization/test_cached_tokenizer.py -> build/lib.linux-x86_64-cpython-310/tests/tokenization
      copying tests/spec_decode/test_metrics.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/test_spec_decode_worker.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/test_multi_step_worker.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/utils.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/test_utils.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/test_batch_expansion.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying vllm/core/evictor.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/core/block_manager.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/core/scheduler.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/core/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/core/policy.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/lora/request.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/lora.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/models.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/punica.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/layers.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/worker_manager.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/engine/llm_engine.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/ray_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/async_llm_engine.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/metrics.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/arg_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/worker/cache_engine.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/worker.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/neuron_worker.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/neuron_model_runner.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/model_runner.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/model_executor/neuron_model_loader.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/model_loader.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/guided_decoding.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/guided_logits_processors.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/input_metadata.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/weight_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/sampling_metadata.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/executor/gpu_executor.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/neuron_executor.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/executor_base.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/ray_gpu_executor.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/transformers_utils/tokenizer.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils
      copying vllm/transformers_utils/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils
      copying vllm/transformers_utils/config.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils
      copying vllm/entrypoints/api_server.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints
      copying vllm/entrypoints/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints
      copying vllm/entrypoints/llm.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints
      copying vllm/model_executor/parallel_utils/communication_op.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/parallel_state.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/custom_all_reduce.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/cupy_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/models/gpt_neox.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/decilm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/jais.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/qwen.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/stablelm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/llama.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/qwen2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/baichuan.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/deepseek.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/mixtral_quant.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/starcoder2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/phi.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/gemma.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/orion.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/chatglm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/gpt_j.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/internlm2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/gpt_bigcode.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/modeling_value_head.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/olmo.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/gpt2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/falcon.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/mpt.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/opt.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/bloom.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/mixtral.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/layers/activation.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/logits_processor.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/vocab_parallel_embedding.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/linear.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/rotary_embedding.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/layernorm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/sampler.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/rejection_sampler.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/fused_moe/fused_moe.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe
      copying vllm/model_executor/layers/fused_moe/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe
      copying vllm/model_executor/layers/ops/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/ops
      copying vllm/model_executor/layers/ops/rand.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/ops
      copying vllm/model_executor/layers/ops/sample.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/ops
      copying vllm/model_executor/layers/quantization/gptq.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/awq.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/marlin.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/squeezellm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/base_config.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/attention/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention
      copying vllm/model_executor/layers/attention/attention.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention
      copying vllm/model_executor/layers/attention/ops/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/ops
      copying vllm/model_executor/layers/attention/ops/paged_attn.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/ops
      copying vllm/model_executor/layers/attention/ops/prefix_prefill.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/ops
      copying vllm/model_executor/layers/attention/backends/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/backends
      copying vllm/model_executor/layers/attention/backends/xformers.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/backends
      copying vllm/model_executor/layers/attention/backends/flash_attn.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/backends
      copying vllm/transformers_utils/tokenizer_group/base_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group
      copying vllm/transformers_utils/tokenizer_group/ray_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group
      copying vllm/transformers_utils/tokenizer_group/tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group
      copying vllm/transformers_utils/tokenizer_group/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group
      copying vllm/transformers_utils/configs/jais.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/chatglm.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/falcon.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/mpt.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/tokenizers/baichuan.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizers
      copying vllm/transformers_utils/tokenizers/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizers
      copying vllm/entrypoints/openai/serving_engine.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/serving_chat.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/api_server.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/serving_completion.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/protocol.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/cli_args.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/py.typed -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      running build_ext
      CMake Error at CMakeLists.txt:3 (project):
        Running

         '/tmp/pip-build-env-nbudjmwn/overlay/bin/ninja' '--version'

        failed with:

         no such file or directory

      -- Configuring incomplete, errors occurred!
      Traceback (most recent call last):
        File "/home/hanc/miniconda3/envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/hanc/miniconda3/envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/hanc/miniconda3/envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
          return _build_backend().build_wheel(wheel_directory, config_settings,
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 410, in build_wheel
          return self._build_with_temp_dir(
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 395, in _build_with_temp_dir
          self.run_setup()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 338, in <module>
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/__init__.py", line 104, in setup
          return distutils.core.setup(**attrs)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 184, in setup
          return run_commands(dist)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
          dist.run_commands()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 368, in run
          self.run_command("build")
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
          self.distribution.run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 132, in run
          self.run_command(cmd_name)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
          self.distribution.run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 91, in run
          _build_ext.run(self)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
          self.build_extensions()
        File "<string>", line 157, in build_extensions
        File "<string>", line 140, in configure
        File "/home/hanc/miniconda3/envs/mario/lib/python3.10/subprocess.py", line 369, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['cmake', '/home/hanc/workspace/Super_MARIO/vllm', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/home/hanc/workspace/Super_MARIO/vllm/build/lib.linux-x86_64-cpython-310/vllm', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=build/temp.linux-x86_64-cpython-310', '-DVLLM_PYTHON_EXECUTABLE=/home/hanc/miniconda3/envs/mario/bin/python', '-DNVCC_THREADS=8', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=24']' returned non-zero exit status 1.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for vllm
Failed to build vllm
ERROR: Could not build wheels for vllm, which is required to install pyproject.toml-based projects

In addition, when I tried to apply the modifications manually by following the instructions in readme.md, I found that the current vllm version differs from the one in this library. Which vllm release should I base the modifications on?

Did you pin the version of setuptools? setuptools was recently updated to 70.0.0, which is not compatible with vllm.

https://github.com/MARIO-Math-Reasoning/vllm/blob/main/pyproject.toml#L7
https://github.com/MARIO-Math-Reasoning/vllm/blob/main/requirements-build.txt#L5

hanselcui commented 1 month ago

I think I have the right version of setuptools (69.5.1), but the library still did not install properly.

Besides, I cloned the vllm repo at f721096 and ran pip install . in it. It reported the same traceback, so I think the issue might come from the original vllm rather than from your repo?

lovecambi commented 1 month ago

Even if you clone the original vllm, you still need to pin the setuptools version in pyproject.toml and requirements-build.txt.

You can also take a look at https://github.com/vllm-project/vllm/issues/4913 .
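
Note that pip builds the wheel in an isolated environment (the /tmp/pip-build-env-... paths in the log), so the setuptools version installed in the conda environment is not necessarily the one used for the build. A minimal sketch for checking what the two build files actually request is shown below; the helper name `setuptools_specs` and the regular expression are illustrative assumptions, not part of vllm or pip.

```python
import re

# Match a bare "setuptools" requirement (not "setuptools.build_meta" or
# "setuptools-scm") and capture an optional version specifier after it.
_SETUPTOOLS = re.compile(
    r"(?<![\w.-])setuptools(?![\w.-])[ \t]*([=<>!~]=?[ \t]*[^\s\"',\]]*)?",
    re.IGNORECASE,
)


def setuptools_specs(text: str) -> list:
    """Return the specifier of each setuptools requirement ('' means unpinned)."""
    return [m.group(1) or "" for m in _SETUPTOOLS.finditer(text)]


def report(paths=("pyproject.toml", "requirements-build.txt")) -> None:
    """Print the setuptools specifiers found in each build file."""
    for path in paths:
        try:
            with open(path, encoding="utf-8") as fh:
                specs = setuptools_specs(fh.read())
        except FileNotFoundError:
            print(f"{path}: not found")
            continue
        print(f"{path}: {specs if specs else 'no setuptools entry'}")
```

Calling `report()` from the root of the vllm checkout lists the specifiers; an empty string, or a range that admits 70.0.0, would mean the isolated build can still pull in the incompatible release.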

lrq111 commented 1 month ago

I am also running into this problem. Have you solved it?

hanselcui commented 1 month ago

I switched to another machine and installed vllm successfully. What we can be sure of now is that the problem lies in the local environment, or in how that environment interacts with that version of vllm; it should not be a problem with this library.

lrq111 commented 1 month ago

OK, thanks for your reply. I will try it.

eurekayuan commented 4 weeks ago

I also encountered the same problem.

eurekayuan commented 4 weeks ago

I solved the issue by cleaning up the old builds: rm -rf build. Reference: https://github.com/vllm-project/vllm/issues/4913
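
One possible reason this helps: in the first log, CMake looks for ninja under /tmp/pip-build-env-nbudjmwn, while the rest of the traceback runs from /tmp/pip-build-env-wcxbdxnq. That is consistent with a cached CMake configuration in build/ left over from an earlier attempt and still pointing at a temporary environment that no longer exists. A minimal Python equivalent of rm -rf build is sketched below, for cases where the cleanup should be scripted; the function name `clean_stale_build` is hypothetical.

```python
import shutil
from pathlib import Path


def clean_stale_build(repo_root: str = ".") -> list:
    """Delete the top-level build/ directory and return the paths removed."""
    removed = []
    build_dir = Path(repo_root) / "build"
    if build_dir.is_dir():
        # Removes cached CMake state along with the copied Python sources.
        shutil.rmtree(build_dir)
        removed.append(str(build_dir))
    return removed
```

Run it from the root of the vllm checkout (or pass the checkout path) and then retry pip install .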

BoyuanJackChen commented 4 weeks ago

Still having this problem. I double-checked that the setuptools version was correct (69.5.1). For some reason it can't find numpy, but I actually have it in pip. Not sure what to do about it. Any ideas?

Processing /scratch/bc/Super_MARIO/vllm
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [19 lines of output]
      /tmpdata/pip-build-env-ml44wjni/overlay/lib/python3.10/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
        device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
      Traceback (most recent call last):
        File "/scratch/bc3194/conda-envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/scratch/bc3194/conda-envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/scratch/bc3194/conda-envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/tmpdata/pip-build-env-ml44wjni/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
        File "/tmpdata/pip-build-env-ml44wjni/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
          self.run_setup()
        File "/tmpdata/pip-build-env-ml44wjni/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 340, in <module>
        File "<string>", line 266, in get_vllm_version
        File "<string>", line 237, in get_nvcc_cuda_version
      TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

hanselcui commented 2 weeks ago

I've run into this issue before. I fixed it by installing on a machine with a GPU and CUDA. (But some other issues came up afterwards.)
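
The TypeError raised in get_nvcc_cuda_version ('NoneType' and 'str') is consistent with the setup script building an nvcc path from a CUDA home directory that was never found, although that is an inference from the traceback and not something confirmed here. A minimal sketch for checking whether nvcc is discoverable before starting the build follows; `locate_nvcc` is a hypothetical helper, and checking CUDA_HOME and CUDA_PATH before PATH is an assumption about where a toolkit is usually advertised.

```python
import os
import shutil
from typing import Mapping, Optional


def locate_nvcc(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return a path to nvcc from CUDA_HOME/CUDA_PATH or PATH, or None if absent."""
    cuda_home = env.get("CUDA_HOME") or env.get("CUDA_PATH")
    if cuda_home:
        candidate = os.path.join(cuda_home, "bin", "nvcc")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Fall back to searching the PATH given in env.
    return shutil.which("nvcc", path=env.get("PATH"))
```

If `locate_nvcc()` returns None, the machine most likely has no CUDA toolkit visible to the build, which would match the "No CUDA runtime is found" line in the first log.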

MeishanStupidBoy commented 2 weeks ago

Thanks for publishing this customized version of vllm. According to the readme.md, I tried to install it and found some problems. The error message is as follows:

Building wheels for collected packages: vllm
  Building wheel for vllm (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for vllm (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [242 lines of output]
      /tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
        device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
      No CUDA runtime is found, using CUDA_HOME='/home/hanc/miniconda3/envs/mario'
      running bdist_wheel
      running build
      running build_py
      copying vllm/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/block.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/utils.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/test_utils.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/outputs.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/sequence.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/sampling_params.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/config.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/logger.py -> build/lib.linux-x86_64-cpython-310/vllm
      copying tests/core/test_block_manager.py -> build/lib.linux-x86_64-cpython-310/tests/core
      copying tests/core/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/core
      copying tests/core/utils.py -> build/lib.linux-x86_64-cpython-310/tests/core
      copying tests/core/test_scheduler.py -> build/lib.linux-x86_64-cpython-310/tests/core
      copying tests/lora/test_layers.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_llama.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_punica.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/conftest.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/utils.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_layer_variation.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_utils.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_gemma.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_lora.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_lora_manager.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_worker.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/lora/test_mixtral.py -> build/lib.linux-x86_64-cpython-310/tests/lora
      copying tests/worker/test_model_runner.py -> build/lib.linux-x86_64-cpython-310/tests/worker
      copying tests/worker/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/worker
      copying tests/worker/test_swap.py -> build/lib.linux-x86_64-cpython-310/tests/worker
      copying tests/tokenization/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/tokenization
      copying tests/tokenization/test_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/tests/tokenization
      copying tests/tokenization/test_detokenize.py -> build/lib.linux-x86_64-cpython-310/tests/tokenization
      copying tests/tokenization/test_cached_tokenizer.py -> build/lib.linux-x86_64-cpython-310/tests/tokenization
      copying tests/spec_decode/test_metrics.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/test_spec_decode_worker.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/test_multi_step_worker.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/__init__.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/utils.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/test_utils.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying tests/spec_decode/test_batch_expansion.py -> build/lib.linux-x86_64-cpython-310/tests/spec_decode
      copying vllm/core/evictor.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/core/block_manager.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/core/scheduler.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/core/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/core/policy.py -> build/lib.linux-x86_64-cpython-310/vllm/core
      copying vllm/lora/request.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/lora.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/models.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/punica.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/layers.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/lora/worker_manager.py -> build/lib.linux-x86_64-cpython-310/vllm/lora
      copying vllm/engine/llm_engine.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/ray_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/async_llm_engine.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/metrics.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/engine/arg_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/engine
      copying vllm/worker/cache_engine.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/worker.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/neuron_worker.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/neuron_model_runner.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/worker/model_runner.py -> build/lib.linux-x86_64-cpython-310/vllm/worker
      copying vllm/model_executor/neuron_model_loader.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/model_loader.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/guided_decoding.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/guided_logits_processors.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/input_metadata.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/weight_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/model_executor/sampling_metadata.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor
      copying vllm/executor/gpu_executor.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/neuron_executor.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/executor_base.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/executor/ray_gpu_executor.py -> build/lib.linux-x86_64-cpython-310/vllm/executor
      copying vllm/transformers_utils/tokenizer.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils
      copying vllm/transformers_utils/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils
      copying vllm/transformers_utils/config.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils
      copying vllm/entrypoints/api_server.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints
      copying vllm/entrypoints/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints
      copying vllm/entrypoints/llm.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints
      copying vllm/model_executor/parallel_utils/communication_op.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/parallel_state.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/custom_all_reduce.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/cupy_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils
      copying vllm/model_executor/models/gpt_neox.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/decilm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/jais.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/qwen.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/stablelm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/llama.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/qwen2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/baichuan.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/deepseek.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/mixtral_quant.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/starcoder2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/phi.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/gemma.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/orion.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/chatglm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/gpt_j.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/internlm2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/gpt_bigcode.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/modeling_value_head.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/olmo.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/gpt2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/falcon.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/mpt.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/opt.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/bloom.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/models/mixtral.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models
      copying vllm/model_executor/layers/activation.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/logits_processor.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/vocab_parallel_embedding.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/linear.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/rotary_embedding.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/layernorm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/sampler.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/rejection_sampler.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers
      copying vllm/model_executor/layers/fused_moe/fused_moe.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe
      copying vllm/model_executor/layers/fused_moe/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe
      copying vllm/model_executor/layers/ops/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/ops
      copying vllm/model_executor/layers/ops/rand.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/ops
      copying vllm/model_executor/layers/ops/sample.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/ops
      copying vllm/model_executor/layers/quantization/gptq.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/awq.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/marlin.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/squeezellm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/quantization/base_config.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization
      copying vllm/model_executor/layers/attention/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention
      copying vllm/model_executor/layers/attention/attention.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention
      copying vllm/model_executor/layers/attention/ops/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/ops
      copying vllm/model_executor/layers/attention/ops/paged_attn.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/ops
      copying vllm/model_executor/layers/attention/ops/prefix_prefill.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/ops
      copying vllm/model_executor/layers/attention/backends/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/backends
      copying vllm/model_executor/layers/attention/backends/xformers.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/backends
      copying vllm/model_executor/layers/attention/backends/flash_attn.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/backends
      copying vllm/transformers_utils/tokenizer_group/base_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group
      copying vllm/transformers_utils/tokenizer_group/ray_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group
      copying vllm/transformers_utils/tokenizer_group/tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group
      copying vllm/transformers_utils/tokenizer_group/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group
      copying vllm/transformers_utils/configs/jais.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/chatglm.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/falcon.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/mpt.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs
      copying vllm/transformers_utils/tokenizers/baichuan.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizers
      copying vllm/transformers_utils/tokenizers/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizers
      copying vllm/entrypoints/openai/serving_engine.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/serving_chat.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/api_server.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/serving_completion.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/protocol.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/cli_args.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai
      copying vllm/py.typed -> build/lib.linux-x86_64-cpython-310/vllm
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs
      running build_ext
      CMake Error at CMakeLists.txt:3 (project):
        Running

         '/tmp/pip-build-env-nbudjmwn/overlay/bin/ninja' '--version'

        failed with:

         no such file or directory

      -- Configuring incomplete, errors occurred!
      Traceback (most recent call last):
        File "/home/hanc/miniconda3/envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/hanc/miniconda3/envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/hanc/miniconda3/envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
          return _build_backend().build_wheel(wheel_directory, config_settings,
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 410, in build_wheel
          return self._build_with_temp_dir(
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 395, in _build_with_temp_dir
          self.run_setup()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 338, in <module>
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/__init__.py", line 104, in setup
          return distutils.core.setup(**attrs)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 184, in setup
          return run_commands(dist)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
          dist.run_commands()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 368, in run
          self.run_command("build")
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
          self.distribution.run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 132, in run
          self.run_command(cmd_name)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
          self.distribution.run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 91, in run
          _build_ext.run(self)
        File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
          self.build_extensions()
        File "<string>", line 157, in build_extensions
        File "<string>", line 140, in configure
        File "/home/hanc/miniconda3/envs/mario/lib/python3.10/subprocess.py", line 369, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['cmake', '/home/hanc/workspace/Super_MARIO/vllm', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/home/hanc/workspace/Super_MARIO/vllm/build/lib.linux-x86_64-cpython-310/vllm', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=build/temp.linux-x86_64-cpython-310', '-DVLLM_PYTHON_EXECUTABLE=/home/hanc/miniconda3/envs/mario/bin/python', '-DNVCC_THREADS=8', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=24']' returned non-zero exit status 1.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for vllm
Failed to build vllm
ERROR: Could not build wheels for vllm, which is required to install pyproject.toml-based projects

In addition, when I tried to apply the changes manually following the instructions in readme.md, I found that the current vllm version differs from the one in this repository. Which vllm release should I base the modifications on?

The key part of the error is:

'/tmp/pip-build-env-nbudjmwn/overlay/bin/ninja' '--version'

   failed with:

   no such file or directory

This happens because the 'ninja' version is not right. Before installing vllm ('pip install .'), change the 'ninja' version in requirements-build.txt and then run 'pip install -r requirements-build.txt'.

(mr11) zzh@E5:~/code/vllm$ cat requirements-build.txt
# Should be mirrored in pyproject.toml
cmake>=3.21
ninja==1.10.2.4
packaging
setuptools==69.5.1
torch==2.1.2
wheel
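
After installing the build requirements, it may help to confirm that 'cmake' and 'ninja' can actually be run from the active environment before retrying 'pip install .'. The following is a minimal sketch, not part of vllm: the helper name `tool_version` is hypothetical, and it only checks the tools found on PATH, which may differ from the copies pip places in its temporary build environment.

```python
import shutil
import subprocess
from typing import Optional


def tool_version(name: str) -> Optional[str]:
    """Return the output of `<name> --version`, or None if the tool cannot be run."""
    # Look the executable up on PATH, the same way a shell would.
    path = shutil.which(name)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a missing or broken binary, a non-zero exit code, and a timeout.
        return None
    # Most tools print the version to stdout; fall back to stderr if it is empty.
    return result.stdout.strip() or result.stderr.strip()


if __name__ == "__main__":
    for tool in ("cmake", "ninja"):
        version = tool_version(tool)
        print(f"{tool}: {version if version is not None else 'not found on PATH'}")
```

If 'ninja' is reported as not found, the CMake step with '-G Ninja' shown in the log above would be expected to fail in the same way.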