hanselcui opened this issue 6 months ago
In the README, I indicated that the version of vllm we modified is vllm@f721096.
Thanks for publishing this customized version of vllm. Following the readme.md, I tried to install it and ran into some problems. The error message is as follows:
Building wheels for collected packages: vllm
  Building wheel for vllm (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for vllm (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [242 lines of output]
      /tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
        device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
      No CUDA runtime is found, using CUDA_HOME='/home/hanc/miniconda3/envs/mario'
      running bdist_wheel
      running build
      running build_py
      copying vllm/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm
      [... ~200 similar "copying ..." lines omitted ...]
      running build_ext
      CMake Error at CMakeLists.txt:3 (project):
        Running '/tmp/pip-build-env-nbudjmwn/overlay/bin/ninja' '--version' failed with: no such file or directory
      -- Configuring incomplete, errors occurred!
      Traceback (most recent call last):
        [... pip/setuptools internal frames omitted ...]
        File "<string>", line 157, in build_extensions
        File "<string>", line 140, in configure
        File "/home/hanc/miniconda3/envs/mario/lib/python3.10/subprocess.py", line 369, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['cmake', '/home/hanc/workspace/Super_MARIO/vllm', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/home/hanc/workspace/Super_MARIO/vllm/build/lib.linux-x86_64-cpython-310/vllm', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=build/temp.linux-x86_64-cpython-310', '-DVLLM_PYTHON_EXECUTABLE=/home/hanc/miniconda3/envs/mario/bin/python', '-DNVCC_THREADS=8', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=24']' returned non-zero exit status 1.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for vllm
Failed to build vllm
ERROR: Could not build wheels for vllm, which is required to install pyproject.toml-based projects
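The decisive line in the log above is the CMake error: the build backend could not find a ninja binary. A quick pre-flight check before retrying pip install . might look like this (a hypothetical helper, not part of vLLM):

```python
import shutil

# The CMake error above ("Running ... ninja '--version' failed with: no such
# file or directory") means no usable ninja binary was visible to the build.
# List which of the native build tools required by vLLM's CMake build are
# missing from PATH; any missing one can typically be installed with
# `pip install cmake ninja`.
def missing_build_tools():
    return [tool for tool in ("cmake", "ninja") if shutil.which(tool) is None]

if __name__ == "__main__":
    missing = missing_build_tools()
    if missing:
        print("missing build tools:", ", ".join(missing))
    else:
        print("cmake and ninja found")
```

Note that pip builds in an isolated temporary environment, so tools installed only in the project environment may still be invisible during the build.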
In addition, when I tried to modify vllm manually according to the instructions in readme.md, I found that the current vllm version differs from the one in this repository. Which release of vllm should I base my modifications on?
Did you pin the version of setuptools? setuptools was recently updated to 70.0.0, which is not compatible with vllm.
https://github.com/MARIO-Math-Reasoning/vllm/blob/main/pyproject.toml#L7 https://github.com/MARIO-Math-Reasoning/vllm/blob/main/requirements-build.txt#L5
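For reference, the pin might look roughly like this (a sketch; the exact bound and the repo's other build requirements may differ from what is shown):

```toml
# pyproject.toml (sketch) -- cap setuptools below the incompatible 70.0.0
[build-system]
requires = ["setuptools < 70.0.0", "wheel"]
build-backend = "setuptools.build_meta"
```

with a matching setuptools<70.0.0 line in requirements-build.txt.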
I think I have the right version of setuptools (69.5.1), but I still couldn't install this library properly.
Besides, I cloned the repo at vllm@f721096 and ran pip install .
in it. It reported the same traceback, so I think it might be an issue with the original vllm rather than your repo?
Even if you cloned the original vllm, you still need to pin the version of setuptools in pyproject.toml and requirements-build.txt.
You can also take a look at https://github.com/vllm-project/vllm/issues/4913 .
I also encountered this problem. Have you solved it?
I switched to another machine and installed vllm successfully. What we can be sure of now is that the problem lies in the local environment, or in how that environment matches this version of vllm; it should not be a problem with this library.
OK, thanks for your reply. I will try that.
I also encountered the same problem.
I solved the issue by cleaning up the old builds: rm -rf build. Reference: https://github.com/vllm-project/vllm/issues/4913
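Concretely, the cleanup might look like this (a sketch; run from the repository root):

```shell
# Remove artifacts left by a previous failed compile; a stale CMake cache in
# build/ can keep pointing at a temporary pip build environment that no
# longer exists.
rm -rf build dist *.egg-info
```

After this, rerun pip install . so CMake reconfigures from scratch.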
Still having this problem. I double-checked that the setuptools version was correct (69.5.1). For some reason it can't find numpy, even though I do have it installed via pip. Not sure what to do about it. Any ideas?
Processing /scratch/bc/Super_MARIO/vllm
Installing build dependencies ... done
Getting requirements to build wheel ... error
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [19 lines of output]
/tmpdata/pip-build-env-ml44wjni/overlay/lib/python3.10/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
Traceback (most recent call last):
File "/scratch/bc3194/conda-envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/scratch/bc3194/conda-envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/scratch/bc3194/conda-envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
return hook(config_settings)
File "/tmpdata/pip-build-env-ml44wjni/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
return self._get_build_requires(config_settings, requirements=['wheel'])
File "/tmpdata/pip-build-env-ml44wjni/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
self.run_setup()
File "/tmpdata/pip-build-env-ml44wjni/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 311, in run_setup
exec(code, locals())
File "<string>", line 340, in <module>
File "<string>", line 266, in get_vllm_version
File "<string>", line 237, in get_nvcc_cuda_version
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
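Two things are going on in this log. The NumPy warning is a red herring: pip builds in an isolated environment, so the numpy installed in your own environment is not visible during the build. The fatal part is the TypeError at the bottom: get_nvcc_cuda_version found no nvcc, so a None got concatenated with a string. A minimal reproduction of that failure mode (a sketch; vLLM's actual setup.py differs in detail):

```python
# vLLM's setup.py derives the nvcc path from the detected CUDA_HOME. On a
# machine with no CUDA toolkit, CUDA_HOME comes back as None, and the string
# concatenation raises exactly the TypeError seen in the log above.
def nvcc_bin(cuda_home):
    # hypothetical stand-in for the path construction inside get_nvcc_cuda_version
    return cuda_home + "/bin/nvcc"

try:
    nvcc_bin(None)  # no CUDA toolkit detected
except TypeError as exc:
    print(exc)  # unsupported operand type(s) for +: 'NoneType' and 'str'
```

So the error usually means the build machine has no CUDA toolkit visible, not a missing Python package.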
I've run into this issue before. I fixed it by installing on a machine with a GPU and CUDA. (But some other issues came up afterwards.)
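That matches the TypeError above: the build needs the CUDA toolkit (nvcc), not just a GPU driver. A quick local diagnosis might look like this (a hypothetical helper, not part of vLLM):

```python
import os
import shutil

# Report whether a CUDA toolkit usable for compiling vLLM's extensions is
# visible on this machine. A GPU driver alone is not enough; the source build
# needs nvcc from the toolkit.
def cuda_toolkit_status():
    nvcc = shutil.which("nvcc")
    cuda_home = os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH")
    if nvcc:
        return f"nvcc found at {nvcc}"
    if cuda_home:
        return f"CUDA_HOME={cuda_home} is set, but nvcc is not on PATH"
    return "no CUDA toolkit detected; building vLLM from source will fail"

print(cuda_toolkit_status())
```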
build/lib.linux-x86_64-cpython-310/vllm/worker copying vllm/worker/neuron_model_runner.py -> build/lib.linux-x86_64-cpython-310/vllm/worker copying vllm/worker/model_runner.py -> build/lib.linux-x86_64-cpython-310/vllm/worker copying vllm/model_executor/neuron_model_loader.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor copying vllm/model_executor/model_loader.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor copying vllm/model_executor/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor copying vllm/model_executor/guided_decoding.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor copying vllm/model_executor/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor copying vllm/model_executor/guided_logits_processors.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor copying vllm/model_executor/input_metadata.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor copying vllm/model_executor/weight_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor copying vllm/model_executor/sampling_metadata.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor copying vllm/executor/gpu_executor.py -> build/lib.linux-x86_64-cpython-310/vllm/executor copying vllm/executor/neuron_executor.py -> build/lib.linux-x86_64-cpython-310/vllm/executor copying vllm/executor/executor_base.py -> build/lib.linux-x86_64-cpython-310/vllm/executor copying vllm/executor/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/executor copying vllm/executor/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/executor copying vllm/executor/ray_gpu_executor.py -> build/lib.linux-x86_64-cpython-310/vllm/executor copying vllm/transformers_utils/tokenizer.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils copying vllm/transformers_utils/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils copying vllm/transformers_utils/config.py -> 
build/lib.linux-x86_64-cpython-310/vllm/transformers_utils copying vllm/entrypoints/api_server.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints copying vllm/entrypoints/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints copying vllm/entrypoints/llm.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints copying vllm/model_executor/parallel_utils/communication_op.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils copying vllm/model_executor/parallel_utils/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils copying vllm/model_executor/parallel_utils/parallel_state.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils copying vllm/model_executor/parallel_utils/utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils copying vllm/model_executor/parallel_utils/custom_all_reduce.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils copying vllm/model_executor/parallel_utils/cupy_utils.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/parallel_utils copying vllm/model_executor/models/gpt_neox.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/decilm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/jais.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/qwen.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/stablelm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/llama.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/qwen2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/baichuan.py -> 
build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/deepseek.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/mixtral_quant.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/starcoder2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/phi.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/gemma.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/orion.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/chatglm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/gpt_j.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/internlm2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/gpt_bigcode.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/modeling_value_head.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/olmo.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/gpt2.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/falcon.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/mpt.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/opt.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/bloom.py -> 
build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/models/mixtral.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/models copying vllm/model_executor/layers/activation.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers copying vllm/model_executor/layers/logits_processor.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers copying vllm/model_executor/layers/vocab_parallel_embedding.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers copying vllm/model_executor/layers/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers copying vllm/model_executor/layers/linear.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers copying vllm/model_executor/layers/rotary_embedding.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers copying vllm/model_executor/layers/layernorm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers copying vllm/model_executor/layers/sampler.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers copying vllm/model_executor/layers/rejection_sampler.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers copying vllm/model_executor/layers/fused_moe/fused_moe.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe copying vllm/model_executor/layers/fused_moe/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe copying vllm/model_executor/layers/ops/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/ops copying vllm/model_executor/layers/ops/rand.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/ops copying vllm/model_executor/layers/ops/sample.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/ops copying vllm/model_executor/layers/quantization/gptq.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization copying 
vllm/model_executor/layers/quantization/awq.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization copying vllm/model_executor/layers/quantization/marlin.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization copying vllm/model_executor/layers/quantization/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization copying vllm/model_executor/layers/quantization/squeezellm.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization copying vllm/model_executor/layers/quantization/base_config.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/quantization copying vllm/model_executor/layers/attention/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention copying vllm/model_executor/layers/attention/attention.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention copying vllm/model_executor/layers/attention/ops/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/ops copying vllm/model_executor/layers/attention/ops/paged_attn.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/ops copying vllm/model_executor/layers/attention/ops/prefix_prefill.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/ops copying vllm/model_executor/layers/attention/backends/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/backends copying vllm/model_executor/layers/attention/backends/xformers.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/backends copying vllm/model_executor/layers/attention/backends/flash_attn.py -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/attention/backends copying vllm/transformers_utils/tokenizer_group/base_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group copying 
vllm/transformers_utils/tokenizer_group/ray_tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group copying vllm/transformers_utils/tokenizer_group/tokenizer_group.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group copying vllm/transformers_utils/tokenizer_group/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizer_group copying vllm/transformers_utils/configs/jais.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs copying vllm/transformers_utils/configs/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs copying vllm/transformers_utils/configs/chatglm.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs copying vllm/transformers_utils/configs/falcon.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs copying vllm/transformers_utils/configs/mpt.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/configs copying vllm/transformers_utils/tokenizers/baichuan.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizers copying vllm/transformers_utils/tokenizers/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/transformers_utils/tokenizers copying vllm/entrypoints/openai/serving_engine.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai copying vllm/entrypoints/openai/serving_chat.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai copying vllm/entrypoints/openai/api_server.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai copying vllm/entrypoints/openai/serving_completion.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai copying vllm/entrypoints/openai/__init__.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai copying vllm/entrypoints/openai/protocol.py -> build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai copying vllm/entrypoints/openai/cli_args.py -> 
build/lib.linux-x86_64-cpython-310/vllm/entrypoints/openai copying vllm/py.typed -> build/lib.linux-x86_64-cpython-310/vllm copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-310/vllm/model_executor/layers/fused_moe/configs running build_ext CMake Error at CMakeLists.txt:3 (project): Running '/tmp/pip-build-env-nbudjmwn/overlay/bin/ninja' '--version' failed with: no such file or directory -- Configuring incomplete, errors occurred! 
Traceback (most recent call last): File "/home/hanc/miniconda3/envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module> main() File "/home/hanc/miniconda3/envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main json_out['return_val'] = hook(**hook_input['kwargs']) File "/home/hanc/miniconda3/envs/mario/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel return _build_backend().build_wheel(wheel_directory, config_settings, File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 410, in build_wheel return self._build_with_temp_dir( File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 395, in _build_with_temp_dir self.run_setup() File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 311, in run_setup exec(code, locals()) File "<string>", line 338, in <module> File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/__init__.py", line 104, in setup return distutils.core.setup(**attrs) File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 184, in setup return run_commands(dist) File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 200, in run_commands dist.run_commands() File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands self.run_command(cmd) File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command super().run_command(command) File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command cmd_obj.run() File 
"/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 368, in run self.run_command("build") File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command self.distribution.run_command(command) File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command super().run_command(command) File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command cmd_obj.run() File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 132, in run self.run_command(cmd_name) File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command self.distribution.run_command(command) File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command super().run_command(command) File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command cmd_obj.run() File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 91, in run _build_ext.run(self) File "/tmp/pip-build-env-wcxbdxnq/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run self.build_extensions() File "<string>", line 157, in build_extensions File "<string>", line 140, in configure File "/home/hanc/miniconda3/envs/mario/lib/python3.10/subprocess.py", line 369, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['cmake', '/home/hanc/workspace/Super_MARIO/vllm', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/home/hanc/workspace/Super_MARIO/vllm/build/lib.linux-x86_64-cpython-310/vllm', 
'-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=build/temp.linux-x86_64-cpython-310', '-DVLLM_PYTHON_EXECUTABLE=/home/hanc/miniconda3/envs/mario/bin/python', '-DNVCC_THREADS=8', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=24']' returned non-zero exit status 1. [end of output] note: This error originates from a subprocess, and is likely not a problem with pip. ERROR: Failed building wheel for vllm Failed to build vllm ERROR: Could not build wheels for vllm, which is required to install pyproject.toml-based projects
In addition, when I tried to apply the modifications manually following the instructions in readme.md, I found that the current vllm release differs from the one vendored in this repository. Which release of vllm should I base the modifications on?
The key point is as follows:

```
Running '/tmp/pip-build-env-nbudjmwn/overlay/bin/ninja' '--version'
failed with: no such file or directory
```
That is because the pinned version of `ninja` is wrong. Before installing vllm (`pip install .`), change the `ninja` version in `requirements-build.txt` and then run `pip install -r requirements-build.txt`:
```
(mr11) zzh@E5:~/code/vllm$ cat requirements-build.txt
# Should be mirrored in pyproject.toml
cmake>=3.21
ninja==1.10.2.4
packaging
setuptools==69.5.1
torch==2.1.2
wheel
```
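Given that file, a quick sanity check before `pip install .` is to confirm the exact pins are in place. A hedged sketch — `parse_pins` and the expected versions come from this thread, not from vllm itself:

```python
# Parse `pkg==version` pins out of a requirements file, ignoring comments
# and non-exact specifiers like `cmake>=3.21`.
EXPECTED = {"ninja": "1.10.2.4", "setuptools": "69.5.1", "torch": "2.1.2"}

def parse_pins(text: str) -> dict:
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins

# Contents of requirements-build.txt as shown above.
reqs = """\
cmake>=3.21
ninja==1.10.2.4
packaging
setuptools==69.5.1
torch==2.1.2
wheel
"""
print(parse_pins(reqs) == EXPECTED)  # -> True
```

In practice you would read the real file with `Path("requirements-build.txt").read_text()` instead of the inline string.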
I encountered the same issue on two different machines using the specified PyTorch 2.1.2, CUDA 12.1, and setuptools. I hope the project maintainers can clean up the code or provide clearer installation instructions.
The dependencies of this project are very fragile; I successfully installed it in a strict CUDA 12.1 environment.
Basically, installing the vllm in this repo is identical to installing the original vllm v0.3.3. As the readme mentions, the exact commit we used is vllm@f721096.
If you can install the original vllm v0.3.3, you should be able to install this repo, so I recommend the following alternative installation method:
```bash
git clone https://github.com/vllm-project/vllm.git
cd vllm
git checkout f721096
pip install -e .
```