vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Bug]: Intel GPU Test failing in CI #6591

Open tdoublep opened 4 months ago

tdoublep commented 4 months ago

Your current environment

The output of `python collect_env.py`

🐛 Describe the bug

All CI builds are failing because the Intel GPU Test is failing (see log below).

Can we fix or "soft fail" it?
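For reference, one way to read "soft fail" is: run the test, report a failure loudly, but do not let it fail the whole build. Buildkite has a step-level `soft_fail` attribute for this; the shell sketch below only illustrates the idea. `soft_run` is a hypothetical helper name, not something from the vLLM repository, and `false` stands in for `bash .buildkite/run-xpu-test.sh`.

```shell
#!/bin/sh
# soft_run (hypothetical helper): run a command, report a non-zero exit
# status on stderr, and always return 0 so the caller keeps going.
soft_run() {
    status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "soft fail: '$*' exited with status $status" >&2
    fi
    return 0
}

# Stand-in for the real test script; `false` always exits non-zero.
soft_run false
echo "build continues after the soft-failed step"
```

The trade-off is that a soft-failed step no longer blocks merges, so a real regression on that platform can go unnoticed until someone reads the log.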

bash .buildkite/run-xpu-test.sh
+ docker build -t xpu-test -f Dockerfile.xpu .
[+] Building 2.5s (6/11)  docker:default
=> [internal] load build definition from Dockerfile.xpu  0.0s
=> => transferring dockerfile: 1.26kB  0.0s
=> [internal] load metadata for docker.io/intel/oneapi-basekit:2024.1.0-devel-ubuntu20.04  0.3s
=> [internal] load .dockerignore  0.0s
=> => transferring context: 50B  0.0s
=> CACHED [1/7] FROM docker.io/intel/oneapi-basekit:2024.1.0-devel-ubuntu20.04@sha256:6adb5e03caac52ed86bc58163647d4a02e8c9220764ea5f0555aa72f63d86d13  0.0s
=> [internal] load build context  0.1s
=> => transferring context: 959.30kB  0.1s
=> ERROR [2/7] RUN wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | tee /usr/share/keyrings/intel  2.2s
------
> [2/7] RUN wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | tee /usr/share/keyrings/intel-oneapi-archive-keyring.gpg > /dev/null &&     echo "deb [signed-by=/usr/share/keyrings/intel-oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main " | tee /etc/apt/sources.list.d/oneAPI.list &&     chmod 644 /usr/share/keyrings/intel-oneapi-archive-keyring.gpg &&     rm /etc/apt/sources.list.d/intel-graphics.list &&     wget -O- https://repositories.intel.com/graphics/intel-graphics.key | gpg --dearmor | tee /usr/share/keyrings/intel-graphics.gpg > /dev/null &&     echo "deb [arch=amd64,i386 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/graphics/ubuntu jammy arc" | tee /etc/apt/sources.list.d/intel.gpu.jammy.list &&     chmod 644 /usr/share/keyrings/intel-graphics.gpg:
0.295 --2024-07-19 19:10:44--  https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
0.297 Resolving apt.repos.intel.com (apt.repos.intel.com)... 23.59.151.15, 2600:1409:9800:1689::a87, 2600:1409:9800:168c::a87
0.333 Connecting to apt.repos.intel.com (apt.repos.intel.com)|23.59.151.15|:443... connected.
0.350 HTTP request sent, awaiting response... 200 OK
0.375 Length: 4738 (4.6K) [application/vnd.exstream-package]
0.375 Saving to: ‘STDOUT’
0.375
0.375      0K ....                                                  100% 1.53G=0s
0.375
0.375 2024-07-19 19:10:44 (1.53 GB/s) - written to stdout [4738/4738]
0.375
0.380 deb [signed-by=/usr/share/keyrings/intel-oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main
0.385 rm: cannot remove '/etc/apt/sources.list.d/intel-graphics.list': No such file or directory
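
The log stops at the `rm` error, and that `rm` sits in the middle of an `&&` chain, so a missing file is enough to abort the whole `RUN` step. The sketch below reproduces that behavior in isolation and shows how `rm -f` differs. It is an illustration under assumptions, not the fix that was applied upstream: the temporary directory is a hypothetical stand-in for `/etc/apt/sources.list.d`.

```shell
#!/bin/sh
# Scratch directory standing in for /etc/apt/sources.list.d (assumption).
workdir="$(mktemp -d)"
missing="$workdir/intel-graphics.list"   # deliberately never created

# Plain rm exits non-zero on a missing file, which breaks an `&&` chain.
plain_status=0
rm "$missing" 2>/dev/null || plain_status=$?

# rm -f ignores a missing file and exits 0, so the chain would continue.
force_status=0
rm -f "$missing" || force_status=$?

echo "plain rm exit status: $plain_status"
echo "rm -f exit status:    $force_status"

rmdir "$workdir"
```

Whether `rm -f` is the right change for `Dockerfile.xpu` depends on why the file disappeared from the base image; if the base image no longer ships that list at all, dropping the `rm` may be cleaner.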

liuxingbin commented 3 months ago

Have you solved the problem? I'm hitting the same one.

github-actions[bot] commented 1 week ago

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!