PygmalionAI / aphrodite-engine

Large-scale LLM inference engine
https://aphrodite.pygmalion.chat
GNU Affero General Public License v3.0

[Installation]: I tried to compile for GFX1100 on WSL2, but it does not seem to work #780

Open sorasoras opened 1 week ago

sorasoras commented 1 week ago

Your current environment

The output of `python env.py`

python env.py

A module that was compiled using NumPy 1.x cannot be run in NumPy 2.1.2 as it may crash. To support both 1.x and 2.x versions of NumPy, modules must be compiled with NumPy 2.0. Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to downgrade to 'numpy<2' or try to upgrade the affected module. We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "/home/sora/aphrodite-engine/env.py", line 17, in <module>
    import torch
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 1382, in <module>
    from .functional import *  # noqa: F403
  File "/usr/local/lib/python3.10/dist-packages/torch/functional.py", line 7, in <module>
    import torch.nn.functional as F
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/__init__.py", line 1, in <module>
    from .modules import *  # noqa: F403
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/__init__.py", line 35, in <module>
    from .transformer import TransformerEncoder, TransformerDecoder, \
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py", line 20, in <module>
    device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
  device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
Collecting environment information...
/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
PyTorch version: 2.1.2+rocm6.1.3
Is debug build: False
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: 6.1.40093-bd86f1708

OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.30.4
Libc version: glibc-2.35

Python version: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.5.119
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: AMD Radeon RX 7900 XTXNoGCNArchNameOnOldPyTorch
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: 6.1.40093
MIOpen runtime version: 3.1.0
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 7950X3D 16-Core Processor
CPU family: 25
Model: 97
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 2
BogoMIPS: 8399.84
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm
Virtualization: AMD-V
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 512 KiB (16 instances)
L1i cache: 512 KiB (16 instances)
L2 cache: 16 MiB (16 instances)
L3 cache: 96 MiB (1 instance)
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec rstack overflow: Mitigation; safe RET
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Versions of relevant libraries:
[pip3] numpy==2.1.2
[pip3] pytorch-triton-rocm==2.1.0+rocm6.1.3.4d510c3a44
[pip3] torch==2.1.2+rocm6.1.3
[pip3] torchvision==0.16.1+rocm6.1.3
[conda] Could not collect
ROCM Version: 6.1.40093-bd86f1708
Neuron SDK Version: N/A
Aphrodite Version: N/A
Aphrodite Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect
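The NumPy warning at the top of this output says a module compiled against NumPy 1.x is being loaded under NumPy 2.1.2, and suggests 'numpy<2' as the easiest fix. A minimal sketch of that check follows; the helper name `numpy_is_1x` is made up for illustration and is not part of Aphrodite or env.py.

```shell
# Hypothetical helper: succeed only for 1.x version strings such as "1.26.4".
numpy_is_1x() {
    case "$1" in
        1.*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Example use (not executed here): pin NumPy below 2 only when needed.
#   ver=$(python3 -c 'import numpy; print(numpy.__version__)')
#   numpy_is_1x "$ver" || pip install 'numpy<2'
```

This only addresses the NumPy warning; it is not expected to fix the ROCm build problem discussed in the rest of the thread.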

How did you install Aphrodite?

pip install aphrodite-engine

sudo apt update
wget https://repo.radeon.com/amdgpu-install/6.1.3/ubuntu/jammy/amdgpu-install_6.1.60103-1_all.deb
sudo apt install ./amdgpu-install_6.1.60103-1_all.deb

sudo amdgpu-install --list-usecase

If --usecase option is not present, the default selection is "dkms,graphics,opencl,hip"
Available use cases:
dkms (to only install the kernel mode driver)

rocminfo

HSA System Attributes

Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
DMAbuf Support: NO

========== HSA Agents


Agent 1


Name: CPU
Uuid: CPU-XX
Marketing Name: CPU
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Internal Node ID: 0
Compute Unit: 32
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
Features: None
Pool Info:
  Pool 1
    Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
    Size: 49137460(0x2edc734) KB
    Allocatable: TRUE
    Alloc Granule: 4KB
    Alloc Recommended Granule:4KB
    Alloc Alignment: 4KB
    Accessible by all: TRUE
  Pool 2
    Segment: GLOBAL; FLAGS: COARSE GRAINED
    Size: 49137460(0x2edc734) KB
    Allocatable: TRUE
    Alloc Granule: 4KB
    Alloc Recommended Granule:4KB
    Alloc Alignment: 4KB
    Accessible by all: TRUE
ISA Info:


Agent 2


Name: gfx1100
Marketing Name: AMD Radeon RX 7900 XTX
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 16(0x10)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
  L1: 32(0x20) KB
  L2: 6144(0x1800) KB
  L3: 98304(0x18000) KB
Chip ID: 29772(0x744c)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2526
Internal Node ID: 1
Compute Unit: 96
SIMDs per CU: 2
Shader Engines: 6
Shader Arrs. per Eng.: 2
Coherent Host Access: FALSE
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
  x 1024(0x400)
  y 1024(0x400)
  z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
  x 4294967295(0xffffffff)
  y 4294967295(0xffffffff)
  z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 2280
SDMA engine uCode:: 21
IOMMU Support:: None
Pool Info:
  Pool 1
    Segment: GLOBAL; FLAGS: COARSE GRAINED
    Size: 25086124(0x17ec8ac) KB
    Allocatable: TRUE
    Alloc Granule: 4KB
    Alloc Recommended Granule:2048KB
    Alloc Alignment: 4KB
    Accessible by all: FALSE
  Pool 2
    Segment: GROUP
    Size: 64(0x40) KB
    Allocatable: FALSE
    Alloc Granule: 0KB
    Alloc Recommended Granule:0KB
    Alloc Alignment: 0KB
    Accessible by all: FALSE
ISA Info:
  ISA 1
    Name: amdgcn-amd-amdhsa--gfx1100
    Machine Models: HSA_MACHINE_MODEL_LARGE
    Profiles: HSA_PROFILE_BASE
    Default Rounding Mode: NEAR
    Default Rounding Mode: NEAR
    Fast f16: TRUE
    Workgroup Max Size: 1024(0x400)
    Workgroup Max Size per Dimension:
      x 1024(0x400)
      y 1024(0x400)
      z 1024(0x400)
    Grid Max Size: 4294967295(0xffffffff)
    Grid Max Size per Dimension:
      x 4294967295(0xffffffff)
      y 4294967295(0xffffffff)
      z 4294967295(0xffffffff)
    FBarrier Max Size: 32
Done
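The gfx target reported by rocminfo is the value later exported as PYTORCH_ROCM_ARCH. As a rough sketch, it can be pulled out of the rocminfo output with a small filter; the function name `gfx_archs` is hypothetical and only used here for illustration.

```shell
# Hypothetical helper: read rocminfo output on stdin and print each
# distinct gfx target (for example gfx1100) once.
gfx_archs() {
    grep -oE 'gfx[0-9a-f]+' | sort -u
}

# Example use (not executed here):
#   rocminfo | gfx_archs
```

On a machine with a single GPU this should print one line, which can then be used as the value of PYTORCH_ROCM_ARCH.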

build log

rocm_gfx1100_wsl2.txt

Naomiusearch commented 6 days ago

Could you compile aphrodite from source? I don't think there's a package for rocm. Here's how to do it.

sorasoras commented 5 days ago

> Could you compile aphrodite from source? I don't think there's a package for rocm. Here's how to do it.

I did try to compile, but the patch doesn't seem to work.

Naomiusearch commented 5 days ago

What error did you get?

EDIT: Sorry, I have the stupid and didn't notice you gave build logs

Naomiusearch commented 5 days ago

Your pytorch is way too out of date. Could you update to 2.5?

sorasoras commented 5 days ago

> Your pytorch is way too out of date. Could you update to 2.5?

I updated it to PyTorch 2.5, and now it shows:

~/aphrodite-engine$ ./amdpatch.sh
patching file /opt/rocm/lib/llvm/lib/clang/18/include/__clang_hip_cmath.h
patch unexpectedly ends in middle of line
Hunk #1 FAILED at 397.
1 out of 1 hunk FAILED -- saving rejects to file /opt/rocm/lib/llvm/lib/clang/18/include/__clang_hip_cmath.h.rej
patch: **** Can't reopen file /opt/rocm/lib/llvm/lib/clang/18/include/__clang_hip_cmath.h : No such file or directory

Naomiusearch commented 2 days ago

So first check whether aphrodite compiles. If it fails with something like "test is ambiguous", you will need to patch it yourself. It would be nice if you could tell me where __clang_hip_cmath.h is, since I can't test ROCm 6.1 (slow internet); then I could edit the script.

sorasoras commented 2 days ago

> So first check whether aphrodite compiles. If it fails with something like "test is ambiguous", you will need to patch it yourself. It would be nice if you could tell me where __clang_hip_cmath.h is, since I can't test ROCm 6.1 (slow internet); then I could edit the script.

I cannot compile it, and this is what I see when I check that exact folder:

root@x:/opt/rocm/lib/llvm/lib/clang/18/include# dir
__clang_hip_cmath.h.orig  __clang_hip_cmath.h.rej

There is no __clang_hip_cmath.h in that folder.

Naomiusearch commented 2 days ago

Is there a different version of clang? For example, under /opt/rocm/lib/llvm/lib/clang/
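One way to answer this without guessing the version number is to search the whole clang directory for the header. This is only a sketch: the function name `find_hip_cmath` is made up for illustration, and the default path is the one quoted in this thread.

```shell
# Hypothetical helper: list every __clang_hip_cmath.h below a clang root.
# Leftover .orig/.rej files from a failed patch are not matched.
find_hip_cmath() {
    clang_root="${1:-/opt/rocm/lib/llvm/lib/clang}"
    find "$clang_root" -type f -name '__clang_hip_cmath.h' 2>/dev/null
}

# Example use (not executed here):
#   find_hip_cmath
```

Each printed path contains the clang version directory, so the patch script can be pointed at whichever version is actually installed.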

sorasoras commented 1 day ago

> Is there different version of clang? Like under /opt/rocm/lib/llvm/lib/clang/

clang 17 should work. It looks like 17 is the only version that exists.

Naomiusearch commented 1 day ago

Does this work?

sudo patch /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h ./patches/amd.patch

sorasoras commented 23 hours ago

> Does this work? sudo patch /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h ./patches/amd.patch

Not quite.

cd aphrodite-engine/

root@SORANET:~/aphrodite-engine# sudo patch /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h ./patches/amd.patch
patching file /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h
patch unexpectedly ends in middle of line
Hunk #1 succeeded at 397 with fuzz 2.
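The "patch unexpectedly ends in middle of line" message commonly appears when the patch file's last line has no terminating newline; that this is the cause here is an assumption, not something the thread confirms. A small sketch for checking it, with `ends_with_newline` as a hypothetical helper name:

```shell
# Hypothetical helper: succeed if the file is non-empty and its last byte
# is a newline. Command substitution strips a trailing newline, so an empty
# result from tail means the final byte was "\n".
ends_with_newline() {
    [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

# Example use (not executed here):
#   ends_with_newline ./patches/amd.patch || echo "no trailing newline"
```

Since the hunk still applied with fuzz, the message is probably harmless in this case.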

I tried to build it anyway:

export PYTORCH_ROCM_ARCH=gfx1100

python3 setup.py develop
running develop
/usr/lib/python3/dist-packages/setuptools/command/easy_install.py:158: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/usr/lib/python3/dist-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
running egg_info
writing aphrodite_engine.egg-info/PKG-INFO
writing dependency_links to aphrodite_engine.egg-info/dependency_links.txt
writing entry points to aphrodite_engine.egg-info/entry_points.txt
writing requirements to aphrodite_engine.egg-info/requires.txt
writing top-level names to aphrodite_engine.egg-info/top_level.txt
reading manifest file 'aphrodite_engine.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'aphrodite_engine.egg-info/SOURCES.txt'
running build_ext
-- Build type: RelWithDebInfo
-- Target device: cuda
-- Found python matching: /usr/bin/python3.
Building PyTorch for GPU arch: gfx1100
HIP VERSION: 6.1.40093-bd86f1708
-- Caffe2: Header version is: 6.1.3

***** ROCm version from rocm_version.h ****

ROCM_VERSION_DEV: 6.1.3
ROCM_VERSION_DEV_MAJOR: 6
ROCM_VERSION_DEV_MINOR: 1
ROCM_VERSION_DEV_PATCH: 3
ROCM_VERSION_DEV_INT:   60103
HIP_VERSION_MAJOR: 6
HIP_VERSION_MINOR: 1
TORCH_HIP_VERSION: 601

***** Library versions from dpkg *****

rocm-developer-tools VERSION: 6.1.3.60103-122~22.04
rocm-device-libs VERSION: 1.0.0.60103-122~22.04
hsakmt-roct-dev VERSION: 20240125.5.08.60103-122~22.04
hsa-rocr-dev VERSION: 1.13.0.60103-122~22.04

***** Library versions from cmake find_package *****

CMake Error at /opt/rocm/lib/cmake/hsa-runtime64/hsa-runtime64Targets.cmake:80 (message):
  The imported target "hsa-runtime64::hsa-runtime64" references the file

     "/opt/rocm/lib/libhsa-runtime64.so.1.13.60103"

  but this file does not exist.  Possible reasons include:

  * The file was deleted, renamed, or moved to another location.

  * An install or uninstall procedure did not complete successfully.

  * The installation package was faulty and contained

     "/opt/rocm/lib/cmake/hsa-runtime64/hsa-runtime64Targets.cmake"

  but not all the files it references.

Call Stack (most recent call first):
  /opt/rocm/lib/cmake/hsa-runtime64/hsa-runtime64-config.cmake:82 (include)
  /usr/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.30/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
  /opt/rocm/lib/cmake/hip/hip-config-amd.cmake:108 (find_dependency)
  /opt/rocm/lib/cmake/hip/hip-config.cmake:149 (include)
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/public/LoadHIP.cmake:36 (find_package)
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/public/LoadHIP.cmake:152 (find_package_and_print_version)
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:74 (include)
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  CMakeLists.txt:67 (find_package)

-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
  File "/root/aphrodite-engine/setup.py", line 460, in <module>
    setup(
  File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/usr/lib/python3.10/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/lib/python3.10/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/usr/lib/python3/dist-packages/setuptools/command/develop.py", line 34, in run
    self.install_for_development()
  File "/usr/lib/python3/dist-packages/setuptools/command/develop.py", line 114, in install_for_development
    self.run_command('build_ext')
  File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/usr/lib/python3.10/distutils/command/build_ext.py", line 340, in run
    self.build_extensions()
  File "/root/aphrodite-engine/setup.py", line 210, in build_extensions
    self.configure(ext)
  File "/root/aphrodite-engine/setup.py", line 193, in configure
    subprocess.check_call(
  File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '/root/aphrodite-engine', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/root/aphrodite-engine/build/lib.linux-x86_64-3.10/aphrodite', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=build/temp.linux-x86_64-3.10', '-DAPHRODITE_TARGET_DEVICE=cuda', '-DAPHRODITE_PYTHON_EXECUTABLE=/usr/bin/python3', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=32']' returned non-zero exit status 1.

Naomiusearch commented 22 hours ago

So the patch actually works, but you're missing a different file for some reason. I'm not sure if it's a WSL thing. I guess you could try this, though I have no idea if it will work.

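# Find where pip installed torch, then replace its bundled HSA runtime with the system ROCm one.
location=$(pip show torch | grep Location | awk -F ": " '{print $2}')
cd "${location}/torch/lib/"
rm libhsa-runtime64.so*
cp /opt/rocm/lib/libhsa-runtime64.so.1.2 libhsa-runtime64.so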
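Because the commands above delete files from the torch install, a more cautious variant could move them aside instead. The following is a sketch only: `swap_hsa_runtime` and the `hsa-backup` directory name are hypothetical, and whether the swap helps at all is as uncertain as the original suggestion.

```shell
# Hypothetical helper: back up torch's bundled libhsa-runtime64 files and
# copy in the system ROCm library. Variables are plain shell variables
# (POSIX sh has no "local").
swap_hsa_runtime() {
    torch_lib="$1"   # directory holding torch's shared libraries
    rocm_so="$2"     # system library, e.g. /opt/rocm/lib/libhsa-runtime64.so.1.2
    if [ ! -f "$rocm_so" ]; then
        echo "missing: $rocm_so" >&2
        return 1
    fi
    mkdir -p "$torch_lib/hsa-backup" || return 1
    for f in "$torch_lib"/libhsa-runtime64.so*; do
        if [ -e "$f" ]; then
            mv "$f" "$torch_lib/hsa-backup/" || return 1
        fi
    done
    cp "$rocm_so" "$torch_lib/libhsa-runtime64.so"
}

# Example use (not executed here):
#   swap_hsa_runtime "${location}/torch/lib" /opt/rocm/lib/libhsa-runtime64.so.1.2
```

If the swap makes things worse, the original files can be moved back from the backup directory.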

Honestly, I don't recommend using WSL; if you can, you should probably just dual-boot Linux for ROCm.