Open sorasoras opened 1 month ago
Could you compile aphrodite from source? I don't think there's a package for rocm. Here's how to do it.
I did try to compile, but the patch doesn't seem to work.
What error did you get?
EDIT: Sorry, I have the stupid and didn't notice you gave build logs
Your pytorch is way too out of date. Could you update to 2.5?
I updated it to PyTorch 2.5, and now it shows:
~/aphrodite-engine$ ./amdpatch.sh
patching file /opt/rocm/lib/llvm/lib/clang/18/include/__clang_hip_cmath.h
patch unexpectedly ends in middle of line
Hunk #1 FAILED at 397.
1 out of 1 hunk FAILED -- saving rejects to file /opt/rocm/lib/llvm/lib/clang/18/include/__clang_hip_cmath.h.rej
patch: **** Can't reopen file /opt/rocm/lib/llvm/lib/clang/18/include/__clang_hip_cmath.h : No such file or directory
So first, check whether aphrodite compiles. If it fails with something like "test is ambiguous", you will need to apply the patch yourself. It would be nice if you could tell me where __clang_hip_cmath.h is, since I can't test ROCm 6.1 (slow internet); then I could update the script.
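For anyone else who needs to report the header location, here is a minimal sketch of one way to find it. The function name `find_headers` is made up for illustration, and the search root is just the directory mentioned in this thread; your ROCm install may live elsewhere.

```python
from pathlib import Path


def find_headers(root, name="__clang_hip_cmath.h"):
    """Return every regular file called `name` below `root`, sorted by path.

    `root` is an assumption: pass whatever directory your ROCm install uses.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    # rglob matches the exact file name, so leftovers such as
    # __clang_hip_cmath.h.rej or .orig are not reported.
    return sorted(p for p in root.rglob(name) if p.is_file())


if __name__ == "__main__":
    for path in find_headers("/opt/rocm/lib/llvm/lib/clang"):
        print(path)
```

The printed path is what the patch command needs as its first argument.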
I cannot compile it, and when I check that exact folder:
root@x:/opt/rocm/lib/llvm/lib/clang/18/include# dir
__clang_hip_cmath.h.orig  __clang_hip_cmath.h.rej
There is no __clang_hip_cmath.h under that folder.
Is there a different version of clang? Like under /opt/rocm/lib/llvm/lib/clang/
clang 17 should work. It looks like 17 is the only one that exists.
Does this work?
sudo patch /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h ./patches/amd.patch
Not quite.
cd aphrodite-engine/
root@SORANET:~/aphrodite-engine# sudo patch /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h ./patches/amd.patch
patching file /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h
patch unexpectedly ends in middle of line
Hunk #1 succeeded at 397 with fuzz 2.
And I tried to run it anyway.
export PYTORCH_ROCM_ARCH=gfx1100
python3 setup.py develop
running develop
/usr/lib/python3/dist-packages/setuptools/command/easy_install.py:158: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
/usr/lib/python3/dist-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running egg_info
writing aphrodite_engine.egg-info/PKG-INFO
writing dependency_links to aphrodite_engine.egg-info/dependency_links.txt
writing entry points to aphrodite_engine.egg-info/entry_points.txt
writing requirements to aphrodite_engine.egg-info/requires.txt
writing top-level names to aphrodite_engine.egg-info/top_level.txt
reading manifest file 'aphrodite_engine.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'aphrodite_engine.egg-info/SOURCES.txt'
running build_ext
-- Build type: RelWithDebInfo
-- Target device: cuda
-- Found python matching: /usr/bin/python3.
Building PyTorch for GPU arch: gfx1100
HIP VERSION: 6.1.40093-bd86f1708
-- Caffe2: Header version is: 6.1.3
***** ROCm version from rocm_version.h ****
ROCM_VERSION_DEV: 6.1.3
ROCM_VERSION_DEV_MAJOR: 6
ROCM_VERSION_DEV_MINOR: 1
ROCM_VERSION_DEV_PATCH: 3
ROCM_VERSION_DEV_INT: 60103
HIP_VERSION_MAJOR: 6
HIP_VERSION_MINOR: 1
TORCH_HIP_VERSION: 601
***** Library versions from dpkg *****
rocm-developer-tools VERSION: 6.1.3.60103-122~22.04
rocm-device-libs VERSION: 1.0.0.60103-122~22.04
hsakmt-roct-dev VERSION: 20240125.5.08.60103-122~22.04
hsa-rocr-dev VERSION: 1.13.0.60103-122~22.04
***** Library versions from cmake find_package *****
CMake Error at /opt/rocm/lib/cmake/hsa-runtime64/hsa-runtime64Targets.cmake:80 (message):
The imported target "hsa-runtime64::hsa-runtime64" references the file
"/opt/rocm/lib/libhsa-runtime64.so.1.13.60103"
but this file does not exist. Possible reasons include:
* The file was deleted, renamed, or moved to another location.
* An install or uninstall procedure did not complete successfully.
* The installation package was faulty and contained
"/opt/rocm/lib/cmake/hsa-runtime64/hsa-runtime64Targets.cmake"
but not all the files it references.
Call Stack (most recent call first):
/opt/rocm/lib/cmake/hsa-runtime64/hsa-runtime64-config.cmake:82 (include)
/usr/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.30/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
/opt/rocm/lib/cmake/hip/hip-config-amd.cmake:108 (find_dependency)
/opt/rocm/lib/cmake/hip/hip-config.cmake:149 (include)
/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/public/LoadHIP.cmake:36 (find_package)
/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/public/LoadHIP.cmake:152 (find_package_and_print_version)
/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:74 (include)
/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:67 (find_package)
-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
File "/root/aphrodite-engine/setup.py", line 460, in <module>
setup(
File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/usr/lib/python3.10/distutils/core.py", line 148, in setup
dist.run_commands()
File "/usr/lib/python3.10/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/setuptools/command/develop.py", line 34, in run
self.install_for_development()
File "/usr/lib/python3/dist-packages/setuptools/command/develop.py", line 114, in install_for_development
self.run_command('build_ext')
File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 79, in run
_build_ext.run(self)
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 340, in run
self.build_extensions()
File "/root/aphrodite-engine/setup.py", line 210, in build_extensions
self.configure(ext)
File "/root/aphrodite-engine/setup.py", line 193, in configure
subprocess.check_call(
File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '/root/aphrodite-engine', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/root/aphrodite-engine/build/lib.linux-x86_64-3.10/aphrodite', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=build/temp.linux-x86_64-3.10', '-DAPHRODITE_TARGET_DEVICE=cuda', '-DAPHRODITE_PYTHON_EXECUTABLE=/usr/bin/python3', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=32']' returned non-zero exit status 1.
So the patch actually works, but you're missing a different file for some reason. I'm not sure if it's a WSL thing. I guess you could try this, though I have no idea if it will work.
location=`pip show torch | grep Location | awk -F ": " '{print $2}'`
cd ${location}/torch/lib/
rm libhsa-runtime64.so*
cp /opt/rocm/lib/libhsa-runtime64.so.1.2 libhsa-runtime64.so
Honestly, I don't recommend using WSL, and if you can, you should probably just dual-boot Linux for ROCm.
There is a workaround for this. https://github.com/ROCm/ROCm/issues/3606
cd /opt/rocm/lib/
ln -s libhsa-runtime64.so.1.2 libhsa-runtime64.so.1.13.60103
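If you would rather not overwrite anything by accident, the same symlink step can be sketched with a couple of guards. This is only an illustration: `link_missing_library` is a hypothetical helper, and the file names are the ones from the CMake error and the `ln -s` line above.

```python
import os
from pathlib import Path


def link_missing_library(lib_dir, wanted, existing):
    """Create lib_dir/wanted as a relative symlink to `existing`.

    Returns True if a link was created and False if something is already
    present at lib_dir/wanted (nothing is changed in that case).
    Raises FileNotFoundError if lib_dir/existing does not exist.
    """
    lib_dir = Path(lib_dir)
    link = lib_dir / wanted
    if link.exists() or link.is_symlink():
        return False
    if not (lib_dir / existing).exists():
        raise FileNotFoundError(lib_dir / existing)
    # A relative target mirrors `ln -s libhsa-runtime64.so.1.2 <wanted>`
    # run from inside lib_dir.
    os.symlink(existing, link)
    return True
```

Calling it as `link_missing_library("/opt/rocm/lib", "libhsa-runtime64.so.1.13.60103", "libhsa-runtime64.so.1.2")` would need root, same as the shell version.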
I ran into an issue during compilation:
python3 setup.py develop
running develop
/usr/lib/python3/dist-packages/setuptools/command/easy_install.py:158: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
/usr/lib/python3/dist-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running egg_info
writing aphrodite_engine.egg-info/PKG-INFO
writing dependency_links to aphrodite_engine.egg-info/dependency_links.txt
writing entry points to aphrodite_engine.egg-info/entry_points.txt
writing requirements to aphrodite_engine.egg-info/requires.txt
writing top-level names to aphrodite_engine.egg-info/top_level.txt
reading manifest file 'aphrodite_engine.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'aphrodite_engine.egg-info/SOURCES.txt'
running build_ext
-- Build type: RelWithDebInfo
-- Target device: cuda
-- Found python matching: /usr/bin/python3.
Building PyTorch for GPU arch: gfx1100
HIP VERSION: 6.1.40093-bd86f1708
-- Caffe2: Header version is: 6.1.3
***** ROCm version from rocm_version.h ****
ROCM_VERSION_DEV: 6.1.3
ROCM_VERSION_DEV_MAJOR: 6
ROCM_VERSION_DEV_MINOR: 1
ROCM_VERSION_DEV_PATCH: 3
ROCM_VERSION_DEV_INT: 60103
HIP_VERSION_MAJOR: 6
HIP_VERSION_MINOR: 1
TORCH_HIP_VERSION: 601
***** Library versions from dpkg *****
rocm-developer-tools VERSION: 6.1.3.60103-122~22.04
rocm-device-libs VERSION: 1.0.0.60103-122~22.04
hsakmt-roct-dev VERSION: 20240125.5.08.60103-122~22.04
hsa-rocr-dev VERSION: 1.13.0.60103-122~22.04
***** Library versions from cmake find_package *****
hip VERSION: 6.1.40093
hsa-runtime64 VERSION: 1.13.60103
amd_comgr VERSION: 2.7.0
rocrand VERSION: 3.0.1
hiprand VERSION: 2.10.16
rocblas VERSION: 4.1.2
hipblas VERSION: 2.1.0
hipblaslt VERSION: 0.7.0
miopen VERSION: 3.1.0
hipfft VERSION: 1.0.14
hipsparse VERSION: 3.0.1
rccl VERSION: 2.18.6
rocprim VERSION: 3.1.0
hipcub VERSION: 3.1.0
rocthrust VERSION: 3.0.1
hipsolver VERSION: 2.1.1
CMake Deprecation Warning at /opt/rocm/lib/cmake/hiprtc/hiprtc-config.cmake:21 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/public/LoadHIP.cmake:36 (find_package)
/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/public/LoadHIP.cmake:168 (find_package_and_print_version)
/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:74 (include)
/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:67 (find_package)
hiprtc VERSION: 6.1.40093
HIP is using new type enums
CMake Warning at /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:120 (append_torchlib_if_found)
CMakeLists.txt:67 (find_package)
-- Enabling core extension.
-- HIP supported arches: gfx906;gfx908;gfx90a;gfx940;gfx941;gfx942;gfx1030;gfx1100
-- HIP target arches: gfx1100
-- Enabling C extension.
-- Enabling moe extension.
-- Configuring done (6.9s)
-- Generating done (0.4s)
-- Build files have been written to: /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10
[1/19] Running hipify on _C extension source files.
FAILED: CMakeFiles/hipify_C kernels/cache_kernels.hip kernels/attention/attention_kernels.hip kernels/pos_encoding_kernels.hip kernels/activation_kernels.hip kernels/layernorm_kernels.hip kernels/quantization/squeezellm/quant_hip_kernel.hip kernels/quantization/gptq/q_gemm.hip kernels/quantization/compressed_tensors/int8_quant_kernels.hip kernels/quantization/fp8/common.hip kernels/hip_utils_kernels.hip kernels/moe/align_block_size_kernel.hip kernels/prepare_inputs/advance_step.hip /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/CMakeFiles/hipify_C /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels/cache_kernels.hip /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels/attention/attention_kernels.hip /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels/pos_encoding_kernels.hip /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels/activation_kernels.hip /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels/layernorm_kernels.hip /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels/quantization/squeezellm/quant_hip_kernel.hip /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels/quantization/gptq/q_gemm.hip /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels/quantization/compressed_tensors/int8_quant_kernels.hip /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels/quantization/fp8/common.hip /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels/hip_utils_kernels.hip /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels/moe/align_block_size_kernel.hip /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels/prepare_inputs/advance_step.hip
cd /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10 && /mnt/c/WSL2-Distros/aphrodite-engine/cmake/hipify.py -p /mnt/c/WSL2-Distros/aphrodite-engine/kernels -o /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels kernels/cache_kernels.cu kernels/attention/attention_kernels.cu kernels/pos_encoding_kernels.cu kernels/activation_kernels.cu kernels/layernorm_kernels.cu kernels/quantization/squeezellm/quant_cuda_kernel.cu kernels/quantization/gptq/q_gemm.cu kernels/quantization/compressed_tensors/int8_quant_kernels.cu kernels/quantization/fp8/common.cu kernels/cuda_utils_kernels.cu kernels/moe/align_block_size_kernel.cu kernels/prepare_inputs/advance_step.cu
/usr/bin/env: ‘python3\r’: No such file or directory
[2/19] Running hipify on _moe_C extension source files.
FAILED: CMakeFiles/hipify_moe_C kernels/moe/softmax.hip /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/CMakeFiles/hipify_moe_C /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels/moe/softmax.hip
cd /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10 && /mnt/c/WSL2-Distros/aphrodite-engine/cmake/hipify.py -p /mnt/c/WSL2-Distros/aphrodite-engine/kernels -o /mnt/c/WSL2-Distros/aphrodite-engine/build/temp.linux-x86_64-3.10/kernels kernels/moe/softmax.cu
/usr/bin/env: ‘python3\r’: No such file or directory
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/mnt/c/WSL2-Distros/aphrodite-engine/setup.py", line 460, in <module>
setup(
File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/usr/lib/python3.10/distutils/core.py", line 148, in setup
dist.run_commands()
File "/usr/lib/python3.10/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/setuptools/command/develop.py", line 34, in run
self.install_for_development()
File "/usr/lib/python3/dist-packages/setuptools/command/develop.py", line 114, in install_for_development
self.run_command('build_ext')
File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 79, in run
_build_ext.run(self)
File "/usr/lib/python3.10/distutils/command/build_ext.py", line 340, in run
self.build_extensions()
File "/mnt/c/WSL2-Distros/aphrodite-engine/setup.py", line 222, in build_extensions
subprocess.check_call(["cmake", *build_args], cwd=self.build_temp)
File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '-j=32', '--target=_core_C', '--target=_moe_C', '--target=_C']' returned non-zero exit status 1.
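The `python3\r` in the hipify errors above usually means the script's first line ends in a Windows-style CRLF, so `env` looks for an interpreter literally named `python3\r`. That seems plausible here because the checkout sits under /mnt/c, but it is an assumption the log alone does not prove. A minimal sketch to check and convert a file follows; both function names are made up for illustration.

```python
from pathlib import Path


def has_crlf_shebang(path):
    """True if the first line is a shebang that ends with CRLF."""
    with open(path, "rb") as fh:
        first = fh.readline()
    return first.startswith(b"#!") and first.endswith(b"\r\n")


def crlf_to_lf(path):
    """Rewrite `path` with LF line endings; return True if it changed."""
    data = Path(path).read_bytes()
    fixed = data.replace(b"\r\n", b"\n")
    if fixed == data:
        return False
    # Writing in place keeps the existing file mode (e.g. the executable bit).
    Path(path).write_bytes(fixed)
    return True
```

If `has_crlf_shebang("cmake/hipify.py")` returns True, the other scripts in the checkout probably have the same endings, so re-cloning inside the Linux filesystem may be simpler than converting files one by one.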
Your current environment
python env.py
A module that was compiled using NumPy 1.x cannot be run in NumPy 2.1.2 as it may crash. To support both 1.x and 2.x versions of NumPy, modules must be compiled with NumPy 2.0. Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to downgrade to 'numpy<2' or try to upgrade the affected module. We expect that some modules will need time to support NumPy 2.
Traceback (most recent call last):
File "/home/sora/aphrodite-engine/env.py", line 17, in <module>
import torch
File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 1382, in <module>
from .functional import * # noqa: F403
File "/usr/local/lib/python3.10/dist-packages/torch/functional.py", line 7, in <module>
import torch.nn.functional as F
File "/usr/local/lib/python3.10/dist-packages/torch/nn/__init__.py", line 1, in <module>
from .modules import * # noqa: F403
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/__init__.py", line 35, in <module>
from .transformer import TransformerEncoder, TransformerDecoder, \
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py", line 20, in <module>
device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
Collecting environment information...
/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
PyTorch version: 2.1.2+rocm6.1.3
Is debug build: False
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: 6.1.40093-bd86f1708
OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.30.4
Libc version: glibc-2.35
Python version: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.5.119
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: AMD Radeon RX 7900 XTXNoGCNArchNameOnOldPyTorch
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: 6.1.40093
MIOpen runtime version: 3.1.0
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 7950X3D 16-Core Processor
CPU family: 25
Model: 97
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 2
BogoMIPS: 8399.84
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm
Virtualization: AMD-V
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 512 KiB (16 instances)
L1i cache: 512 KiB (16 instances)
L2 cache: 16 MiB (16 instances)
L3 cache: 96 MiB (1 instance)
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec rstack overflow: Mitigation; safe RET
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Versions of relevant libraries:
[pip3] numpy==2.1.2
[pip3] pytorch-triton-rocm==2.1.0+rocm6.1.3.4d510c3a44
[pip3] torch==2.1.2+rocm6.1.3
[pip3] torchvision==0.16.1+rocm6.1.3
[conda] Could not collect
ROCM Version: 6.1.40093-bd86f1708
Neuron SDK Version: N/A
Aphrodite Version: N/A
Aphrodite Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect
root@SORANET:/home/sora/aphrodite-engine# sudo python env.py
A module that was compiled using NumPy 1.x cannot be run in NumPy 2.1.2 as it may crash. To support both 1.x and 2.x versions of NumPy, modules must be compiled with NumPy 2.0. Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to downgrade to 'numpy<2' or try to upgrade the affected module. We expect that some modules will need time to support NumPy 2.
Traceback (most recent call last):
File "/home/sora/aphrodite-engine/env.py", line 17, in <module>
import torch
File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 1382, in <module>
from .functional import * # noqa: F403
File "/usr/local/lib/python3.10/dist-packages/torch/functional.py", line 7, in <module>
import torch.nn.functional as F
File "/usr/local/lib/python3.10/dist-packages/torch/nn/__init__.py", line 1, in <module>
from .modules import * # noqa: F403
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/__init__.py", line 35, in <module>
from .transformer import TransformerEncoder, TransformerDecoder, \
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py", line 20, in <module>
device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
Collecting environment information...
/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
PyTorch version: 2.1.2+rocm6.1.3
Is debug build: False
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: 6.1.40093-bd86f1708
OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.30.4
Libc version: glibc-2.35
Python version: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.5.119
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: AMD Radeon RX 7900 XTXNoGCNArchNameOnOldPyTorch
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: 6.1.40093
MIOpen runtime version: 3.1.0
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 7950X3D 16-Core Processor
CPU family: 25
Model: 97
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 2
BogoMIPS: 8399.84
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm
Virtualization: AMD-V
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 512 KiB (16 instances)
L1i cache: 512 KiB (16 instances)
L2 cache: 16 MiB (16 instances)
L3 cache: 96 MiB (1 instance)
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec rstack overflow: Mitigation; safe RET
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Versions of relevant libraries:
[pip3] numpy==2.1.2
[pip3] pytorch-triton-rocm==2.1.0+rocm6.1.3.4d510c3a44
[pip3] torch==2.1.2+rocm6.1.3
[pip3] torchvision==0.16.1+rocm6.1.3
[conda] Could not collect
ROCM Version: 6.1.40093-bd86f1708
Neuron SDK Version: N/A
Aphrodite Version: N/A
Aphrodite Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect
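The NumPy warning at the top of both runs comes from a module built against NumPy 1.x running alongside numpy==2.1.2; the message itself suggests downgrading to 'numpy<2'. A tiny sketch of that check, where `numpy_too_new` is a hypothetical helper and the "built for 1.x" default is taken from the warning text rather than detected:

```python
def numpy_too_new(numpy_version, built_for_major=1):
    """True if the installed NumPy major version exceeds the one the
    compiled module was built for (assumed to be 1.x by default)."""
    major = int(numpy_version.split(".", 1)[0])
    return major > built_for_major
```

With the versions listed above, `numpy_too_new("2.1.2")` is True, which matches the warning.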
How did you install Aphrodite?
sudo apt update
wget https://repo.radeon.com/amdgpu-install/6.1.3/ubuntu/jammy/amdgpu-install_6.1.60103-1_all.deb
sudo apt install ./amdgpu-install_6.1.60103-1_all.deb
sudo amdgpu-install --list-usecase
If --usecase option is not present, the default selection is "dkms,graphics,opencl,hip"
Available use cases:
dkms (to only install the kernel mode driver)
rocminfo
HSA System Attributes
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
DMAbuf Support: NO
========== HSA Agents
Agent 1
Name: CPU
Uuid: CPU-XX
Marketing Name: CPU
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Internal Node ID: 0
Compute Unit: 32
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
Features: None
Pool Info:
  Pool 1
    Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
    Size: 49137460(0x2edc734) KB
    Allocatable: TRUE
    Alloc Granule: 4KB
    Alloc Recommended Granule:4KB
    Alloc Alignment: 4KB
    Accessible by all: TRUE
  Pool 2
    Segment: GLOBAL; FLAGS: COARSE GRAINED
    Size: 49137460(0x2edc734) KB
    Allocatable: TRUE
    Alloc Granule: 4KB
    Alloc Recommended Granule:4KB
    Alloc Alignment: 4KB
    Accessible by all: TRUE
ISA Info:
Agent 2
Name: gfx1100
Marketing Name: AMD Radeon RX 7900 XTX
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 16(0x10)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
  L1: 32(0x20) KB
  L2: 6144(0x1800) KB
  L3: 98304(0x18000) KB
Chip ID: 29772(0x744c)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2526
Internal Node ID: 1
Compute Unit: 96
SIMDs per CU: 2
Shader Engines: 6
Shader Arrs. per Eng.: 2
Coherent Host Access: FALSE
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
  x 1024(0x400)
  y 1024(0x400)
  z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
  x 4294967295(0xffffffff)
  y 4294967295(0xffffffff)
  z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 2280
SDMA engine uCode:: 21
IOMMU Support:: None
Pool Info:
  Pool 1
    Segment: GLOBAL; FLAGS: COARSE GRAINED
    Size: 25086124(0x17ec8ac) KB
    Allocatable: TRUE
    Alloc Granule: 4KB
    Alloc Recommended Granule:2048KB
    Alloc Alignment: 4KB
    Accessible by all: FALSE
  Pool 2
    Segment: GROUP
    Size: 64(0x40) KB
    Allocatable: FALSE
    Alloc Granule: 0KB
    Alloc Recommended Granule:0KB
    Alloc Alignment: 0KB
    Accessible by all: FALSE
ISA Info:
  ISA 1
    Name: amdgcn-amd-amdhsa--gfx1100
    Machine Models: HSA_MACHINE_MODEL_LARGE
    Profiles: HSA_PROFILE_BASE
    Default Rounding Mode: NEAR
    Default Rounding Mode: NEAR
    Fast f16: TRUE
    Workgroup Max Size: 1024(0x400)
    Workgroup Max Size per Dimension:
      x 1024(0x400)
      y 1024(0x400)
      z 1024(0x400)
    Grid Max Size: 4294967295(0xffffffff)
    Grid Max Size per Dimension:
      x 4294967295(0xffffffff)
      y 4294967295(0xffffffff)
      z 4294967295(0xffffffff)
    FBarrier Max Size: 32
Done
build log
rocm_gfx1100_wsl2.txt