ROCm / hipBLASLt

hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditional BLAS library
https://rocm.docs.amd.com/projects/hipBLASLt/en/latest/index.html
MIT License
40 stars 57 forks source link

[Issue]: BrokenPipeError: [Errno 32] Broken pipe #854

Closed unclemusclez closed 1 week ago

unclemusclez commented 1 week ago

Problem Description

Reading logic files: Launching 16 threads...

Killed
make[2]: *** [library/CMakeFiles/TENSILE_LIBRARY_TARGET.dir/build.make:74: Tensile/library/TensileManifest.txt] Error 137
make[2]: Leaving directory '/home/musclez/hipBLASLt/build/release'
make[1]: *** [CMakeFiles/Makefile2:284: library/CMakeFiles/TENSILE_LIBRARY_TARGET.dir/all] Error 2
make[1]: Leaving directory '/home/musclez/hipBLASLt/build/release'
make: *** [Makefile:166: all] Error 2

Process ForkPoolWorker-8:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 131, in worker
    put((job, i, result))
  File "/usr/lib/python3.10/multiprocessing/queues.py", line 377, in put
    self._writer.send_bytes(obj)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 404, in _send_bytes
    self._send(header)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

Operating System

WSL2 Ubuntu 22.04 Windows 11

CPU

7800x3D

GPU

AMD Radeon RX 7900 XT

Other

ROCm 6.1.3

ROCm Version

ROCm 5.7.1

ROCm Component

hipBLASLt

Steps to Reproduce

./install.sh -idc

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

===================== HSA System Attributes

Runtime Version: 1.1 System Timestamp Freq.: 1000.000000MHz Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) Machine Model: LARGE System Endianness: LITTLE Mwaitx: DISABLED DMAbuf Support: NO

========== HSA Agents


Agent 1


Name: CPU Uuid: CPU-XX Marketing Name: CPU Vendor Name: CPU Feature: None specified Profile: FULL_PROFILE Float Round Mode: NEAR Max Queue Number: 0(0x0) Queue Min Size: 0(0x0) Queue Max Size: 0(0x0) Queue Type: MULTI Node: 0 Device Type: CPU Cache Info: Chip ID: 0(0x0) Cacheline Size: 64(0x40) Internal Node ID: 0 Compute Unit: 16 SIMDs per CU: 0 Shader Engines: 0 Shader Arrs. per Eng.: 0 Features: None Pool Info: Pool 1 Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED Size: 32428304(0x1eed110) KB Allocatable: TRUE Alloc Granule: 4KB Alloc Recommended Granule:4KB Alloc Alignment: 4KB Accessible by all: TRUE Pool 2 Segment: GLOBAL; FLAGS: COARSE GRAINED Size: 32428304(0x1eed110) KB Allocatable: TRUE Alloc Granule: 4KB Alloc Recommended Granule:4KB Alloc Alignment: 4KB Accessible by all: TRUE ISA Info:


Agent 2


Name: gfx1100 Marketing Name: AMD Radeon RX 7900 XT Vendor Name: AMD Feature: KERNEL_DISPATCH Profile: BASE_PROFILE Float Round Mode: NEAR Max Queue Number: 16(0x10) Queue Min Size: 4096(0x1000) Queue Max Size: 131072(0x20000) Queue Type: MULTI Node: 1 Device Type: GPU Cache Info: L1: 32(0x20) KB L2: 6144(0x1800) KB L3: 81920(0x14000) KB Chip ID: 29772(0x744c) Cacheline Size: 64(0x40) Max Clock Freq. (MHz): 2219 Internal Node ID: 1 Compute Unit: 84 SIMDs per CU: 2 Shader Engines: 6 Shader Arrs. per Eng.: 2 Coherent Host Access: FALSE Features: KERNEL_DISPATCH Fast F16 Operation: TRUE Wavefront Size: 32(0x20) Workgroup Max Size: 1024(0x400) Workgroup Max Size per Dimension: x 1024(0x400) y 1024(0x400) z 1024(0x400) Max Waves Per CU: 32(0x20) Max Work-item Per CU: 1024(0x400) Grid Max Size: 4294967295(0xffffffff) Grid Max Size per Dimension: x 4294967295(0xffffffff) y 4294967295(0xffffffff) z 4294967295(0xffffffff) Max fbarriers/Workgrp: 32 Packet Processor uCode:: 2250 SDMA engine uCode:: 20 IOMMU Support:: None Pool Info: Pool 1 Segment: GLOBAL; FLAGS: COARSE GRAINED Size: 20906152(0x13f00a8) KB Allocatable: TRUE Alloc Granule: 4KB Alloc Recommended Granule:2048KB Alloc Alignment: 4KB Accessible by all: FALSE Pool 2 Segment: GROUP Size: 64(0x40) KB Allocatable: FALSE Alloc Granule: 0KB Alloc Recommended Granule:0KB Alloc Alignment: 0KB Accessible by all: FALSE ISA Info: ISA 1 Name: amdgcn-amd-amdhsa--gfx1100 Machine Models: HSA_MACHINE_MODEL_LARGE Profiles: HSA_PROFILE_BASE Default Rounding Mode: NEAR Default Rounding Mode: NEAR Fast f16: TRUE Workgroup Max Size: 1024(0x400) Workgroup Max Size per Dimension: x 1024(0x400) y 1024(0x400) z 1024(0x400) Grid Max Size: 4294967295(0xffffffff) Grid Max Size per Dimension: x 4294967295(0xffffffff) y 4294967295(0xffffffff) z 4294967295(0xffffffff) FBarrier Max Size: 32 Done

Additional Information

No response

unclemusclez commented 1 week ago

adding --architecture 'gfx1100' ie: ./install.sh -idc --architecture 'gfx1100'