Xilinx / Vitis_Embedded_Platform_Source

zc706 platform bug: could not allocate cl buffer larger than 8MByte #12

Closed doonny closed 4 years ago

doonny commented 4 years ago

I am testing the cl_burst_rw example from the Vitis_Accel_Examples repo on the zc706. The program cannot allocate a global buffer larger than 8 MB. For instance, when changing

#define DATA_SIZE 2048

to 4096*1024, the program fails with the following error:

Found Platform
Platform Name: Xilinx
INFO: Reading vadd.xclbin
Loading: 'vadd.xclbin'
Trying to program device[0]: edge
Device[0]: program successful!
XRT build version: 2.6.0
Build hash: 2d6bfe4ce91051d4e5b499d38fc493586dd4859a
Build date: 2020-05-28 12:55:59
Git branch: 2020.1
PID: 587
UID: 0
[Fri May 29 12:03:50 2020]
HOST: zynq-rootfs-common-2020_1
EXE: /media/sd-mmcblk0p1/cl_burst_rw/host
[XRT] ERROR: std::bad_alloc
src/host.cpp:101 Error calling cl::Buffer buffer_rw( context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, vector_size_bytes, source_inout.data(), &err), error code is: -6
[XRT] WARNING: Profiling may contain incomplete information. Please ensure all OpenCL objects are released by your host code (e.g., clReleaseProgram()).
ERROR: host run failed, RC=1
INFO: host run completed.

Error code -6 is CL_OUT_OF_HOST_MEMORY, which generally means out of memory. The same program works fine on the zcu102 board.

I have also tried the xbutil query command; the BSP info is as follows:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
System Configuration
OS name:        Linux
Release:        5.4.0-xilinx-v2020.1
Version:        #1 SMP PREEMPT Thu May 28 12:56:35 UTC 2020
Machine:        armv7l
Glibc:          2.30
Distribution:   N/A
Now:            Fri May 29 12:08:24 2020
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
XRT Information
Version:        2.6.0
Git Hash:       2d6bfe4ce91051d4e5b499d38fc493586dd4859a
Git Branch:     2020.1
Build Date:     2020-05-28 12:55:59
ZOCL:           2018.2.1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Shell                           FPGA                            IDCode
edge                            N/A                             N/A
Vendor          Device          SubDevice       SubVendor
0x10ee          N/A             N/A             N/A
DDR size        DDR count       Clock0          Clock1          Clock2
0 Byte          1               100             0               0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Memory Status
     Tag         Type        Temp(C)  Size    Mem Usage       BO count
[ 0] HP0         MEM_DRAM             1 GB    0 Byte          24
[ 1] HP1         **UNUSED**           0 Byte  0 Byte          24
[ 2] HP2         **UNUSED**           0 Byte  0 Byte          24
[ 3] HP3         **UNUSED**           0 Byte  0 Byte          24
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Streams
     Tag         Flow ID  Route ID Status   Total (B/#)     Pending (B/#)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Xclbin UUID
25fdb77d-64b2-430f-bc61-44167781caf8
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Compute Unit Status
CU[ 0]: vadd:vadd_1                     @0x80000000        (IDLE)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
INFO: xbutil query succeeded.

Note that "DDR size" is zero here, whereas on the zcu102 it is 2 GB.

There might be something wrong with the BSP that restricts the global buffer to such a small size.

imrickysu commented 4 years ago

Could you check the CMA size? cat /proc/meminfo

doonny commented 4 years ago

@imrickysu, thanks for the reply. The info is as follows:

MemTotal:        1028284 kB
MemFree:          978588 kB
MemAvailable:     988752 kB
Buffers:           12656 kB
Cached:            15068 kB
SwapCached:            0 kB
Active:            27156 kB
Inactive:           8080 kB
Active(anon):       7576 kB
Inactive(anon):      108 kB
Active(file):      19580 kB
Inactive(file):     7972 kB
Unevictable:           0 kB
Mlocked:               0 kB
HighTotal:        262144 kB
HighFree:         237936 kB
LowTotal:         766140 kB
LowFree:          740652 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                84 kB
Writeback:             0 kB
AnonPages:          7520 kB
Mapped:             4512 kB
Shmem:               176 kB
KReclaimable:       2692 kB
Slab:               7444 kB
SReclaimable:       2692 kB
SUnreclaim:         4752 kB
KernelStack:         568 kB
PageTables:          292 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:      514140 kB
Committed_AS:      24044 kB
VmallocTotal:     245760 kB
VmallocUsed:         468 kB
VmallocChunk:          0 kB
Percpu:              240 kB
CmaTotal:          16384 kB
CmaFree:           16096 kB

For reference, on the zcu102 it is:

CmaTotal:          524288 kB
CmaFree:           520812 kB

imrickysu commented 4 years ago

Yes, the CMA size is different. With only 16 MB of CMA in total, the allocator may not be able to provide a contiguous 8 MB buffer. Please try enlarging the CMA by setting the CMA size in bootargs.
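To make the comparison concrete, here is a minimal sketch that checks a requested buffer size against the `CmaFree` line of a meminfo-style file. The helper name `cma_fits` is hypothetical, and the sketch assumes the buffer is backed by CMA and that the host code uses 4-byte elements, so `DATA_SIZE = 4096*1024` would request 16 MiB. Free CMA is only an upper bound: fragmentation can make a smaller contiguous request fail as well.

```shell
#!/bin/sh
# Hypothetical helper (not part of XRT or the example code):
# report whether a buffer of <bytes> could fit in the free CMA
# listed in a meminfo-style file.
# Usage: cma_fits <bytes> [meminfo_file]
cma_fits() {
    want_bytes=$1
    meminfo=${2:-/proc/meminfo}
    # CmaFree is reported in kB; take the numeric field.
    free_kb=$(awk '/^CmaFree:/ {print $2}' "$meminfo")
    if [ -z "$free_kb" ]; then
        echo "unknown"
        return 2
    fi
    # Round the request up to whole kB before comparing.
    want_kb=$(( (want_bytes + 1023) / 1024 ))
    if [ "$want_kb" -le "$free_kb" ]; then
        echo "fits"
    else
        echo "too large"
        return 1
    fi
}
```

On the board this could be called as `cma_fits $((4096*1024*4))`. With the zc706 numbers above (16096 kB free), a 16 MiB request does not fit, while the original `DATA_SIZE` of 2048 does.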

There are a lot of ways to change CMA size. Here are some examples:
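One possibility, sketched here as an assumption rather than as the method used in this thread, is the standard Linux `cma=` kernel command-line parameter, for example appending `cma=256M` to bootargs. The value `256M` is purely illustrative, and where bootargs are set depends on the boot flow (U-Boot environment, device tree `chosen` node, or PetaLinux configuration). After rebooting, a small helper like the hypothetical `cma_bootarg` below can confirm that the parameter reached the kernel:

```shell
#!/bin/sh
# Hypothetical helper: print the value of the cma= parameter from a
# kernel command line file (defaults to /proc/cmdline).
# Example command line (illustrative only):
#   console=ttyPS0,115200 root=/dev/mmcblk0p2 rw cma=256M
cma_bootarg() {
    cmdline_file=${1:-/proc/cmdline}
    # Split the command line into one argument per line, then keep
    # only the cma= entry with its prefix stripped.
    tr ' ' '\n' < "$cmdline_file" | sed -n 's/^cma=//p'
}
```

If `cma_bootarg` prints nothing, the kernel was booted without `cma=`. The effective size can then be cross-checked with `grep Cma /proc/meminfo`.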

doonny commented 4 years ago

@imrickysu Hi, I have managed to build with the PetaLinux flow. However, I could not find the CMA option in the kernel config menu. According to forum posts, it is located under Device Drivers ---> Generic Driver Options, but I could not find the item there:

(screenshot: config)
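Regardless of where the option appears in the menu, the generated kernel `.config` can be searched directly. The sketch below assumes the mainline Kconfig symbol `CONFIG_CMA_SIZE_MBYTES` (whether this BSP's kernel uses it unchanged is an assumption), and the helper name `cma_size_mbytes` is hypothetical. The path to `.config` depends on the build setup and is left as an argument.

```shell
#!/bin/sh
# Hypothetical helper: print the default CMA size (in MB) configured
# in a kernel .config file, if the symbol is present.
# Usage: cma_size_mbytes <path-to-.config>
cma_size_mbytes() {
    sed -n 's/^CONFIG_CMA_SIZE_MBYTES=//p' "$1"
}
```

A result of `16` would be consistent with the 16384 kB `CmaTotal` reported on the zc706 above. Note that a `cma=` value on the kernel command line takes precedence over this built-in default.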

doonny commented 4 years ago

Problem fixed!