mofanv / darknetz

runs several layers of a deep learning model in TrustZone
MIT License

RPi3 no enough memory error #2

Closed PeterVanNostrand closed 4 years ago

PeterVanNostrand commented 5 years ago

Hello @mofanv, I have been attempting to run darknetp on a raspberry pi with OP-TEE but keep encountering this error:

tee_ta_open_session:588 init session failed 0xffff000c

This seems to indicate that OP-TEE could not allocate enough memory for the TA. Judging from the documentation, this seems to be because I need to "adjust TA’s MMU L1 table". I made sure that I followed the instructions from your README, but didn't see anything about modifying any optee_os code. Did you find a solution for this? I raised an issue with OP-TEE hoping to get guidelines on how to allocate more memory for the TA. Any insights you could provide would be appreciated.
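For readers decoding the hex value in the log line above: the sketch below maps a few GlobalPlatform return codes to their names. The table is a small, non-exhaustive subset, and the helper name `describe` is made up for illustration. OP-TEE's internal `TEE_ERROR_*` constants use the same numeric values as the client-side `TEEC_ERROR_*` names shown here.

```python
# Subset of GlobalPlatform TEE Client API return codes (not exhaustive).
TEEC_RETURN_CODES = {
    0x00000000: "TEEC_SUCCESS",
    0xFFFF0000: "TEEC_ERROR_GENERIC",
    0xFFFF0006: "TEEC_ERROR_BAD_PARAMETERS",
    0xFFFF0008: "TEEC_ERROR_ITEM_NOT_FOUND",
    0xFFFF000C: "TEEC_ERROR_OUT_OF_MEMORY",
    0xFFFF3024: "TEEC_ERROR_TARGET_DEAD",
}

# Return-code origins defined by the same API.
TEEC_ORIGINS = {
    1: "TEEC_ORIGIN_API",
    2: "TEEC_ORIGIN_COMMS",
    3: "TEEC_ORIGIN_TEE",
    4: "TEEC_ORIGIN_TRUSTED_APP",
}


def describe(code, origin=None):
    """Hypothetical helper: turn a return code (and optional origin) into text."""
    name = TEEC_RETURN_CODES.get(code, "unknown code")
    text = f"0x{code:08x} = {name}"
    if origin is not None:
        text += f" (origin {TEEC_ORIGINS.get(origin, 'unknown origin')})"
    return text


if __name__ == "__main__":
    # The value reported by tee_ta_open_session in this issue.
    print(describe(0xFFFF000C))
```

So `0xffff000c` is the out-of-memory code, which matches the reading given above.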

mofanv commented 5 years ago

Hi @PeterVanNostrand ,

darknetp requires 10 MB of TA memory. Could you please tell me which model of RPi3 you are using? The RPi3 Model B has a maximum of 16 MiB of TZ memory, so you should not need to change the MMU table.

If your board doesn't have enough TZ memory, you can try to reduce the required TA memory of darknetp by doing:

  1. open darknetp/ta/include/user_ta_header_defines.h

  2. reduce the sizes set by these two defines:
     #define TA_STACK_SIZE (1 * 1024 * 1024)
     #define TA_DATA_SIZE (10 * 1024 * 1024)

  3. recompile darknetp and run it again.
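As a rough sanity check on the numbers above, the sketch below adds the two sizes and compares the sum with the 16 MiB of TZ memory mentioned for the RPi3 Model B. It is only a lower bound: the TA's code and OP-TEE's own bookkeeping are ignored, and the helper names are invented for this example. Fitting under the limit is necessary but, as the rest of this thread shows, not sufficient.

```python
MIB = 1024 * 1024

# Values quoted above from darknetp/ta/include/user_ta_header_defines.h.
TA_STACK_SIZE = 1 * MIB
TA_DATA_SIZE = 10 * MIB


def ta_footprint(stack_size, data_size):
    """Lower bound on TA memory: stack plus heap only (code size is ignored)."""
    return stack_size + data_size


def fits(stack_size, data_size, ta_ram_size):
    """True if stack plus heap does not exceed the available TA RAM."""
    return ta_footprint(stack_size, data_size) <= ta_ram_size


if __name__ == "__main__":
    total = ta_footprint(TA_STACK_SIZE, TA_DATA_SIZE)
    print(f"stack + data = {total / MIB:.1f} MiB")
    print("fits in 16 MiB:", fits(TA_STACK_SIZE, TA_DATA_SIZE, 16 * MIB))
```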

Let me know if this problem still happens.

PeterVanNostrand commented 5 years ago

@mofanv Thanks for your reply! I am indeed using the Raspberry Pi 3B. I see what you're referring to regarding the 16 MiB of TZ RAM from here; however, I haven't been able to get the darknetp TA running without making changes. Yesterday I dropped the TA_DATA_SIZE and TA_STACK_SIZE to be ~4MiB total and was able to run the TA, but I expect that this will be insufficient memory for some models. What values are you using for PGT_CACHE_SIZE and CFG_NUM_THREADS? For some reason the TA doesn't seem to have access to all 16 MiB of that memory, so I wonder whether the memory is being split up among the threads or whether you made any configuration changes. Thanks again for your help.

mofanv commented 5 years ago

For the RPi3, I didn't change the config, but I did change some settings when I was testing on QEMU. Please try using this config for QEMU.

I'm away until next week. Once I'm back I can access my computer and check the exact values of PGT_CACHE_SIZE etc.

PeterVanNostrand commented 5 years ago

Thanks for your help. I'll continue to tweak my configuration and see what I can get running. The issue I opened with OP-TEE indicated that CFG_TEE_RAM_VA_SIZE might also be important so I'll be interested to hear what you turn up next week. Have a good weekend!

mofanv commented 5 years ago

Hi @PeterVanNostrand, sorry for the late reply. Are you still encountering the memory problem?

My configuration files are:

  1. For optee_os/core/arch/arm/conf.mk:

    include core/arch/arm/cpu/cortex-armv8-0.mk

    $(call force,CFG_TEE_CORE_NB_CORE,4)
    $(call force,CFG_8250_UART,y)
    $(call force,CFG_GENERIC_BOOT,y)
    $(call force,CFG_PM_STUBS,y)
    $(call force,CFG_SECURE_TIME_SOURCE_CNTPCT,y)
    $(call force,CFG_WITH_ARM_TRUSTED_FW,y)

    ta-targets = ta_arm32

    ifeq ($(CFG_ARM64_core),y)
    $(call force,CFG_WITH_LPAE,y)
    ta-targets += ta_arm64
    else
    $(call force,CFG_ARM32_core,y)
    endif

    CFG_NUM_THREADS ?= 4
    CFG_CRYPTO_WITH_CE ?= n
    CFG_WITH_STACK_CANARIES ?= y

    CFG_TEE_CORE_EMBED_INTERNAL_TESTS ?= y
    CFG_WITH_STACK_CANARIES ?= y
    CFG_WITH_STATS ?= y

    arm32-platform-cflags += -Wno-error=cast-align
    arm64-platform-cflags += -Wno-error=cast-align

    $(call force,CFG_CRYPTO_SHA256_ARM32_CE,n)
    $(call force,CFG_CRYPTO_SHA256_ARM64_CE,n)
    $(call force,CFG_CRYPTO_SHA1_ARM32_CE,n)
    $(call force,CFG_CRYPTO_SHA1_ARM64_CE,n)
    $(call force,CFG_CRYPTO_AES_ARM64_CE,n)


  2. For optee_os/core/arch/arm/platform_config.h:

    #ifndef PLATFORM_CONFIG_H
    #define PLATFORM_CONFIG_H

    /* Make stacks aligned to data cache line length */
    #define STACK_ALIGNMENT 64

    #ifdef ARM64
    #ifdef CFG_WITH_PAGER
    #error "Pager not supported for ARM64"
    #endif
    #endif /* ARM64 */

    /* 16550 UART */
    #define CONSOLE_UART_BASE 0x3f215040 /* UART0 */
    #define CONSOLE_BAUDRATE 115200
    #define CONSOLE_UART_CLK_IN_HZ 19200000

    /* ... */

    #define DRAM0_BASE 0x00000000
    #define DRAM0_SIZE 0x40000000

    /* Below ARM-TF */
    #define TEE_SHMEM_START (0x08000000)
    #define TEE_SHMEM_SIZE (4 * 1024 * 1024)

    #define TZDRAM_BASE (TEE_SHMEM_START + TEE_SHMEM_SIZE)
    #define TZDRAM_SIZE (32 * 1024 * 1024)

    #define TEE_RAM_VA_SIZE (4 * 1024 * 1024)
    #define TEE_LOAD_ADDR (TZDRAM_BASE + 0x20000)
    #define TEE_RAM_PH_SIZE TEE_RAM_VA_SIZE
    #define TEE_RAM_START TZDRAM_BASE

    #define TA_RAM_START ROUNDUP((TZDRAM_BASE + TEE_RAM_VA_SIZE), \
                                 CORE_MMU_DEVICE_SIZE)
    #define TA_RAM_SIZE (16 * 1024 * 1024)

    #endif /* PLATFORM_CONFIG_H */
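To make the layout implied by these defines easier to see, here is a small sketch that evaluates them and prints the resulting addresses. `CORE_MMU_DEVICE_SIZE` is not given in the thread; 2 MiB is assumed here, and for these particular values any power-of-two alignment up to 8 MiB gives the same `TA_RAM_START`. The end-address names (`TA_RAM_END`, `TZDRAM_END`) and the `memory_map` helper are introduced only for illustration.

```python
MIB = 1024 * 1024

# ASSUMPTION: not stated in the thread; 2 MiB is a typical LPAE block size.
CORE_MMU_DEVICE_SIZE = 2 * MIB


def roundup(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def memory_map():
    """Evaluate the platform_config.h defines quoted above."""
    tee_shmem_start = 0x08000000
    tee_shmem_size = 4 * MIB
    tzdram_base = tee_shmem_start + tee_shmem_size
    tzdram_size = 32 * MIB
    tee_ram_va_size = 4 * MIB
    ta_ram_start = roundup(tzdram_base + tee_ram_va_size, CORE_MMU_DEVICE_SIZE)
    ta_ram_size = 16 * MIB
    return {
        "TEE_SHMEM_START": tee_shmem_start,
        "TZDRAM_BASE": tzdram_base,
        "TEE_LOAD_ADDR": tzdram_base + 0x20000,
        "TA_RAM_START": ta_ram_start,
        "TA_RAM_END": ta_ram_start + ta_ram_size,   # illustrative name
        "TZDRAM_END": tzdram_base + tzdram_size,    # illustrative name
    }


if __name__ == "__main__":
    for name, addr in memory_map().items():
        print(f"{name:16s} 0x{addr:08x}")
```

Under these assumptions the TA RAM region ends 12 MiB before the end of TZDRAM, so the 32 MiB TZDRAM_SIZE is not fully used by the 4 MiB TEE RAM plus 16 MiB TA RAM.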



Hope this helps. Please let me know if the error still happens.

xiaxinkai commented 5 years ago

@PeterVanNostrand Did you solve this issue? I ran into the same problem.

"Yesterday I dropped the TA_DATA_SIZE and TA_STACK_SIZE to be ~4MiB total and was able to run the TA,"

May I know the values you used for TA_DATA_SIZE and TA_STACK_SIZE?

Thank you!

PeterVanNostrand commented 5 years ago

@xiaxinkai I was able to get it running with up to 7MB of TA memory total, I believe it was 500KB of stack and 6.5MB of heap. Anything larger caused a TEE_TARGET_DEAD error from OP-TEE.

It's probably possible to extend this by reconfiguring the page table assignments in OP-TEE, but I was unable to figure this out. If you're interested in trying, you can start with my issue on the OP-TEE repo.

xiaxinkai commented 5 years ago

@PeterVanNostrand Thank you for your information, I will try it.

xiaxinkai commented 5 years ago

@PeterVanNostrand @mofanv I'm very happy to share that I solved this issue. I can now use up to 60M for the TA in TrustZone.

I use Raspberry Pi 3B.

My code versions:
optee_client version: 3.5.0
optee_os version: 3.5.0 (v3.6.0 needs python3 for compiling, so I cannot use v3.6.0 now)
optee_examples version: latest
darknetp version: latest


My changes are as follows:

(1) optee_os\mk\config.mk
CFG_TEE_TA_LOG_LEVEL ?= 4
CFG_TEE_TA_MALLOC_DEBUG ?= y

(2) optee_os\core\arch\arm\plat-rpi3\conf.mk
CFG_TZDRAM_SIZE ?= 0x04000000
CFG_TEE_RAM_VA_SIZE ?= 0x00200000

(3) optee_os\core\arch\arm\include\mm\pgt_cache.h
#ifdef CFG_WITH_PAGER
#if CFG_NUM_THREADS < 2
#define PGT_CACHE_SIZE  4
#else
#define PGT_CACHE_SIZE  ROUNDUP(CFG_NUM_THREADS * 2, PGT_NUM_PGT_PER_PAGE)
#endif
#else
#define PGT_CACHE_SIZE  32
#endif /*CFG_WITH_PAGER*/

(4) optee_examples\darknetp\ta\include\user_ta_header_defines.h
#define TA_STACK_SIZE           (1 * 1024 * 1024)
#define TA_DATA_SIZE            (60 * 1024 * 1024)
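One possible way to read these four changes together is sketched below. It rests on two assumptions that are not confirmed anywhere in this thread: that each cached translation table maps 2 MiB (512 entries of 4 KiB pages, as with a 4 KiB granule under LPAE), and that the memory left for TAs is roughly TZDRAM minus TEE RAM. The helper names are invented for the example.

```python
MIB = 1024 * 1024

# ASSUMPTION: one translation table maps 512 x 4 KiB pages = 2 MiB.
TABLE_SPAN = 2 * MIB

# Values from the changes listed above.
CFG_TZDRAM_SIZE = 0x04000000
CFG_TEE_RAM_VA_SIZE = 0x00200000
PGT_CACHE_SIZE = 32
TA_STACK_SIZE = 1 * MIB
TA_DATA_SIZE = 60 * MIB


def tables_needed(size, span=TABLE_SPAN):
    """Tables required to map `size` bytes, rounding up (lower bound)."""
    return -(-size // span)


def summary():
    ta_total = TA_STACK_SIZE + TA_DATA_SIZE
    return {
        "tzdram_mib": CFG_TZDRAM_SIZE // MIB,
        # ASSUMPTION: TA RAM is what remains after TEE RAM.
        "ta_ram_upper_bound_mib": (CFG_TZDRAM_SIZE - CFG_TEE_RAM_VA_SIZE) // MIB,
        "ta_total_mib": ta_total // MIB,
        "tables_needed": tables_needed(ta_total),
        "cache_coverage_mib": PGT_CACHE_SIZE * TABLE_SPAN // MIB,
    }


if __name__ == "__main__":
    for key, value in summary().items():
        print(f"{key}: {value}")
```

Under those assumptions, a 61 MiB stack-plus-heap footprint needs at least 31 tables, which a cache of 32 can cover, and it fits below the 62 MiB left after TEE RAM. The TA's code also needs mappings, so this is a lower bound rather than an exact count.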

My test log follows:

@nutshell:/data/tz_datasets # pwd
/data/tz_datasets
@nutshell:/data/tz_datasets # ls -al
total 48
drwxrwxrwx  5 0    1007 4096 1970-01-01 00:27 .
drwxrwxrwx 22 1000 1000 4096 1970-01-01 00:00 ..
drwxrwxrwx  2 0    1007 4096 1970-01-01 00:27 cfg
drwxrwxrwx  3 0    1007 4096 1970-01-01 00:27 data
drwxrwxrwx  3 0    1007 4096 1970-01-01 00:27 models
@nutshell:/data/tz_datasets # optee_example_hello_world
D/TA:  TA_CreateEntryPoint:39 has been called
D/TA:  TA_OpenSessionEntryPoint:68 has been called
I/TA: Hello World!
Invoking TA to increment 42
D/TA:  inc_value:105 has been called
D/TA:  inc_value:110 Memory Test Start
D/TA:  inc_value:113 Memory Test 59M Start
D/TA:  inc_value:119 Memory Test 59M Success
D/TA:  inc_value:121 Memory Test 59.5M Start
D/TA:  inc_value:127 Memory Test 59.5M Success
D/TA:  inc_value:129 Memory Test 59.75M Start
D/TA:  inc_value:135 Memory Test 59.75M Success
D/TA:  inc_value:137 Memory Test 60M Start
optee_example_hello_world: TEEC_InvokeCommand failed with code 0xffff000c origin 0x4
I/TA: Goodbye!
D/TA:  TA_DestroyEntryPoint:50 has been called
1|@nutshell:/data/tz_datasets # optee_example_darknetp classifier predict -pp 4 cfg/mnist.dataset cfg/mnist_lenet.cfg models/mnist/mnist_lenet.weights data/mnist/images/t_00007_c3.png
Prepare session with the TA
D/TA:  TA_CreateEntryPoint:28 has been called
D/TA:  TA_OpenSessionEntryPoint:47 has been called
I/TA: I'm Vincent, from secure world!
Begin darknet
layer     filters    size              input                output
    0 conv      6  5 x 5 / 1    28 x  28 x   3   ->    28 x  28 x   6  0.001 BFLOPs
    1 max          2 x 2 / 2    28 x  28 x   6   ->    14 x  14 x   6
    2 conv      6  5 x 5 / 1    14 x  14 x   6   ->    14 x  14 x   6  0.000 BFLOPs
    3 max          2 x 2 / 2    14 x  14 x   6   ->     7 x   7 x   6
    4 connected_TA                          294  ->   120
    5 dropout_TA    p = 0.80                120  ->   120
    6 connected_TA                          120  ->    84
    7 dropout_TA    p = 0.80                 84  ->    84
    8 connected_TA                           84  ->    10
    9 softmax_TA                                       10
   10 cost_TA                                          10
Loading weights from models/mnist/mnist_lenet.weights...Done!
data/mnist/images/t_00007_c3.png: Predicted in 0.085040 seconds.
I/TA: Goodbye!
D/TA:  TA_DestroyEntryPoint:35 has been called

100.00%: 3
 0.00%: 1
 0.00%: 2
 0.00%: 0
 0.00%: 4
user CPU start: 0.099787; end: 0.099787
kernel CPU start: 3.571597; end: 3.571883
Max: 2828  kilobytes
vmsize:545460850712; vmrss:545460849420; vmdata:545460847468; vmstk:545460846724; vmexe:488; vmlib:2340
@nutshell:/data/tz_datasets # 

The code and log are attached. Enjoy! optee_truszone_64M_20190910_xiaxinkai.zip

mofanv commented 5 years ago

@xiaxinkai Sorry for the very late response, and many thanks for your work!