mc2-project / secure-xgboost

Secure collaborative training and inference for XGBoost.
https://mc2-project.github.io/secure-xgboost/
Apache License 2.0

Some questions about the use of enclave #161

Open DylanWangWQF opened 3 years ago

DylanWangWQF commented 3 years ago

Hi, everyone! I have a question about the operations inside the enclave.

As far as I know, OpenEnclave currently does not support fstream inside the enclave. So how should we load file contents inside the enclave (C/C++)? And what about other functions that OpenEnclave does not support? Are there any docs that cover these?

BTW, are there any docs on the use of the oblivious primitives, such as oassign() and osort(), in secure-XGBoost?

Thank you in advance!

chester-leung commented 3 years ago

Hi @DylanWangWQF, thanks for your interest in our project!

Regarding loading files inside the enclave, we load the host file system module when initializing an enclave, which enables us to call functions like fopen and fread within the enclave.
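As a rough illustration of the fopen/fread route, here is a minimal sketch that reads a whole file with C stdio instead of fstream. The helper name `read_file` is hypothetical and is not taken from the project. Inside an enclave, this assumes that oe_load_module_host_file_system has already been called and the host file system mounted (the Open Enclave I/O documentation describes a `mount` call for this); those SDK calls are left out so that the snippet also compiles on a plain host.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: reads an entire file into `out` using only C stdio.
// Inside an enclave this assumes the host file system module is already
// loaded and mounted; on a normal host it works as-is.
// Returns true on success, false if the file cannot be opened or read.
bool read_file(const std::string& path, std::vector<unsigned char>& out) {
    out.clear();
    std::FILE* f = std::fopen(path.c_str(), "rb");
    if (f == nullptr) {
        return false;
    }
    unsigned char buf[4096];
    std::size_t n = 0;
    // fread returns the number of bytes read; 0 means EOF or an error.
    while ((n = std::fread(buf, 1, sizeof(buf), f)) > 0) {
        out.insert(out.end(), buf, buf + n);
    }
    const bool ok = (std::ferror(f) == 0);
    std::fclose(f);
    return ok;
}
```

Reading in fixed-size chunks avoids relying on fseek/ftell to learn the file size up front, which keeps the set of required calls small.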

For some more information on the oblivious primitives we use, please see section 6 of our paper.
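To give a feel for what oblivious primitives do in general, here is a small portable sketch. It is not the project's oassign() or osort(); the names `oselect`, `ocompare_exchange`, and `osort_sketch` are hypothetical, and the sort is a simple odd-even transposition sort chosen for brevity. The idea is that the sequence of memory locations touched depends only on the input size, not on the data values.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns t if pred is true, otherwise f,
// using a bit mask instead of a data-dependent branch.
inline std::uint32_t oselect(bool pred, std::uint32_t t, std::uint32_t f) {
    // mask is all ones when pred is true, all zeros otherwise.
    const std::uint32_t mask =
        static_cast<std::uint32_t>(0) - static_cast<std::uint32_t>(pred);
    return (t & mask) | (f & ~mask);
}

// Hypothetical helper: leaves the smaller value in a and the larger in b.
// Both locations are always read and written, whether or not a swap occurs.
inline void ocompare_exchange(std::uint32_t& a, std::uint32_t& b) {
    const bool swap = a > b;
    const std::uint32_t lo = oselect(swap, b, a);
    const std::uint32_t hi = oselect(swap, a, b);
    a = lo;
    b = hi;
}

// Odd-even transposition sort: the pairs of indices compared in each
// round depend only on v.size(), never on the values stored in v.
void osort_sketch(std::vector<std::uint32_t>& v) {
    const std::size_t n = v.size();
    for (std::size_t round = 0; round < n; ++round) {
        for (std::size_t i = round % 2; i + 1 < n; i += 2) {
            ocompare_exchange(v[i], v[i + 1]);
        }
    }
}
```

Note that a compiler is free to turn source-level branchless code back into branches, so this only illustrates the access pattern; it makes no constant-time guarantee.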

DylanWangWQF commented 3 years ago

Many thanks for your help! @chester-leung I will study this and figure out how it works.

DylanWangWQF commented 3 years ago

Hi @chester-leung, sorry for reopening this issue; I have another question.

Now, I want to link the NTL and GMP libraries inside the enclave. Could you tell me how you use other libraries inside the enclave in this project?

As for the AVX2 instructions in this project (#include <intrinsics/immintrin.h>), can we use them directly inside the enclave? I couldn't find any docs on this, and I'm having trouble understanding the mechanism just by reading the code.

Thank you in advance!

chester-leung commented 3 years ago

@DylanWangWQF in general, if you want to link other libraries into the enclave, you can link them in CMakeLists.txt. See here for an example of how we linked an external library (spdlog) for usage within the enclave. Some libraries may not be fully compatible with the enclave (e.g., some syscalls aren't supported), so you may have to go in and remove or modify the affected functions if you don't need them.

My colleague @podcastinator will answer your question about AVX2 instructions.

What is your use case here? Maybe we can work with you to build something if you're willing to contribute back to the open source.

podcastinator commented 3 years ago

Hi @DylanWangWQF, yes, you can use the AVX2 instructions directly inside the enclave. Please see https://github.com/mc2-project/secure-xgboost/blob/master/include/enclave/obl_primitives.h for examples. (You will need to set the CMake flag USE_AVX2 to enable the use of AVX2 instructions inside the enclave; this causes the enclave target to be compiled with the requisite -mavx2 compiler flag.)

DylanWangWQF commented 3 years ago

@podcastinator Many thanks for your reply! I will study the use of AVX2 in this project.

@chester-leung Currently, I'm working on the combination of HElib and OE. As you said, I have to link other libraries and consider the compatibility inside the enclave.

ryanleh commented 3 years ago

@DylanWangWQF Out of curiosity, if you can tell, why are you using homomorphic encryption inside of an enclave? I haven't seen such a use-case before.

DylanWangWQF commented 3 years ago

@ryanleh I'd like to do some pre-computation based on HE outside the enclave, then decrypt the result inside the enclave.

May I ask how I can use oe_load_module_host_file_system in my sample? Where is the source EDL API?

podcastinator commented 3 years ago

@DylanWangWQF See here for more info on oe_load_module_host_file_system: https://github.com/openenclave/openenclave/blob/master/docs/UsingTheIOSubsystem.md. This function is provided by the OpenEnclave SDK and is needed for loading the requisite IO modules.

DylanWangWQF commented 3 years ago

Got it! Thank you! @podcastinator, this should say "stack pages": https://github.com/mc2-project/secure-xgboost/blob/96eb1a9a985135566b720da220a1fb20ef978ee4/CMakeLists.txt#L30

DylanWangWQF commented 3 years ago

@chester-leung I found that the heap size in the settings is around 390 MB: https://github.com/mc2-project/secure-xgboost/blob/96eb1a9a985135566b720da220a1fb20ef978ee4/CMakeLists.txt#L27

This is much larger than what existing SGX1 hardware can provide, and in your paper the enclave page cache is 112 MB. So does this setting actually work for large programs in the project?

BTW, I'm wondering how the enclave processes large datasets, e.g., ones much larger than 128 MB?

DylanWangWQF commented 3 years ago

I'm asking this question because my enclave image is too big after compiling:

drwxrwxr-x 3 dylan dylan 4.0K Jun 30 11:24 CMakeFiles/
-rw-rw-r-- 1 dylan dylan 1.2K Jun 30 11:24 cmake_install.cmake
-rwxrwxr-x 1 dylan dylan  66M Jun 30 11:24 enclave*
-rw-rw-r-- 1 dylan dylan  66M Jun 30 11:24 enclave.signed
-rw-rw-r-- 1 dylan dylan 3.3K Jun 30 11:24 heenclave_args.h
-rw-rw-r-- 1 dylan dylan 332K Jun 30 11:24 heenclave_t.c
-rw-rw-r-- 1 dylan dylan  13K Jun 30 11:24 heenclave_t.h
-rw-rw-r-- 1 dylan dylan 7.3K Jun 30 11:24 Makefile

podcastinator commented 3 years ago

@DylanWangWQF the 112MB limit only applies to the physical memory (i.e., the portion of the RAM dedicated to the EPC). This does not restrict the amount of virtual memory available to the enclave application, and the selected heap size determines the amount of virtual memory that the app has at its disposal.
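
For reference, the arithmetic behind a heap setting expressed in pages can be sketched as follows. This assumes 4 KiB pages and a hypothetical setting of 100000 heap pages, which happens to reproduce the "around 390 MB" figure mentioned above; the actual value at the linked CMakeLists.txt line is not reproduced here.

```cpp
#include <cstdint>

// Assumption: enclave pages are 4 KiB each.
constexpr std::uint64_t kPageSizeBytes = 4096;

// Converts a heap size given in pages to bytes.
constexpr std::uint64_t heap_bytes(std::uint64_t num_heap_pages) {
    return num_heap_pages * kPageSizeBytes;
}

// Converts bytes to MiB (1 MiB = 1024 * 1024 bytes).
constexpr double bytes_to_mib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

// Hypothetical page count, chosen only to match the ~390 MB in the thread.
constexpr std::uint64_t kAssumedHeapPages = 100000;

static_assert(heap_bytes(kAssumedHeapPages) == 409600000ULL,
              "100000 pages of 4096 bytes is 409,600,000 bytes");
```

Under these assumptions, `bytes_to_mib(heap_bytes(kAssumedHeapPages))` evaluates to 390.625, which is the enclave's virtual heap size and can exceed the 112 MB of physical EPC.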