islet-project / islet

An on-device confidential computing platform
Apache License 2.0

Update Docker Images and readme provided as part of confidential-ml example #356

Open ajay-fuji opened 2 months ago

ajay-fuji commented 2 months ago

Hi,

The Docker image provided with the examples/confidential-ml/code_model.md example is outdated relative to the latest code (main branch).

As a result, the steps in code_model.md for running the device on top of the Arm FVP do not work.

Please provide updated images and update the link in the document as well.

Thanks!

ajay-fuji commented 2 months ago

In examples/confidential-ml/code_model.md, in the "How to test with Islet" section, it seems that launching the Arm FVP should come before running the three instances on the host PC.

Once the FVP is started with the tap network, a new network interface is created with the IP 193.168.10.15. Only after that can certifier-service, runtime and model-provider be run with the given commands. The commands from terminal 4 onwards can then be run.

Command to launch terminal 4 can also be added once FVP is running -

Please suggest a possible correction.
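
The ordering described in this comment (FVP first, host services only once the tap interface has its address) could be enforced with a small wait loop. This is a minimal sketch, not part of the Islet scripts: `has_ip` and `wait_for_ip` are hypothetical helper names, it assumes the iproute2 `ip` tool is available on the host, and 193.168.10.15 is simply the address quoted above.

```shell
#!/bin/sh
# Address the tap interface is expected to get (taken from this thread).
TAP_IP="${TAP_IP:-193.168.10.15}"

# Hypothetical helper: reads `ip -4 addr show` style output on stdin and
# succeeds if it contains the exact IPv4 address given as $1.
has_ip() {
    grep -qF "inet $1/"
}

# Hypothetical helper: poll the host's addresses up to $2 times
# (default 60), one second apart, until address $1 shows up.
wait_for_ip() {
    addr="$1"
    tries="${2:-60}"
    while [ "$tries" -gt 0 ]; do
        if ip -4 addr show 2>/dev/null | has_ip "$addr"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

With these defined, `wait_for_ip "$TAP_IP" 120` could be run after starting the FVP and before launching certifier-service, runtime and model-provider with the commands from code_model.md.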

jinbpark commented 2 months ago

First of all, thank you for trying out ISLET! You can find my answers below.

The Docker image provided with the examples/confidential-ml/code_model.md example is outdated relative to the latest code

That's true, and I'm aware of it. As you said, it needs to be updated accordingly. I'll check it out and update it. In the meantime, you can use "How to test with simulated enclave (no actual hardware TEE) on x86_64" (if this no-actual-TEE setup suffices for what you want to do).

Once the FVP is started with the tap network, a new network interface is created with the IP 193.168.10.15.

This also seems to be caused by the "outdated" example code and instructions. I'll check it out as well.

ajay-fuji commented 2 months ago

Thanks @jinbpark for checking this out. We want to run our use case with Islet on the FVP. We tried changing the scripts to match the latest code and running them, but we are still getting some issues with the device running on the FVP.

image

Any temporary patch for this would be helpful until the documentation and Docker image are fixed on your end.

Thanks!

jinbpark commented 2 months ago

We tried changing the scripts to match the latest code and running them, but we are still getting some issues with the device running on the FVP.

Could you describe your changes in more detail? It might help.

ajay-fuji commented 2 months ago

Files changed -

certifier-service
     |- run.sh   # take HOST as an input argument instead of using 0.0.0.0
runtime
     |- init.sh  # updated to the latest code from the main branch
     |- run.sh   # updated to the latest code from the main branch
model-provider
     |- init.sh  # updated to the latest code from the main branch
     |- run.sh   # updated to the latest code from the main branch
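
As an illustration of the certifier-service change listed above, a run.sh could take the bind address as its first argument and fall back to 0.0.0.0 when none is given. This is a minimal sketch that assumes nothing about the real run.sh beyond that one-line description; `resolve_host` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument if given, else 0.0.0.0.
resolve_host() {
    echo "${1:-0.0.0.0}"
}

# Use the script's first argument as HOST, keeping the old default.
HOST="$(resolve_host "$@")"
echo "HOST=${HOST}"
```

Invoked as `./run.sh 193.168.10.15`, this would set HOST to the tap address; invoked with no arguments it keeps the previous 0.0.0.0 behaviour.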

Steps to run example -

ajay-fuji commented 2 months ago

Do we have any idea how to resolve the GLIBC issue?

ajay-fuji commented 2 months ago

Thanks @jinbpark for checking this out. We want to run our use case with Islet on the FVP. We tried changing the scripts to match the latest code and running them, but we are still getting some issues with the device running on the FVP.

image

Any temporary patch for this would be helpful until the documentation and Docker image are fixed on your end.

Thanks!

We could not reproduce this exact error, but there are a few things to note:

PS: When we start the FVP, the machine's internet connection goes down. Has anyone experienced this before? Any idea how to fix it?

jinbpark commented 1 month ago

PS: When we start the FVP, the machine's internet connection goes down. Has anyone experienced this before? Any idea how to fix it?

Could you try commenting out lines 41 and 42 of configure_tap.sh?
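
One way to try this without editing by hand is to prefix the two lines with `#` while keeping a backup. This is only a sketch: `comment_out_lines` is a hypothetical helper, and the line numbers are the ones given in this reply, so they may not match other revisions of configure_tap.sh.

```shell
#!/bin/sh
# Hypothetical helper: comment out lines $2..$3 of file $1.
# The untouched original is kept next to it as "$1.bak".
comment_out_lines() {
    file="$1"
    first="$2"
    last="$3"
    cp "$file" "$file.bak" || return 1
    # Prefix every line in the range with "# ", writing back in place
    # (the redirect keeps the file's existing permissions).
    sed "${first},${last}s/^/# /" "$file.bak" > "$file"
}
```

Usage would be `comment_out_lines configure_tap.sh 41 42`; restoring is a matter of copying configure_tap.sh.bak back.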

Do we have any idea how to resolve the GLIBC issue?

Sorry about the late response. I won't have enough time to dig into this issue until the start of September. I'll look into it after that (maybe in the middle of September?). In the meantime, you can build the TensorFlow Lite library on your own if you really need the TensorFlow capability.

ajay-fuji commented 1 month ago

Hi @jinbpark,

Thanks for suggesting the solution, although we had already found it ourselves. Also, since the glibc issue is not reproducible, you can skip that part for now.

Currently we are not able to run ./init_aarch64.sh and ./run_aarch64.sh; we get the error below - image

Error for run_aarch64.sh -

# ./run_aarch64.sh 193.168.10.15 8125 code 0 -1 -1 193.168.20.10
Mon Sep 18 00:00:00 UTC 2023
ln: /lib/libtensorflowlite.so: File exists
ln: /lib/libtensorflowlite_flex.so: File exists
running as client
load_client_certs_and_key: can't translate der to X509
init_client_ssl: load_client_certs_and_key failed
Can't init client app
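
The two `ln: ... File exists` lines in this log suggest the script creates links that are already present from an earlier run. Assuming these are symbolic links (the actual `ln` invocation and source paths are not shown in this thread), a re-runnable variant could replace an existing link instead of failing. `link_lib` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: create the symlink $2 pointing at $1, replacing
# an existing link of the same name so repeated runs stay quiet.
link_lib() {
    src="$1"
    dst="$2"
    ln -sf "$src" "$dst"
}
```

This only silences the repeated-run messages; it is not claimed to address the certificate error that follows them in the log.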

Here too, the same time-violation error logs can be seen in the certifier-service terminal.
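
Since the certifier-service logs mention a time violation, one thing that could be checked is whether the clocks on the host and on the FVP guest roughly agree. That the clocks are the cause is an assumption, not something confirmed in this thread. The sketch below compares two epoch timestamps; `abs_skew` and `clocks_agree` are hypothetical helpers and the 300-second default tolerance is arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: absolute difference, in seconds, between two
# epoch timestamps.
abs_skew() {
    d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then
        d=$(( 0 - d ))
    fi
    echo "$d"
}

# Hypothetical helper: succeed when the two timestamps differ by at
# most $3 seconds (default 300, an arbitrary tolerance).
clocks_agree() {
    [ "$(abs_skew "$1" "$2")" -le "${3:-300}" ]
}
```

The two inputs can be collected by running `date -u +%s` on the host and on the guest. If a certificate file is available, `openssl x509 -inform DER -in <file> -noout -dates` prints its validity window for comparison with those clocks.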

If you could help us navigate through this, we would be grateful. Thanks!