solo-io / unik

The Unikernel & MicroVM Compilation and Deployment Platform
Apache License 2.0

python3.5 unikernel with boto3 #66

Closed andreashappe closed 8 years ago

andreashappe commented 8 years ago

I am not 100% sure whether I'm doing the packaging wrong or whether the behaviour I'm seeing is to be expected. I've created a simple Python program that acts as an HTTP server and uses the boto3 library to access an S3 bucket. This works as expected with python3.5 on my local computer.

I can build and deploy the unikernel on Amazon EC2.

The problem is that I cannot execute the program. The exception (caught by my Python exception handler, as it wasn't displayed in the EC2 image's system output on the AWS console) shows a "No module named multiprocessing" error during boto3 initialization. Does this mean that the Python rumprun unikernel does not support multiple processes/threads and thus does not implement multiprocessing, or have I packaged my kernel wrong? (I'm also not sure where to ask: unik or rumprun.)
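Since the error only showed up once the S3 code path ran, one option is to probe the imports eagerly at startup and print any failures where the console output can pick them up. The following is a minimal sketch, not part of the original project: the helper name probe_imports is hypothetical, and it only assumes the standard importlib module.

```python
import importlib


def probe_imports(names):
    """Try to import each module and return {name: error message} for failures."""
    failures = {}
    for name in names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # report any import-time failure, not just ImportError
            failures[name] = "{}: {}".format(type(exc).__name__, exc)
    return failures


if __name__ == "__main__":
    # Probe the module from the error message before the HTTP server starts.
    for module, error in probe_imports(["multiprocessing"]).items():
        print("cannot import", module, "->", error)
```

Calling this before the server starts would surface a missing module at boot rather than on the first request that needs it.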

A "normal" deployment (without S3) works as expected! Thanks for your work.

ilackarms commented 8 years ago

are you on the current version of master? i believe this should have been fixed in bcc5a0b

did you install dependencies using pip? if you did, did you install them into the local bin/ directory in your project with pip install --install-option="--prefix=<PATH_TO_PROJECT_ROOT>" <MODULE_NAME> ?

andreashappe commented 8 years ago

unik was cloned with 78edde2463be5ec41caaa68ea48d90d868475dd0, which is after bcc5a0b

I installed boto3 this way, yes. The whole python project can now be accessed at https://github.com/andreashappe/unik-python-aws , hope that helps!

ilackarms commented 8 years ago

thank you for sharing the project. i was able to compile it and it worked on the most recent master image built and run simply with:

unik build --path ./unik-python-aws/ --compiler rump-python-virtualbox --provider virtualbox --name vpy --force  && \
unik run --instanceName vpy --imageName vpy

[screenshot, 2016-08-08 at 4:22:46 pm]

and everything appears to be in order

on a separate note, i would not expect Python or Go unikernels running on AWS to work currently, as there is a known bug with rumprun that we're currently working out. Virtualbox should be just fine, however

andreashappe commented 8 years ago

cool.

so I'd propose to keep this issue open until I can get my virtualbox installation running (to verify that)? Do you have any ETA about python working with aws? It actually did work with a simple program -- until I hit that multiprocessing exception.

ilackarms commented 8 years ago

@uvgroovy is working on it currently and hopefully it will be resolved within the week. It's not our fault! AWS changed something on their end that is causing rump to be unhappy

feel free to leave this open, I'd like to understand better what is going on. You may find it beneficial to play with the PYTHONPATH env variable. By default, PYTHONPATH is set to

"PYTHONPATH=/bootpart/lib/python3.5/site-packages/:/bootpart/bin/"

(see rump_scripting.go)

This should work (since your dependencies are in /bootpart/bin) but try playing around with this. You can add multiple directories to the PYTHONPATH by separating them with a colon (:).

All the files in your project directory will live in the unikernel filesystem at /bootpart, hence why /bootpart/bin and /bootpart/lib/python3.5/site-packages/ are in the PYTHONPATH.

You can overwrite the PYTHONPATH by setting it with the --env flag when you launch an instance
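To see which of those directories the interpreter actually ends up with, something like the following could be run at startup inside the image. This is only a sketch: the helper name describe_search_path is made up for illustration, and it assumes nothing about unik beyond PYTHONPATH being a colon-separated list as described above.

```python
import os
import sys


def describe_search_path(pythonpath=None):
    """Return (directory, exists) pairs for each PYTHONPATH entry.

    `pythonpath` defaults to the PYTHONPATH environment variable; entries
    are split on os.pathsep, which is ":" on POSIX systems.
    """
    if pythonpath is None:
        pythonpath = os.environ.get("PYTHONPATH", "")
    entries = [p for p in pythonpath.split(os.pathsep) if p]
    return [(p, os.path.isdir(p)) for p in entries]


if __name__ == "__main__":
    for directory, exists in describe_search_path():
        print(directory, "ok" if exists else "MISSING")
    # sys.path is what the import system really searches.
    print("sys.path:", sys.path)
```

If a directory is reported as missing, or is absent from sys.path, the dependencies installed there will not be found regardless of how they were packaged.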

andreashappe commented 8 years ago

@ilackarms : just tested it again against current master. The error occurs during the S3 initialization routine -- which is only called if a correct aws secret was given. If no secret was given the output looks like yours (i.e. KeyError), if a secret was given I get the multiprocessing error.

My command line is curl http://52.53.219.29:8080/\?secret\=3\&aws_secret_access_key\=xxx\&aws_access_key_id\=xxx\&bucket\=archistar-01\&keys\=newfile.txt

ilackarms commented 8 years ago

after doing a bit of digging, it looks like the _multiprocessing module lives at /lib/python3.5/lib-dynload/_multiprocessing.so. since it's a dynamic library (.so), it can't work in a unikernel, and thus not in rumprun! unikernels by definition do not support dynamically linked libraries.

furthermore, as its name suggests, multiprocessing implies running multiple processes by fork()ing. Unikernels only support single-process execution, so fork() wouldn't work even if the library were statically linked.

In short, it looks like the boto3 lib won't be compatible with rumprun / unikernels in general.
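A rough way to check in advance whether a module is backed by a shared object is to ask the import system where it would load it from. This is a sketch to run on a development machine with the same Python version; the helper names module_origin and is_shared_object are hypothetical, and whether a given module turns out to be a .so depends on how that interpreter was built.

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be loaded from, or None if it is not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    return spec.origin


def is_shared_object(name):
    """True if the module resolves to a dynamically loaded extension file."""
    origin = module_origin(name)
    return bool(origin) and origin.endswith((".so", ".pyd"))


if __name__ == "__main__":
    for name in ("_multiprocessing", "json"):
        print(name, module_origin(name), is_shared_object(name))
```

A module that reports True here is one that would need to be statically linked into the interpreter to be usable in the unikernel.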

andreashappe commented 8 years ago

yes, the multi-processing part is what I originally thought. Can't this be detected during compile time and a warning or failure be produced?

ilackarms commented 8 years ago

if python was a compiled language, that might be easier to do. however, as you can see, this error only occurs when a particular code path is executed. there might be some dynamic code analysis tools available that make this possible. in theory i could write the following code in python

if False:
    import library_that_does_not_exist

and the program would never fail to execute.
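That said, a purely static scan can still list every import statement, including ones in branches that never run, and check whether each resolves. The following is a minimal sketch using the standard ast and importlib modules; the function names are hypothetical. It checks against the interpreter it runs on, not the unikernel image, it misses imports made through importlib or __import__, and to catch a case like boto3 it would also have to be run over the installed dependencies, not just the project's own files.

```python
import ast
import importlib.util


def imported_modules(source):
    """Collect top-level module names from every import statement in `source`,
    including imports in branches that would never execute."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names.add(node.module.split(".")[0])
    return names


def unresolved_imports(source):
    """Return the sorted names of imported modules that cannot be located."""
    missing = set()
    for name in imported_modules(source):
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.add(name)
    return sorted(missing)


example = "if False:\n    import library_that_does_not_exist\nimport json\n"
print(unresolved_imports(example))
```

Such a scan would warn about the dead import above even though the program itself never fails, so its results are hints rather than proof of a problem.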

ilackarms commented 8 years ago

just a note, @andreashappe: @uvgroovy fixed the Go/Python bug on AWS in 43aa244

andreashappe commented 8 years ago

@ilackarms thanks for the heads-up. I still have to switch to another Python library for my test case, but will try again!

andreashappe commented 8 years ago

@ilackarms I wanted to try AWS in combination with Python (with another library this time), but it seems that rump-python-aws is not supported anymore?

ERRO[0000] build failed: [cmd/build.go:94] building image failed: {[client/images.go:55] failed with status 400: [daemon/daemon.go:370] provider aws does not support compiler rump-python-aws; supported compilers: rump-go-xen|rump-nodejs-xen|rump-python-xen|osv-java-xen}

I really would like to test it with another library, so that the issue would offer a partial solution for the original problem (boto3 not working).

ilackarms commented 8 years ago

The name was changed to rump-python-xen since we now support the xen provider, and they use the same compiler