gilv closed 5 years ago
@gilv That was only added recently without a PR, and I don't think it should have been added.
The `metaspace2020` package is only used by the notebooks for validation and visualization. It's not used at all by the Functions that do the processing in IBM Cloud, so it shouldn't be needed in the runtime.
Is there any way to stop PyWren from trying to include these dependencies? I think this is also related to the issue that @omerb01 raised: https://github.com/metaspace2020/pywren-annotation-pipeline/issues/24 where PyWren tries to include Jupyter.
@LachlanStuart @gilv
I ran some tests and found that PyWren serialises the `metaspace` module because of this line:
https://github.com/metaspace2020/pywren-annotation-pipeline/blob/20e9db6335a229452d599527922a5fa3de68e914/annotation_pipeline/check_results.py#L7
PyWren needs to serialise the `annotation_pipeline` module, which uses `metaspace` internally.
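To see which third-party packages a module would drag along, one option is to scan its source for import statements. The following is only a minimal sketch using the standard `ast` module; the `sample` source is a hypothetical stand-in and not the actual contents of `check_results.py`.

```python
import ast


def top_level_imports(source):
    """Return the top-level package names imported anywhere in `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 marks a relative import (from . import x); skip those.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


# Hypothetical stand-in for a module such as check_results.py.
sample = """
import numpy as np
from metaspace.sm_annotation_utils import SMInstance
from . import utils
"""

print(sorted(top_level_imports(sample)))
```

Because the source is only parsed and never executed, none of the listed packages need to be installed for the scan to work.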
PyWren's map function has an `exclude_modules` parameter that forces it to skip serialising the listed modules. I suggest passing every map call a list of modules read from `config.json`. How does that sound?
@omerb01 Are you suggesting a different `exclude_modules` list for each function, or just one list for all functions? I would prefer one list for all functions for simplicity, if that's possible.
It's a shame that PyWren doesn't have the opposite logic: an `include_modules` list would be shorter than an `exclude_modules` list, and it wouldn't need to be updated every time a new visualization library (or similar) is added to the notebooks.
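Until something like `include_modules` exists, the same effect could be approximated on the project side by deriving the exclude list from a short allow-list. This is a hypothetical sketch: `excludes_from_includes` is not part of PyWren, and how the `detected` module names are collected is left to the caller.

```python
def excludes_from_includes(detected, include_modules):
    """Return every detected module whose top-level package is not allowed."""
    allowed = set(include_modules)
    return sorted(m for m in detected if m.split(".")[0] not in allowed)


# Hypothetical inputs: modules found in the notebooks and pipeline code.
detected = ["annotation_pipeline", "metaspace", "matplotlib", "numpy"]
INCLUDE_MODULES = ["annotation_pipeline"]

exclude_modules = excludes_from_includes(detected, INCLUDE_MODULES)
print(exclude_modules)
```

With this approach only the short allow-list needs maintenance when a new visualization library shows up in the notebooks.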
Regarding where to put it: I would suggest keeping the list as a constant somewhere in the code so that it's automatically synchronized via git. `config.json` isn't checked into git, so people have to update it manually, and they would occasionally hit issues when they forget.
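A single shared list could look roughly like the sketch below. It assumes, as described earlier in this thread, that the executor's `map` accepts an `exclude_modules` keyword; the exact PyWren signature is not verified here. `EXCLUDE_MODULES`, `map_excluding` and `FakeExecutor` are hypothetical names, and the fake executor only exists so the example runs without IBM Cloud.

```python
# Hypothetical shared constant, kept in the repository so git keeps it in sync.
EXCLUDE_MODULES = ["metaspace", "matplotlib"]


def map_excluding(executor, func, iterdata, **kwargs):
    """Call executor.map with the shared exclude list unless one is given."""
    kwargs.setdefault("exclude_modules", EXCLUDE_MODULES)
    return executor.map(func, iterdata, **kwargs)


class FakeExecutor:
    """Stand-in for a PyWren executor; records what it was asked to exclude."""

    def __init__(self):
        self.calls = []

    def map(self, func, iterdata, exclude_modules=None, **kwargs):
        self.calls.append(list(exclude_modules or []))
        return [func(x) for x in iterdata]


executor = FakeExecutor()
results = map_excluding(executor, lambda x: x * 2, [1, 2, 3])
print(results, executor.calls)
```

Routing every map call through one helper means the list is defined once, and a call site can still override it by passing its own `exclude_modules`.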
@LachlanStuart Yes, I suggest just one list for all functions.
The `include_modules` parameter suggestion sounds great; I will raise it on the PyWren side.
@gilv @omerb01 @LachlanStuart I recently added a new Dockerfile that generates a slim image for PyWren. It takes only 307 MB, compared to 1.2 GB for the default image `ibmfunctions/action-python-v3.6`. So if you want to reduce the final image size, I suggest testing that Dockerfile with all the packages you need included.
@omerb01 Just out of curiosity: what was the size of the old image, and what is the size of the current image built with `Dockerfile.slim-python36`?
@JosepSampe 342.12 MB for the old one and 140.48 MB for the new slim one. I took these numbers from my Docker account, and they include the modules for the annotation-pipeline project.
Due to #45, we can close this.
PyWren needs the annotation package in order to run our demo notebooks. So far we use a Docker image that includes
This command installs the annotation package. However, it also installs all kinds of unnecessary packages, such as Matplotlib, which add about 150 MB. We need to find a way to add the annotation package to the runtime without installing Matplotlib.
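One possible direction, sketched here under assumptions, is to install the package without its declared dependencies by using pip's `--no-deps` flag and then add only the dependencies the processing code really needs. The package name `metaspace2020` is taken from this thread; whether the original install command maps onto this is not known, and the helper names are hypothetical.

```python
import subprocess
import sys


def pip_install_no_deps_cmd(package):
    """Build a pip command that installs `package` but none of its dependencies."""
    return [sys.executable, "-m", "pip", "install", "--no-deps", package]


def install_without_deps(package):
    """Run the command; anything the code really needs must be installed separately."""
    subprocess.check_call(pip_install_no_deps_cmd(package))


# Only build and show the command here; call install_without_deps() to run it.
cmd = pip_install_no_deps_cmd("metaspace2020")
print(" ".join(cmd))
```

The trade-off is that skipped dependencies are no longer tracked automatically, so any import the functions actually rely on has to be listed explicitly when building the image.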