grycap / scar

Serverless Container-aware ARchitectures (e.g. Docker in AWS Lambda)
https://scar.readthedocs.io/en/latest/
Apache License 2.0

Access EFS storage from within container #360

Open WinstonN opened 4 years ago

WinstonN commented 4 years ago

Hello

With EFS now available in Lambda, I want to use this storage.

I have set up EFS, and inside the Lambda I can see it. To verify this, I just import the os module and print a listdir:

# Copyright (C) GRyCAP - I3M - UPV
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Module with handler used in the lambda function."""

import faassupervisor.supervisor as supervisor
import os

def lambda_handler(event, context):
    """Launches the supervisor and returns its output."""
    print(os.listdir('/mnt/'))  # verify the EFS mount is visible
    return supervisor.main(event=event, context=context)

Inside the logs I can see my "efs" storage

2020-07-14T12:03:05.757+12:00 | ['efs']

But when I send the command df -h as documented here: https://scar.readthedocs.io/en/latest/advanced_usage.html#executing-custom-commands-and-arguments

I don't see this storage, nor anything inside /mnt:

❯❯❯ python3 scar/scarcli.py run -n cypress-lambda-ubuntu df -h
Request Id: 4a5e162d-e817-4e83-a584-e812a0030a67
Log Group Name: /aws/lambda/cypress-lambda-ubuntu
Log Stream Name: 2020/07/14/[$LATEST]9918805ef7cf4ce899e4330f2fae57ba
Filesystem      Size  Used Avail Use% Mounted on
/dev/root       526M  509M  5.9M  99% /
/dev/vdb        1.5G   20M  1.4G   2% /dev

❯❯❯ python3 scar/scarcli.py run -n cypress-lambda-ubuntu ls -la /mnt/
Request Id: bbb75834-b518-483c-b71d-de6c75eeb5c4
Log Group Name: /aws/lambda/cypress-lambda-ubuntu
Log Stream Name: 2020/07/14/[$LATEST]9538f62f4a96409caa915faeaac26f84
total 8
drwxr-xr-x  2 sbx_user1051 sbx_user1051 4096 May 26 13:40 .
drwxrwxr-x 21 sbx_user1051 sbx_user1051 4096 Jul 14 00:15 ..

Can you point me in the right direction please? Thanks!

gmolto commented 4 years ago

Hi @WinstonN. Thank you for your interest in SCAR. We are very happy to see that you are pushing this proof of concept ahead. Unfortunately, we have not been able to allocate time to look into this so far. We would be delighted if you keep us posted on your findings, and very happy if you contribute your developments to SCAR.

WinstonN commented 4 years ago

Hi @gmolto

I can make a PR, no problem, but here's what I see. In this repo: https://github.com/grycap/faas-supervisor, in the extra folder, there is a udocker.zip. I replaced this in my fork and added /mnt to

sysdirs_list = (
    "/dev", "/proc", "/sys", "/etc/resolv.conf", "/etc/host.conf", "/lib/modules",
)

but it doesn't seem to be the right place. Can you simply tell me which udocker.py file you use (the one that ends up in /opt/udocker/udocker.py in the lambda)? If you can just tell me that, I can fix it and set up a PR.

Thanks!

WinstonN commented 4 years ago

@gmolto I saw that there is a faas layer created. I removed it and recreated it, but I still don't see my changes. Could you please let me know, or find out from someone on your team, which udocker.py file is used?

Using latest supervisor release: '1.2.3'.
Creating lambda layer with 'faas-supervisor' version '1.2.3'.
Creating function package.
..

Is it

  1. https://github.com/grycap/faas-supervisor/blob/master/extra/udocker.zip (inside the zip)
  2. https://download.ncg.ingrid.pt/webdav/udocker/udocker-1.1.3.tar.gz (url inside the udocker.py above)
  3. Or some other location entirely?

I'm thinking it's number 2 above?
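
One way to answer this locally is to download both candidate zips and compare the udocker.py they contain. A minimal sketch (the function name and usage are illustrative, not part of SCAR):

```python
import hashlib
import zipfile

def inspect_layer_zip(path):
    """List udocker-related members of a layer/udocker zip.

    Returns a dict mapping each member name containing 'udocker' to the
    SHA-256 of its contents, so two zips can be compared quickly.
    """
    fingerprints = {}
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            if "udocker" in name:
                fingerprints[name] = hashlib.sha256(zf.read(name)).hexdigest()
    return fingerprints
```

Comparing the hashes from the layer actually deployed to the function against those from extra/udocker.zip tells you immediately whether they ship the same file.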

gmolto commented 4 years ago

Ping @srisco

WinstonN commented 4 years ago

Hmm, I downloaded the layer, edited it locally to include the /mnt dir, uploaded it, and created a revision. I then updated my Lambda function to reference the new layer version, both in the function configuration and in function_config.yaml.

When I run the function I get the error

Error in function response: {'errorMessage': "Unable to import module 'lambda-name': No module named 'faassupervisor'", 'errorType': 'Runtime.ImportModuleError'}
'headers'
srisco commented 4 years ago

Hi @WinstonN,

as you have noticed, faas-supervisor is the code executed in the Lambda function in order to run the container. The problem is that the EFS mount in Lambda is not automatically mounted inside the container, so you cannot see it there. Your approach of editing the supervisor's layer should work; please let me know which files you have edited (you could open a PR in faas-supervisor) and I'll try to help you.

For a PR in SCAR, the idea is to provide an option in the function configuration files to create and mount EFS file systems.
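
For illustration only, such an option might look something like this in a function configuration file (the efs key and all its fields are hypothetical, not an existing SCAR feature):

```yaml
functions:
  aws:
    - lambda:
        name: cypress-lambda-ubuntu
        # Hypothetical EFS section: none of these keys exist in SCAR today
        efs:
          - file_system_id: fs-12345678        # existing EFS file system
            access_point: fsap-0abc123def456   # access point to mount through
            mount_path: /mnt/efs               # path exposed inside the container
```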

Thank you for your work!

WinstonN commented 4 years ago

@srisco @gmolto

I got it! I'll try to make this configurable via env vars, with a fallback to the default.

diff --git a/faassupervisor/faas/aws_lambda/udocker.py b/faassupervisor/faas/aws_lambda/udocker.py
index ba7326f..2ef10f1 100644
--- a/faassupervisor/faas/aws_lambda/udocker.py
+++ b/faassupervisor/faas/aws_lambda/udocker.py
@@ -123,7 +123,7 @@ def _create_command(self):
     def _add_container_volumes(self):
         self.cont_cmd.extend(["-v", SysUtils.get_env_var("TMP_INPUT_DIR")])
         self.cont_cmd.extend(["-v", SysUtils.get_env_var("TMP_OUTPUT_DIR")])
-        self.cont_cmd.extend(["-v", "/dev", "-v", "/proc", "-v", "/etc/hosts", "--nosysdirs"])
+        self.cont_cmd.extend(["-v", "/dev", "-v", "/mnt", "-v", "/proc", "-v", "/etc/hosts", "--nosysdirs"])
         if SysUtils.is_var_in_env('EXTRA_PAYLOAD'):
             self.cont_cmd.extend(["-v", self.lambda_instance.PERMANENT_FOLDER])
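
Making that mount list configurable via an env var, as mentioned above, could be sketched like this (the CONTAINER_VOLUMES variable name is an assumption, not an existing faas-supervisor setting):

```python
import os

# Default directories exposed to the container; /mnt covers EFS mounts.
DEFAULT_VOLUMES = ["/dev", "/mnt", "/proc", "/etc/hosts"]

def get_container_volumes():
    """Build the udocker -v arguments, honouring an optional override.

    If CONTAINER_VOLUMES is set (comma-separated paths), it replaces the
    defaults; otherwise the defaults are used.
    """
    raw = os.environ.get("CONTAINER_VOLUMES")
    if raw:
        dirs = [p.strip() for p in raw.split(",") if p.strip()]
    else:
        dirs = DEFAULT_VOLUMES
    cmd = []
    for path in dirs:
        cmd.extend(["-v", path])
    return cmd
```

The returned list would then be extended onto the udocker command line in place of the hard-coded -v arguments in the diff above.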

I struggled a bit because I was not sure which part of the code is used where, but I'm getting used to the code flow now.

The result is a beautiful EFS share:

❯❯❯ python3 scar/scarcli.py run -n lambda-name ls -la /mnt
Request Id: 48537bf3-3420-41a3-a4ed-65b695be572c
Log Group Name: /aws/lambda/lambda-name
Log Stream Name: 2020/07/14/[$LATEST]82532a922dde48c28c6df98d0e72dbc2
total 12
drwxr-xr-x  3 root root 4096 Jul 14 08:37 .
dr-xr-xr-x 21 root root 4096 Apr 11 01:51 ..
drwxrwxrwx  2 1000 1000 6144 Jul 13 22:07 efs

Thanks for all the help!

carlreiser2 commented 4 years ago

@WinstonN PR? This functionality would be awesome

WinstonN commented 4 years ago

@carlreiser2 you can try just making the change I made above (see the diff in my previous comment) - that might work. I also made a bunch of other changes, here: https://github.com/WinstonN/faas-supervisor/commits/master. To get those changes into your Lambda, set your own GitHub user in utils.py, because the code looks for a release (in the latest version I think it is 1.2.3) and then pulls a zip to set up faas-supervisor:

diff --git a/scar/utils.py b/scar/utils.py
index a08a944..d8bd83b 100644
--- a/scar/utils.py
+++ b/scar/utils.py
@@ -454,7 +454,8 @@ class SupervisorUtils:
     https://github.com/grycap/faas-supervisor/"""

     _SUPERVISOR_GITHUB_REPO = 'faas-supervisor'
-    _SUPERVISOR_GITHUB_USER = 'grycap'
+    # _SUPERVISOR_GITHUB_USER = 'grycap'
+    _SUPERVISOR_GITHUB_USER = 'WinstonN'
     _SUPERVISOR_GITHUB_ASSET_NAME = 'supervisor'

     @classmethod

The zip is located at https://github.com/grycap/faas-supervisor/tree/master/extra

I didn't set up a PR because, even though the code is stable locally, I need much more time to make it PR ready. There are a few issues running large Lambda loads with scar / udocker:

  1. Because it downloads Docker images from Docker Hub (I have not been able to make it work with ECR), you are going to pay a lot in NAT gateway transfer costs for your lambdas
  2. I got frequent timeouts, where the lambdas were unable to connect to Docker Hub (for some weird reason). It would time out constantly, then the next moment work, then work for a while, then the timeouts would start again
  3. Each time you update anything in the ZIP, you have to either delete the faas layer and re-upload it, or use scar to create a new function, which will pull the zip and set up the layer; then go to your other function and reference the right layer there

You can see why I didn't make a PR... Good luck!

EliteMasterEric commented 4 years ago

I would really like to see some of the necessary changes made to add this feature to the main branch.

The necessary changes for this include: