mdneuzerling / lambdr

Run R containers on AWS Lambda
https://lambdr.mdneuzerling.com

R + Python + Conda + Reticulate works locally but not when deployed #13

Open arkadi-aigora opened 2 years ago

arkadi-aigora commented 2 years ago

The code is in this repo:

https://github.com/arkadi-aigora/r_python_reticulate_lambda_demo

I can build and run the image locally with the following commands:

docker build -t test1 .

docker run -p 9000:8080 test1

When I test locally from the console with curl:

curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{"number":"1"}'

I get the expected response of 6, calculated by x = reticulate::py_eval('1+2+3').
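For scripted testing, the same local invocation can be sketched in Python. This is a minimal sketch that assumes the container from the `docker run` command above is listening on port 9000; the function names `build_invocation_request` and `invoke` are hypothetical helpers, not part of lambdr or reticulate.

```python
import json
import urllib.request

# Path used in the curl command above for local invocations.
INVOCATION_PATH = "/2015-03-31/functions/function/invocations"


def build_invocation_request(payload, host="localhost", port=9000):
    """Build (but do not send) the POST request for a local invocation."""
    url = f"http://{host}:{port}{INVOCATION_PATH}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def invoke(payload, host="localhost", port=9000, timeout=30):
    """Send the payload to the locally running container; return the raw response text."""
    request = build_invocation_request(payload, host, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")

# Example (requires `docker run -p 9000:8080 test1` to be running):
# print(invoke({"number": "1"}))
```

Splitting request construction from sending makes it possible to check the URL and payload without a running container.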

When the image is pushed to ECR and then deployed on Lambda, I get an error. The log is below:

2022-02-22T11:17:08.512+04:00 | # >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<
2022-02-22T11:17:08.512+04:00 | Traceback (most recent call last):
2022-02-22T11:17:08.512+04:00 | File "/opt/conda/lib/python3.9/site-packages/conda/exceptions.py", line 1079, in __call__
2022-02-22T11:17:08.512+04:00 | return func(*args, **kwargs)
2022-02-22T11:17:08.512+04:00 | File "/opt/conda/lib/python3.9/site-packages/conda/cli/main.py", line 84, in _main
2022-02-22T11:17:08.512+04:00 | exit_code = do_call(args, p)
2022-02-22T11:17:08.512+04:00 | File "/opt/conda/lib/python3.9/site-packages/conda/cli/conda_argparse.py", line 83, in do_call
2022-02-22T11:17:08.512+04:00 | return getattr(module, func_name)(args, parser)
2022-02-22T11:17:08.512+04:00 | File "/opt/conda/lib/python3.9/site-packages/conda/cli/main_run.py", line 25, in execute
2022-02-22T11:17:08.512+04:00 | script_caller, command_args = wrap_subprocess_call(on_win, context.root_prefix, prefix,
2022-02-22T11:17:08.512+04:00 | File "/opt/conda/lib/python3.9/site-packages/conda/utils.py", line 403, in wrap_subprocess_call
2022-02-22T11:17:08.512+04:00 | with Utf8NamedTemporaryFile(mode='w', prefix=tmp_prefix, delete=False) as fh:
2022-02-22T11:17:08.512+04:00 | File "/opt/conda/lib/python3.9/site-packages/conda/_vendor/auxlib/compat.py", line 81, in Utf8NamedTemporaryFile
2022-02-22T11:17:08.512+04:00 | return NamedTemporaryFile(mode=mode, buffering=buffering, encoding=encoding,
2022-02-22T11:17:08.512+04:00 | File "/opt/conda/lib/python3.9/tempfile.py", line 541, in NamedTemporaryFile
2022-02-22T11:17:08.512+04:00 | (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
2022-02-22T11:17:08.512+04:00 | File "/opt/conda/lib/python3.9/tempfile.py", line 251, in _mkstemp_inner
2022-02-22T11:17:08.512+04:00 | fd = _os.open(file, flags, 0o600)
2022-02-22T11:17:08.512+04:00 | OSError: [Errno 30] Read-only file system: '/opt/conda/.tmp_0po1xji'
2022-02-22T11:17:08.512+04:00 | `$ /opt/conda/bin/conda run --prefix /opt/conda --no-capture-output python -c import os; print(os.environ['PATH'])`
2022-02-22T11:17:08.512+04:00 | environment variables:
2022-02-22T11:17:08.512+04:00 | CIO_TEST=<not set>
2022-02-22T11:17:08.512+04:00 | CONDA_ROOT=/opt/conda
2022-02-22T11:17:08.512+04:00 | CURL_CA_BUNDLE=<not set>
2022-02-22T11:17:08.512+04:00 | LD_LIBRARY_PATH=/opt/R/4.1.0/lib/R/lib:/usr/local/lib:/usr/lib/jvm/java-1.8.0-openjdk-
2022-02-22T11:17:08.512+04:00 | 1.8.0.322.b06-1.el7_9.x86_64/jre/lib/amd64/server:/var/lang/lib:/lib64
2022-02-22T11:17:08.512+04:00 | :/usr/lib64:/var/runtime:/var/runtime/lib:/var/task:/var/task/lib:/opt
2022-02-22T11:17:08.512+04:00 | /lib
2022-02-22T11:17:08.512+04:00 | PATH=/opt/conda/bin:/var/lang/bin:/usr/local/bin:/usr/bin/:/bin:/opt/bin:/o
2022-02-22T11:17:08.512+04:00 | pt/R/4.1.0/bin/
2022-02-22T11:17:08.512+04:00 | REQUESTS_CA_BUNDLE=<not set>
2022-02-22T11:17:08.512+04:00 | SSL_CERT_FILE=<not set>
2022-02-22T11:17:08.512+04:00 | active environment : None
2022-02-22T11:17:08.512+04:00 | user config file : /home/sbx_user1051/.condarc
2022-02-22T11:17:08.512+04:00 | populated config files :
2022-02-22T11:17:08.512+04:00 | conda version : 4.10.3
2022-02-22T11:17:08.512+04:00 | conda-build version : not installed
2022-02-22T11:17:08.512+04:00 | python version : 3.9.5.final.0
2022-02-22T11:17:08.512+04:00 | virtual packages : __linux=4.14.252=0
2022-02-22T11:17:08.512+04:00 | __glibc=2.26=0
2022-02-22T11:17:08.512+04:00 | __unix=0=0
2022-02-22T11:17:08.512+04:00 | __archspec=1=x86_64
2022-02-22T11:17:08.512+04:00 | base environment : /opt/conda (read only)
2022-02-22T11:17:08.512+04:00 | conda av data dir : /opt/conda/etc/conda
2022-02-22T11:17:08.512+04:00 | conda av metadata url : None
2022-02-22T11:17:08.512+04:00 | channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
2022-02-22T11:17:08.512+04:00 | https://repo.anaconda.com/pkgs/main/noarch
2022-02-22T11:17:08.512+04:00 | https://repo.anaconda.com/pkgs/r/linux-64
2022-02-22T11:17:08.512+04:00 | https://repo.anaconda.com/pkgs/r/noarch
2022-02-22T11:17:08.512+04:00 | package cache : /opt/conda/pkgs
2022-02-22T11:17:08.512+04:00 | /home/sbx_user1051/.conda/pkgs
2022-02-22T11:17:08.512+04:00 | envs directories : /home/sbx_user1051/.conda/envs
2022-02-22T11:17:08.512+04:00 | /opt/conda/envs
2022-02-22T11:17:08.512+04:00 | platform : linux-64
2022-02-22T11:17:08.512+04:00 | user-agent : conda/4.10.3 requests/2.25.1 CPython/3.9.5 Linux/4.14.252-207.481.amzn2.x86_64 amzn/2 glibc/2.26
2022-02-22T11:17:08.512+04:00 | UID:GID : 993:990
2022-02-22T11:17:08.512+04:00 | netrc file : None
2022-02-22T11:17:08.512+04:00 | offline mode : False
2022-02-22T11:17:08.512+04:00 | An unexpected error has occurred. Conda has prepared the above report.
2022-02-22T11:17:08.600+04:00 | ERROR [2022-02-22 07:17:08] wrong length for argument
2022-02-22T11:17:08.608+04:00 | END RequestId: a44cab24-8e1f-482f-9eda-91e554b22a1f

Is there a chance this could be resolved?

mdneuzerling commented 2 years ago

Hi @arkadi-aigora. I'm a bit short on time for the next few weeks, but I'll do my best to help you out.

It looks like reticulate calls Python by using conda run. However, conda run does not work in read-only environments: the traceback shows it failing to create a temporary file under /opt/conda. When a Lambda container is running in AWS, the only writable directory is /tmp.
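To confirm which directories are writable inside the deployed function, a small Python check along these lines could be run (for example through reticulate). This is a sketch only; `is_writable` and `report_writable` are hypothetical helper names.

```python
import os
import tempfile


def is_writable(directory):
    """Return True if a temporary file can actually be created in `directory`."""
    try:
        # Creating a real file is more reliable than a permission check
        # when the file system itself is mounted read-only.
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        return False


def report_writable(directories):
    """Map each directory to whether a file can be created there."""
    return {d: is_writable(d) for d in directories}

# Example: print(report_writable(["/opt/conda", "/tmp"]))
```

Based on the log, one would expect /opt/conda to be reported as not writable and /tmp as writable when this runs on Lambda.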

Fortunately, it looks like there's a recent patch for this issue in conda that was merged only 8 days ago. I don't know how miniconda updates work, but is it possible to use this latest version somehow?

I also recommend setting ENV TMPDIR /tmp in your Dockerfile. The above patch appears to use this.
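As background on why TMPDIR matters, Python's standard tempfile module consults the TMPDIR environment variable when choosing its default directory. The sketch below demonstrates only that standard-library behaviour; whether the conda patch relies on it in exactly this way is an assumption, and `tempdir_for` is a hypothetical helper name.

```python
import os
import tempfile


def tempdir_for(tmpdir_value):
    """Return the directory tempfile picks when TMPDIR is set to `tmpdir_value`."""
    previous = os.environ.get("TMPDIR")
    os.environ["TMPDIR"] = tmpdir_value
    tempfile.tempdir = None  # clear the cached choice so TMPDIR is re-read
    try:
        return tempfile.gettempdir()
    finally:
        # Restore the original environment and clear the cache again.
        if previous is None:
            del os.environ["TMPDIR"]
        else:
            os.environ["TMPDIR"] = previous
        tempfile.tempdir = None

# Example: print(tempdir_for("/tmp"))
```

Note that in the traceback above the temporary file is created under /opt/conda rather than in the default temporary directory, so with conda 4.10.3 setting TMPDIR by itself may not be enough without the patched version.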

taylo5jm commented 2 years ago

@arkadi-aigora I recently solved some configuration issues with R + conda + reticulate. I can't comment much on the exact issue you ran into, but I can show the steps I used in case they are helpful to you or anyone else in the future.

First, I installed conda using the system package manager in my Dockerfile.

# Import the Anaconda GPG public key
RUN rpm --import https://repo.anaconda.com/pkgs/misc/gpgkeys/anaconda.asc

# Add the Anaconda repository
# (heredocs in RUN require BuildKit and Dockerfile syntax 1.4 or later)
RUN cat <<EOF > /etc/yum.repos.d/conda.repo
[conda]
name=Conda
baseurl=https://repo.anaconda.com/pkgs/misc/rpmrepo/conda
enabled=1
gpgcheck=1
gpgkey=https://repo.anaconda.com/pkgs/misc/gpgkeys/anaconda.asc
EOF

# Install conda
RUN yum -y install conda

# Add conda to the PATH
ENV PATH="${PATH}:/opt/conda/bin/"

Then, you can run conda_create in the Docker build to create a conda environment.

RUN Rscript -e "reticulate::conda_create('my-conda-env')"

Finally, in your R functions that use reticulate, use reticulate::use_condaenv to access your conda environment.

Thanks for the awesome package, @mdneuzerling!