iandow / mediainfo_aws_lambda

AWS Lambda function for MediaInfo

AWS Lambda not working with urls #4

Closed aniket-amagi closed 4 years ago

aniket-amagi commented 4 years ago

Hello,

I forked your repo and changed the Dockerfile to use newer versions of Python and MediaInfo. I also updated app.py to accept and check videos via signed URLs of S3 objects, but I got this error:

image

I then compiled the library with --with-libcurl and copied the library file into the zip, but it still does not work. I moved multiple .so files into the package to support libcurl, and it ended with the same error:

image

I tried changing the library location to both /opt/python and /opt, and it still didn't work. Can you give me some insight into how to fix this?

aniket-amagi commented 4 years ago

Please let me know if you need some more details.

iandow commented 4 years ago

To troubleshoot ".so not found" errors, I often use os.listdir() to verify that the .so file I need is in a directory that I've included in LD_LIBRARY_PATH.
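A minimal sketch of that check, assuming it runs inside the Lambda handler or any other Python process. The helper name `find_shared_libs` and the `libcurl` search string are illustrative only and not part of this repo:

```python
import os


def find_shared_libs(needle, search_path=None):
    """Map each LD_LIBRARY_PATH directory to the file names containing `needle`.

    A directory that is listed in the path but cannot be read maps to None.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    found = {}
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        try:
            names = os.listdir(directory)
        except OSError:
            # Listed in the path, but missing or unreadable.
            found[directory] = None
            continue
        found[directory] = sorted(n for n in names if needle in n)
    return found


if __name__ == "__main__":
    for directory, matches in find_shared_libs("libcurl").items():
        print(directory, "->", matches)
```

An empty list for every directory suggests the file was never copied into the package; `None` suggests the path entry itself is wrong.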

For other errors, I will often try to verify that my code works in an amazon linux Docker container, like this:

# start amazonlinux
# install prereqs for compiling mediainfo
yum -y install gcc openssl-devel bzip2-devel libffi-devel wget tar gzip make gcc-c++
wget https://www.python.org/ftp/python/3.8.0/Python-3.8.0.tgz
tar -xzvf Python-3.8.0.tgz
cd /Python-3.8.0
./configure --enable-optimizations
make install

# install mediainfo
pip3 install pymediainfo

# install libzen and libmediainfo
# work in /root so the paths match the test command below
cd /root
wget https://mediaarea.net/download/binary/libmediainfo0/19.09/MediaInfo_DLL_19.09_GNU_FromSource.tar.gz
tar -xzvf MediaInfo_DLL_19.09_GNU_FromSource.tar.gz
cd MediaInfo_DLL_GNU_FromSource/
bash SO_Compile.sh
cd MediaInfoLib/Project/GNU/Library && make install

# test it
wget -P /root https://vjs.zencdn.net/v/oceans.mp4

python3 -c "from pymediainfo import MediaInfo; media_info = MediaInfo.parse('/root/oceans.mp4', library_file='/root/MediaInfo_DLL_GNU_FromSource/MediaInfoLib/Project/GNU/Library/.libs/libmediainfo.so.0')"

aniket-amagi commented 4 years ago

I tried testing my code as you told me (using wget and python3) and it works perfectly fine there without any errors. But when I use them as layers it fails. Any clues ?

ianwow commented 4 years ago

Did you use os.listdir() to verify that the .so files you need are in a directory that you've included in the LD_LIBRARY_PATH?

aniket-amagi commented 4 years ago

Yes

Try 1

image

I used the following Dockerfile:

FROM amazonlinux
WORKDIR /
RUN yum update -y
RUN yum -y install openssl-devel bzip2-devel libffi-devel wget tar gzip make gcc-c++ libcurl-devel libssh2-devel libnghttp2-devel libidn2-devel openldap-devel zip
RUN wget https://www.python.org/ftp/python/3.8.4/Python-3.8.4.tgz
RUN tar -xzvf Python-3.8.4.tgz
WORKDIR /Python-3.8.4
RUN ./configure --enable-optimizations
RUN make install
RUN mkdir /packages
ADD requirements.txt /packages/requirements.txt
RUN mkdir -p /packages/pymediainfo-3.8/python/lib/python3.8/site-packages
RUN pip3.8 install -r /packages/requirements.txt -t /packages/pymediainfo-3.8/python/lib/python3.8/site-packages
WORKDIR /root
RUN wget https://mediaarea.net/download/binary/libmediainfo0/20.03/MediaInfo_DLL_20.03_GNU_FromSource.tar.gz
RUN tar -xvzf MediaInfo_DLL_20.03_GNU_FromSource.tar.gz
WORKDIR /root/MediaInfo_DLL_GNU_FromSource/
RUN ./SO_Compile.sh --with-libcurl
RUN cp /root/MediaInfo_DLL_GNU_FromSource/MediaInfoLib/Project/GNU/Library/.libs/* /packages/pymediainfo-3.8/python
RUN cp /root/MediaInfo_DLL_GNU_FromSource/MediaInfoLib/Project/GNU/Library/.libs/* /packages/pymediainfo-3.8/
WORKDIR /packages/pymediainfo-3.8/
RUN zip -r9 /packages/pymediainfo-python38.zip .
WORKDIR /packages/
RUN rm -rf /packages/pymediainfo-3.8/

Try 2

I then tried the following Dockerfile, which also copies libcurl and its dependencies:

FROM amazonlinux
WORKDIR /
RUN yum update -y
RUN yum -y install openssl-devel bzip2-devel libffi-devel wget tar gzip make gcc-c++ libcurl-devel libssh2-devel libnghttp2-devel libidn2-devel openldap-devel zip
RUN wget https://www.python.org/ftp/python/3.8.4/Python-3.8.4.tgz
RUN tar -xzvf Python-3.8.4.tgz
WORKDIR /Python-3.8.4
RUN ./configure --enable-optimizations
RUN make install
RUN mkdir /packages
ADD requirements.txt /packages/requirements.txt
RUN mkdir -p /packages/pymediainfo-3.8/python/lib/python3.8/site-packages
RUN pip3.8 install -r /packages/requirements.txt -t /packages/pymediainfo-3.8/python/lib/python3.8/site-packages
WORKDIR /root
RUN wget https://mediaarea.net/download/binary/libmediainfo0/20.03/MediaInfo_DLL_20.03_GNU_FromSource.tar.gz
RUN tar -xvzf MediaInfo_DLL_20.03_GNU_FromSource.tar.gz
WORKDIR /root/MediaInfo_DLL_GNU_FromSource/
RUN ./SO_Compile.sh --with-libcurl
RUN cp /root/MediaInfo_DLL_GNU_FromSource/MediaInfoLib/Project/GNU/Library/.libs/* /packages/pymediainfo-3.8/python
RUN cp /root/MediaInfo_DLL_GNU_FromSource/MediaInfoLib/Project/GNU/Library/.libs/* /packages/pymediainfo-3.8/
RUN cp /usr/lib64/libcurl.so /packages/pymediainfo-3.8/python
RUN cp /usr/lib64/libcurl.so /packages/pymediainfo-3.8/
RUN cp /usr/lib64/libnghttp2.so /packages/pymediainfo-3.8/python
RUN cp /usr/lib64/libnghttp2.so /packages/pymediainfo-3.8/
RUN cp /usr/lib64/libidn2.so /packages/pymediainfo-3.8/python
RUN cp /usr/lib64/libidn2.so /packages/pymediainfo-3.8/
RUN cp /usr/lib64/libssh2.so /packages/pymediainfo-3.8/python
RUN cp /usr/lib64/libssh2.so /packages/pymediainfo-3.8/
RUN cp /usr/lib64/libldap* /packages/pymediainfo-3.8/python
RUN cp /usr/lib64/libldap* /packages/pymediainfo-3.8/
RUN cp /usr/lib64/liblber* /packages/pymediainfo-3.8/python
RUN cp /usr/lib64/liblber* /packages/pymediainfo-3.8/
RUN cp /usr/lib64/libunistring* /packages/pymediainfo-3.8/python
RUN cp /usr/lib64/libunistring* /packages/pymediainfo-3.8/
RUN cp /usr/lib64/libsasl2* /packages/pymediainfo-3.8/python
RUN cp /usr/lib64/libsasl2* /packages/pymediainfo-3.8/
RUN cp /usr/lib64/libssl* /packages/pymediainfo-3.8/python
RUN cp /usr/lib64/libssl* /packages/pymediainfo-3.8/
RUN cp /usr/lib64/libsmime3* /packages/pymediainfo-3.8/python
RUN cp /usr/lib64/libsmime3* /packages/pymediainfo-3.8/
RUN cp /usr/lib64/libnss3* /packages/pymediainfo-3.8/python
RUN cp /usr/lib64/libnss3* /packages/pymediainfo-3.8/
WORKDIR /packages/pymediainfo-3.8/
RUN zip -r9 /packages/pymediainfo-python38.zip .
WORKDIR /packages/
RUN rm -rf /packages/pymediainfo-3.8/

image

It still fails!

ianwow commented 4 years ago

I see libcurl.so.4 is not found, from your screenshot. I think maybe that file is not in your LD_LIBRARY_PATH.

aniket-amagi commented 4 years ago

Well, I tried with libcurl moved to /opt/python and /opt (the second screenshot above), but it still gave the same error. Sorry for the confusing post.

aniket-amagi commented 4 years ago

@iandow, any more suggestions? I am kind of stuck now.

iandow commented 4 years ago

Perhaps try bundling all the required libraries in a single zip file, instead of using a Lambda layer. If that works, then you could try moving the required libraries to a Lambda layer, one library at a time. That might help determine whether your libraries and LD_LIBRARY_PATH are valid.

aniket-amagi commented 4 years ago

Hello @iandow, I tried packing all the libraries and the Python package into one zip together with the lambda_handler file, but it still failed, reporting that it could not find the libcurl library. I then set LD_LIBRARY_PATH=/var/task (where AWS Lambda extracts the zip), but it throws the same error about being unable to open the file:

image

Do you think switching to a different version would help? Any other suggestions?

ianwow commented 4 years ago

Try validating your library path with a different library, just to rule out library path issues. If you can successfully import some other library but not libcurl, then I suspect there's something wrong with your libcurl build.
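One way to run that comparison is to ask the dynamic loader directly through Python's ctypes module, which reports the loader's own error text. This is only a sketch: `try_load` is a made-up helper, `libz.so.1` is an assumed stand-in for "some other library" and may not exist in every environment, and `/opt/libmediainfo.so` is just an example location.

```python
import ctypes


def try_load(library):
    """Try to load a shared library by name or path; return (ok, error text)."""
    try:
        ctypes.CDLL(library)
    except OSError as exc:
        # The message names the file the loader could not open, which may be
        # a dependency rather than the library that was requested.
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    for name in ("libz.so.1", "libcurl.so.4", "/opt/libmediainfo.so"):
        ok, error = try_load(name)
        print(name, "loaded" if ok else "failed: " + error)
```

If the stand-in library loads but libcurl does not, the search path is probably fine and the libcurl build or one of its dependencies is the more likely culprit.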

I'd like to close this thread since your problem relates to your extensions rather than this repo. Please make an issue on your fork and paste a link here so we can continue the discussion there.

aniket-amagi commented 4 years ago

I am closing this one, to be continued here : https://github.com/aniket-amagi/mediainfo_aws_lambda/issues/1

DarrenStack commented 2 years ago

Hey! Just wondering if there was any solution to this; it looks like the fork isn't there anymore 😅 I'm running into a very similar issue. I created a layer that I believed would support 3.7 and 3.9. It works perfectly with Lambdas in a 3.7 runtime, but in an almost identical Lambda in a 3.9 runtime I get the same error as @aniket-amagi. My Dockerfile is attached.

FROM amazonlinux

WORKDIR /
RUN yum update -y

# Install Python 3.9
RUN yum -y install openssl-devel bzip2-devel libffi-devel wget tar gzip zip make gcc-c++
RUN wget https://www.python.org/ftp/python/3.9.9/Python-3.9.9.tgz
RUN tar -xzvf Python-3.9.9.tgz
WORKDIR /Python-3.9.9
RUN ./configure --enable-optimizations
RUN make install

# Install python 3.7
RUN yum install python3 -y

# Install Python packages
RUN mkdir /packages
RUN echo "pymediainfo" >> /packages/requirements.txt
RUN mkdir -p /packages/pymediainfo-3.9/python/lib/python3.9/site-packages
RUN pip3.9 install -r /packages/requirements.txt -t /packages/pymediainfo-3.9/python/lib/python3.9/site-packages

RUN mkdir -p /packages/pymediainfo-3.7/python/lib/python3.7/site-packages
RUN pip3.7 install -r /packages/requirements.txt -t /packages/pymediainfo-3.7/python/lib/python3.7/site-packages

# Download MediaInfo
WORKDIR /root
RUN wget https://mediaarea.net/download/binary/libmediainfo0/19.09/MediaInfo_DLL_19.09_GNU_FromSource.tar.gz
RUN tar -xzvf MediaInfo_DLL_19.09_GNU_FromSource.tar.gz

# Compile MediaInfo with Support for URL Inputs
WORKDIR /root/MediaInfo_DLL_GNU_FromSource/
RUN ./SO_Compile.sh

# Create zip files for Lambda Layer deployment
RUN cp /root/MediaInfo_DLL_GNU_FromSource/MediaInfoLib/Project/GNU/Library/.libs/* /packages/pymediainfo-3.7/python
RUN cp /root/MediaInfo_DLL_GNU_FromSource/MediaInfoLib/Project/GNU/Library/.libs/* /packages/pymediainfo-3.9/python
RUN cp /root/MediaInfo_DLL_GNU_FromSource/MediaInfoLib/Project/GNU/Library/.libs/* /packages/pymediainfo-3.9/
RUN cp /root/MediaInfo_DLL_GNU_FromSource/MediaInfoLib/Project/GNU/Library/.libs/* /packages/pymediainfo-3.7/
WORKDIR /packages/pymediainfo-3.7/
RUN zip -r9 /packages/pymediainfo-python37.zip .
WORKDIR /packages/pymediainfo-3.9/
RUN zip -r9 /packages/pymediainfo-python39.zip .
WORKDIR /packages/
RUN rm -rf /packages/pymediainfo-3.7/
RUN rm -rf /packages/pymediainfo-3.9/

iandow commented 2 years ago

Can you use the prebuilt mediainfo lambda layer instead of building one from scratch?

DarrenStack commented 2 years ago

That works! Don't know how I didn't try that before. Sorry to summon you back to this project after so long for this 😅 Thanks so much!

ianwow commented 2 years ago

Anytime.

RawichK commented 2 years ago

@iandow I tried to use the prebuilt mediainfo Lambda layer, but the Lambda function throws a "pymediainfo module not found" error instead. Could you show me how to use the prebuilt layer?

[ERROR] Runtime.ImportModuleError: Unable to import module 'index': No module named 'pymediainfo'
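For context, Lambda extracts layers under /opt, and for Python runtimes only the `python/` and `python/lib/python3.x/site-packages/` folders of a layer zip end up on the import path. A rough way to check a layer zip before uploading it is sketched below. The function name `layer_has_package` is hypothetical, and the check only covers packages shipped as directories, which is an assumption about how the package was installed.

```python
import zipfile


def layer_has_package(zip_path, package, python_version="3.9"):
    """Return True if the zip holds `package` in a folder the Python runtime imports from."""
    prefixes = (
        f"python/{package}/",
        f"python/lib/python{python_version}/site-packages/{package}/",
    )
    with zipfile.ZipFile(zip_path) as archive:
        # str.startswith accepts a tuple and matches any of the prefixes.
        return any(name.startswith(prefixes) for name in archive.namelist())
```

A layer that contains only the shared library would return False for `pymediainfo`, which would be consistent with the import error above.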

RawichK commented 2 years ago

I was able to solve my issue. Adding only the prebuilt library to the Lambda layer did not work, so I modified the Dockerfile to download the prebuilt library, copy it into the working directory, and zip it instead of building from scratch.

# Create zip files for Lambda Layer deployment
RUN yum install unzip -y
RUN wget https://mediaarea.net/download/binary/libmediainfo0/22.06/MediaInfo_DLL_22.06_Lambda_x86_64.zip
RUN unzip MediaInfo_DLL_22.06_Lambda_x86_64.zip
RUN cp /root/MediaInfo_DLL_GNU_FromSource/lib/* /packages/pymediainfo-3.9
RUN rm -rf /root/MediaInfo_DLL_GNU_FromSource/lib
WORKDIR /packages/pymediainfo-3.9/
RUN zip -r9 /packages/pymediainfo-python39.zip .
WORKDIR /packages/
RUN rm -rf /packages/pymediainfo-3.9/

Notes: The LD_LIBRARY_PATH environment variable is not needed. Call MediaInfo like this:

media_info = MediaInfo.parse(signed_url, library_file='/opt/libmediainfo.so')
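Put together, a handler built around this call might look like the sketch below. It assumes the layer places libmediainfo.so directly under /opt as in the note above, that pymediainfo is importable in the function, and that the event carries the URL in a `signed_url` field, which is a made-up name for illustration.

```python
def lambda_handler(event, context):
    # Imported inside the handler so a missing package shows up as a
    # handler error in the logs rather than an init-time failure.
    from pymediainfo import MediaInfo

    signed_url = event["signed_url"]  # hypothetical event field
    media_info = MediaInfo.parse(
        signed_url,
        library_file="/opt/libmediainfo.so",  # path taken from the note above
    )
    # to_json() serializes the parsed tracks to a JSON string.
    return {"statusCode": 200, "body": media_info.to_json()}
```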