shelfio / libreoffice-lambda-layer

MIT License

com::sun::star::container::NoSuchElementException when executing lambda function with node 12.x #24

Open jayeshkulkarni opened 4 years ago

jayeshkulkarni commented 4 years ago

I am using arn:aws:lambda:us-east-1:764866452798:layer:libreoffice-gzip:1. When I test the function, I get the error below:

"Error: Command failed: cd /tmp && /tmp/instdir/program/soffice.bin --headless --invisible --nodefault --nofirststartwizard --nolockcheck --nologo --norestore --convert-to pdf --outdir /tmp /tmp/SampleDOCFile_1000kb.docx",
    "terminate called after throwing an instance of 'com::sun::star::container::NoSuchElementException'",
    "/bin/sh: line 1:    24 Aborted                 (core dumped) /tmp/instdir/program/soffice.bin --headless --invisible --nodefault --nofirststartwizard --nolockcheck --nologo --norestore --convert-to pdf --outdir /tmp /tmp/SampleDOCFile_1000kb.docx",
    "",
    "    at checkExecSyncError (child_process.js:629:11)",
    "    at execSync (child_process.js:666:13)",
    "    at Runtime.module.exports.librelayer [as handler] (/var/task/handler.js:41:13)",
    "    at Runtime.handleOnce (/var/runtime/Runtime.js:66:25)"
  ]
}
Andrea-Arguello commented 4 years ago

Having the same issue with Golang. I did my own lambda with the same lo.tar.gz. Could this be a file corruption issue? @shelfio

jayeshkulkarni commented 4 years ago

@shelfio @Andrea-Arguello I tried deploying the layer in a different region (us-west-2) with the ARN provided in the description and got the same error. Since the gzip layer fails in every region I tried, file corruption is the most likely cause. Please verify.

randombluff commented 4 years ago

Hello, the Go runtime uses Amazon Linux 1, as far as I remember. The latest build is for Amazon Linux 2. Could you try an older version?

randombluff commented 4 years ago

I'll see what I can do about gzip, sorry for the issues ;)

drewburnett commented 4 years ago

Same in Python

HenningUhlig commented 4 years ago

Is there any news regarding the gzip layer? For me (python3.8) it still doesn't work. Is there any way to use lo.tar.br instead? How can it be unpacked in a Python Lambda?

MattBriden commented 3 years ago

I ran into the same issue using arn:aws:lambda:us-east-1:764866452798:layer:libreoffice-gzip:1 with python3.8. I ended up finding a way to use the brotli file in Python after reading this blog post. This code worked for me:

import tarfile
from io import BytesIO

import brotli  # third-party package, not part of the standard library


class LibreOfficeLoader:

    @staticmethod
    def extract_libre_office():
        # Read and decompress the brotli-compressed tarball shipped in the layer.
        with open('/opt/lo.tar.br', mode='rb') as archive:
            tar_bytes = brotli.decompress(archive.read())
        # Unpack the decompressed tar into /tmp.
        with tarfile.open(fileobj=BytesIO(tar_bytes)) as tar:
            tar.extractall('/tmp')

Tested with os.system('/tmp/instdir/program/soffice.bin --help') to ensure I don't get the same error as with the gzip layer.

mohit-manna-ttn commented 3 years ago

Facing same issue in Ubuntu

mohit-manna commented 3 years ago

Facing same issue in Linux EC2 machine. Is there any way out of it??

mohit-manna commented 3 years ago

I tried @MattBriden's code above on Ubuntu 18 with Python 3.6 and got brotli.error: BrotliDecompress failed.

medianotion commented 2 years ago

For those trying to get this to work in AWS Lambda with Python 3.8 on Amazon Linux 2: the .gz file throws the error in the subject of this issue. I couldn't find a way to make the .gz file work.

The solution is to use the brotli file. This will require you to make a Lambda Layer for brotli.

Create the brotli layer by following these instructions. I was using Ubuntu, and the steps assume you have zip and pip installed. If not, install them with apt-get before you follow these instructions.

mkdir -p brotli/python
cd brotli
pip install Brotli
cp -r /home/Your-User-Name-Here/.local/lib/python3.8/site-packages/Brotli-1.0.9.dist-info ./python
cp -r /home/Your-User-Name-Here/.local/lib/python3.8/site-packages/brotli.py ./python
cp -r /home/Your-User-Name-Here/.local/lib/python3.8/site-packages/_brotli.cpython-38-x86_64-linux-gnu.so ./python
zip -r brotli-1.0.9.zip python

You now have a zip file for the brotli lambda layer here: ~/brotli/brotli-1.0.9.zip
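For anyone who prefers to build the archive from Python instead of the zip command, here is a minimal sketch using only the standard library. It relies on the same layout as the steps above: Lambda expects Python packages in a layer to sit under a top-level python/ directory. The function name zip_layer and its parameters are made up for this illustration.

```python
import os
import zipfile


def zip_layer(source_dir, zip_path, prefix="python"):
    """Zip every file under source_dir so it lands under `prefix`/ in the archive.

    zip_layer is a hypothetical helper; zip_path should live outside source_dir
    so the archive does not try to include itself.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                relative = os.path.relpath(full_path, source_dir)
                # Store the file as python/<relative path> inside the zip.
                zf.write(full_path, os.path.join(prefix, relative))
    return zip_path
```

Called as zip_layer("brotli/python", "brotli-1.0.9.zip"), this should produce an archive with the same python/ layout as the zip -r command above.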

Using the AWS Lambda Console, create a new layer and upload the zip file. Go to your lambda and associate this new brotli layer to it. Also create a layer for LibreOffice from this repo using the layer.tar.br.zip if you haven't done so already.

Use the code above from @manna018 (originally posted by @MattBriden) to decompress the .br file using the brotli library and start using LibreOffice.

NOTE: I was still getting "Fontconfig error: Cannot load default config file" errors when calling soffice.bin. I didn't see any issues with the output file so I ignored it.

HTH someone

Floppy commented 2 years ago

I've been looking into this - I really want to use gzip as we're having trouble with brotli despite @medianotion's very helpful instructions.

In the root of this repo, should the contents of layer.tar.br.zip and layer.tar.gz.zip be the same? I assume they should, and the only difference should be the compression?

Turns out all the .so files are different. Let's take instdir/program/libwriterperfectlo.so as a random example. When uncompressed, the file sizes are completely different: 54068 bytes in the gz version, 68344 in the brotli version.

Is that weird, or am I on the wrong track? Might that explain why one works, and one doesn't?

Looking at the strings output, it looks like the gzipped version is compressed again using https://upx.github.io/.
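The comparison described in this comment can be sketched in Python with the standard tarfile module. This is only an illustration under some assumptions: both archives are readable by tarfile (so lo.tar.br has to be decompressed to a plain lo.tar first, since tarfile does not understand brotli), member names match between the two archives, and a UPX-packed binary contains the byte string "UPX!" somewhere in the file. The function names are invented for this example.

```python
import tarfile

# Assumption: UPX-packed executables carry this marker somewhere in the file.
UPX_MARKER = b"UPX!"


def member_sizes(tar_path):
    """Map each regular file in the archive to its uncompressed size."""
    with tarfile.open(tar_path) as tar:  # mode "r" auto-detects gzip or plain tar
        return {m.name: m.size for m in tar.getmembers() if m.isfile()}


def size_differences(tar_a, tar_b):
    """Return {name: (size_in_a, size_in_b)} for files present in both archives
    whose uncompressed sizes differ."""
    sizes_a, sizes_b = member_sizes(tar_a), member_sizes(tar_b)
    common = sorted(sizes_a.keys() & sizes_b.keys())
    return {n: (sizes_a[n], sizes_b[n]) for n in common if sizes_a[n] != sizes_b[n]}


def looks_upx_packed(tar_path, member_name):
    """Heuristic check: does the member contain the UPX marker bytes?"""
    with tarfile.open(tar_path) as tar:
        handle = tar.extractfile(member_name)
        return handle is not None and UPX_MARKER in handle.read()
```

For example, size_differences("lo.tar.gz", "lo.tar") would list every file whose size differs between the gzip layer and the decompressed brotli layer, and looks_upx_packed("lo.tar.gz", "instdir/program/libwriterperfectlo.so") would test a single library, provided the member is stored under exactly that name.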

Floppy commented 2 years ago

I've just recompressed the brotli layer as a gzip layer, and it works fine. There's definitely something wrong with the published gzip versions.

gitthub89 commented 9 months ago

@Floppy How exactly do I do that? What do I download, and from where? How do I recompress it? I assume uploading means uploading the recompressed .zip file on the Layers page in AWS Lambda, but I am not sure about the previous steps.

gitthub89 commented 9 months ago

I've managed to do it. I downloaded the brotli layer using this command:

aws lambda get-layer-version-by-arn --arn arn:aws:lambda:us-east-1:764866452798:layer:libreoffice-brotli:1 --query 'Content.Location' --output text | xargs wget -O layer.zip

Extract the lo.tar.br file from layer.zip.

Decompress the brotli file and recompress it with gzip (you might need to install brotli with 'sudo apt install brotli'):

brotli --decompress lo.tar.br -o lo.tar && gzip -9 lo.tar

This creates a lo.tar.gz file.

Then compress it into a .zip (right click -> "Compress..."), upload that to any S3 bucket, and copy the uploaded file's URL. Go to Lambda -> Layers -> Add Layer, select "from s3..." and paste the URL. Using that layer in my function finally worked. However, it only works when the environment is warm. When the function is called and LibreOffice is extracted for the first time, I get this error: https://github.com/shelfio/libreoffice-lambda-layer/issues/20 I am using the Python runtime (3.8, but 3.12 behaves the same). So if I run it twice in succession, the second run works, and I finally got a converted PDF file! 🎉
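With a recompressed layer like this, the archive can be unpacked using only the Python standard library. Below is a minimal sketch, assuming the layer places the file at /opt/lo.tar.gz and that the archive contains instdir/program/soffice.bin, as in the paths used earlier in this thread. The function name and the skip-if-present check are additions for illustration; the check only avoids repeating the extraction on warm invocations and does not address the first-run error linked above.

```python
import os
import tarfile


def extract_gzip_layer(archive="/opt/lo.tar.gz", dest="/tmp",
                       marker="instdir/program/soffice.bin"):
    """Extract the gzip tarball into dest unless it is already there.

    Returns True if extraction ran, False if the marker file already existed.
    The default paths are assumptions based on the layout discussed above.
    """
    if os.path.exists(os.path.join(dest, marker)):
        return False  # warm environment: a previous invocation already extracted it
    with tarfile.open(archive, mode="r:gz") as tar:
        tar.extractall(dest)
    return True
```

Calling extract_gzip_layer() at the start of the handler keeps the extraction cost to the first invocation of each execution environment.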

sendurangr commented 2 months ago

Thanks ✅🚀 serverless-libreoffice/releases

import tarfile
import brotli
from io import BytesIO

def extract_libre_office():
    buffer = BytesIO()
    with open('/opt/lo.tar.br', mode='rb') as fout:
        file = fout.read()
        buffer.write(brotli.decompress(file))
        buffer.seek(0)
        with tarfile.open(fileobj=buffer) as tar:
            tar.extractall('/tmp')

This helped me. Make sure you extract it into /tmp, since /tmp is writable during Lambda execution, while /opt is read-only at run time (its contents are fixed when the layer is created).

You can also check the installation with subprocess.run(['/tmp/instdir/program/soffice.bin', '--help'])

🧠 Keep in mind:

import subprocess


def convert_to_pdf(file_path, file_extension, file_name_only):
    output_pdf_path = f"/tmp/{file_name_only}.pdf"
    print(f"Converting file to PDF: {output_pdf_path}")
    if file_extension not in ['.doc', '.docx', '.ppt', '.pptx', '.rtf', '.txt']:
        raise Exception("Unsupported file format for conversion to PDF: " + file_extension)

    # Pass the arguments as a list so a path containing spaces is not split apart.
    conv_cmd = ["/tmp/instdir/program/soffice.bin", "--headless", "--norestore", "--invisible",
                "--nodefault", "--nofirststartwizard", "--nolockcheck", "--nologo",
                "--convert-to", "pdf:writer_pdf_Export", "--outdir", "/tmp", file_path]
    # Try at most twice: the first run after extraction may fail.
    for _attempt in range(2):
        response = subprocess.run(conv_cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        if response.returncode == 0:
            return output_pdf_path
    raise Exception("Conversion failed to PDF file format from: " + file_path)

Look at the method above: I run the conversion up to twice, because the first attempt can fail and it only works on the second one.