ICIJ / node-tika

Apache Tika bridge for Node.js. Text and metadata extraction, language detection and more.
MIT License

Need help making tika work in AWS Lambda #18

Open alpeshgaglani opened 7 years ago

alpeshgaglani commented 7 years ago

Our scenario is to take .pdf files uploaded to AWS S3 storage and process them later. We want to move to AWS Lambda. However, Lambda requires that the entire package (along with all node_modules) be uploaded as a zip file (i.e. it won't run npm install). This means that tika picks up whatever Java path the local machine happened to have and saves it in jvm_dll_path.json. The path to libjvm.so is different on the Lambda machine, and loading the module fails with "libjvm.so: cannot open shared object file: No such file or directory". I tried just replacing the string in jvm_dll_path.json with the correct AWS path, but no dice. I would really appreciate any help making this work on Lambda. Thanks!
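To check which libjvm.so paths actually exist in the target environment, a small diagnostic along these lines may help. This is only a sketch: `findLibjvm` is a hypothetical helper, not part of node-tika or its Java bridge, and the directory to search (for example the value of JAVA_HOME, if it is set) is an assumption. It does not change jvm_dll_path.json; it only lists candidates to compare against the path stored there.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: recursively collect every file named `libjvm.so`
// under `rootDir`. Symlinked directories are not followed, which avoids loops.
function findLibjvm(rootDir) {
  const matches = [];
  let entries;
  try {
    entries = fs.readdirSync(rootDir, { withFileTypes: true });
  } catch (err) {
    // Missing or unreadable directory: report nothing rather than throw.
    return matches;
  }
  for (const entry of entries) {
    const fullPath = path.join(rootDir, entry.name);
    if (entry.isDirectory()) {
      matches.push(...findLibjvm(fullPath));
    } else if (entry.name === 'libjvm.so') {
      matches.push(fullPath);
    }
  }
  return matches;
}
```

Calling `findLibjvm(process.env.JAVA_HOME || '/usr/lib/jvm')` inside the Lambda handler and logging the result would show whether the path written into jvm_dll_path.json matches a file that is really present at runtime. The `/usr/lib/jvm` fallback is a guess and may not apply to a given Lambda runtime.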

drexler commented 7 years ago

@alpeshgaglani were you able to resolve this? I'm facing the same issue but for a different project.