Closed gavinLow8128 closed 4 years ago
Hm, I have never seen this error before, and I also did not understand what you are actually trying to do.
But the error looks like you are importing fitz from within the installation folder of PyMuPDF (where __init__.py lives). This can never work.
Thank you for the quick response. I installed PyMuPDF by following the instructions in this guide: AWS Lambda Deployment Package in Python.
It seems that PyMuPDF and my program (app.py, the Python script that imports fitz) end up in the same folder. Is that the main cause of the error? Please let me know if there is a solution.
Thank you very much.
Hm, actually not. The above structure works if executed locally on your computer; I just tried it out (of course, the dist-info folder is not required). The imported fitz is confirmed to be taken from the fitz subfolder next to app.py.
So the problem must be how AWS Lambda supports this type of thing. I am not a Lambda user, so I do not know anything about it.
Code – The code and dependencies of your function. For scripting languages, you can edit your function code in the embedded editor. To add libraries, or for languages that the editor doesn't support, upload a deployment package. If your deployment package is larger than 50 MB, choose Upload a file from Amazon S3.
This quotation from the AWS Lambda website seems to suggest that you must upload PyMuPDF as a deployment package. Did you do that?
Also have a look at this:
Note: For libraries that use extension modules written in C or C++, build your deployment package in an Amazon Linux environment. You can use the SAM CLI build command, which uses Docker, or build your deployment package on Amazon EC2 or AWS CodeBuild.
PyMuPDF falls under this category ...
The problem was solved by deploying the package through AWS CodeBuild. Thank you for your help!
@gavinLow8128 running into the same issue. Is it possible to share your steps for deploying the package through AWS codeBuild? Thanks!
@ale-de-vries
Step 1: Go to CodeCommit and "create repository", then upload your project to CodeCommit.
Step 2: Create a file called "buildspec.yml". Here is my buildspec.yml for your reference.
Step 3: Go to CodeBuild and "Create Build Project". You may watch this youtube video as a reference for how to complete the project configuration. https://www.youtube.com/watch?v=6YQFcd_z4gk
Step 4: After completing the configuration, "start build" the project.
Step 5: If the project builds successfully, go to your S3 bucket. CodeBuild uploads the artifact file there. Find it and copy its "Object URL".
Step 6: Go to your Lambda function. Select "Upload a file from Amazon S3" for "Code entry type" and paste the "Object URL" into "Amazon S3 link URL". Then click "Save".
I hope it helps you. Please let me know if you have any other questions.
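Steps 5 and 6 can also be scripted instead of clicked through in the console. This is only a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The helper name deploy_from_s3 and the function, bucket, and key names in the usage comment are hypothetical placeholders, not values from this thread. update_function_code is the boto3 Lambda client call that points a function at a zip file in S3.

```python
def deploy_from_s3(lambda_client, function_name, bucket, key):
    """Point an existing Lambda function at a zip artifact stored in S3.

    lambda_client is expected to be boto3.client("lambda").
    """
    response = lambda_client.update_function_code(
        FunctionName=function_name,
        S3Bucket=bucket,
        S3Key=key,
    )
    # The response echoes the function configuration, including LastModified.
    return response.get("LastModified")


# Usage (placeholder names, requires boto3 and credentials):
#   import boto3
#   deploy_from_s3(boto3.client("lambda"), "my-pdf-function",
#                  "my-artifact-bucket", "build/package.zip")
```

Passing the client in as an argument keeps the helper easy to try out with a stub before touching a real account.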
Hi @gavinLow8128 ,
I tried your solution but I am still getting the same error as you. Any idea what I am doing wrong? Would you be able to share your folder structure?
For anyone else that ends up here with the same problem:
You can use lambda layers to provide the pymupdf dependency, instead of building it into the deployment package. There's already a maintained layer for it, see here: https://github.com/keithrozario/Klayers
That way you don't need to bother with building on amazon linux, just reference the arn of the layer you need in your lambda function creation (region specific) and import as normal.
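For anyone who prefers to attach the layer from a script rather than in the function definition, here is a rough sketch, assuming boto3 and configured credentials. The helper name attach_layer is hypothetical, and the ARN has to be looked up for your region and runtime. Because the Layers argument of update_function_configuration replaces the whole list, the sketch first reads the layers that are already attached.

```python
def attach_layer(lambda_client, function_name, layer_arn):
    """Add a layer ARN to a function, keeping layers that are already attached.

    lambda_client is expected to be boto3.client("lambda").
    """
    config = lambda_client.get_function_configuration(FunctionName=function_name)
    arns = [layer["Arn"] for layer in config.get("Layers", [])]
    if layer_arn not in arns:
        arns.append(layer_arn)
    # Layers replaces the full list, so existing ARNs must be sent again.
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        Layers=arns,
    )
    return arns
```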
I am facing the same issue on IBM Cloud Functions (similar to AWS Lambda).
I used the same build step, "pip install PyMuPDF -t .", in the deployment step and can see the folder structure mentioned in https://github.com/pymupdf/PyMuPDF/issues/430#issuecomment-576208987
My code contains
import fitz
and I get the error below:
"2020-11-26T07:23:33.653912Z stderr: Traceback (most recent call last):",
"2020-11-26T07:23:33.653966Z stderr: File "exec.py", line 42, in
I tried two or three ways of installing but get the same issue each time: "pip install PyMuPDF", "pip install PyMuPDF==1.16.10 -t ." and "pip install PyMuPDF==1.18.10 -t ."
I am using other packages such as pypdf and pdfminer in the same way and they work fine, but not this one.
There are no issues during the build step; the only failure is on the import statement.
@jmac105 Any idea how this individual got it to work? I have PyMuPDF as a package in my layer but I am still getting the exact same error as the individual who opened this ticket. All other packages in my layer are importing correctly. Any help would be greatly appreciated. I also appreciate the arn repo but am hoping to avoid using this if at all possible.
I'd recommend contacting the maintainer of that repo and asking them, but it looks like they are using the Serverless Framework to build the layers. I do believe that you need to build your layer on the same OS that it will run on in Lambda (Amazon Linux or Amazon Linux 2, depending on the Python version).
Hey guys,
I have the same issue when deploying with the Serverless Framework.
I tried to create a layer, but the issue is still the same.
This is the error I'm getting on lambda console:
Just in case it helps anyone, I was using Fitz in Lambda just fine for the past several months (I automate the build this way) under the Python 3.6 runtime. When I switched to the Python 3.8 runtime, I started getting this import error. I switched back to 3.6 and everything is working fine again.
Hi everyone
I had the same issue deploying my lambda.
I tried many ways to solve this problem, but the only solution was to use a virtual machine with Python 3.7 and install PyMuPDF there.
The next step was to download the library from the virtual machine and then create the zip file using that library.
The file _fitz.cpython-37m-x86_64-linux-gnu.so is probably quite heavy, but it is necessary.
This method finally worked for me!!
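The file name above hints at why the runtime version matters: the compiled module is tagged for one interpreter and platform. As a rough diagnostic sketch (the helper name check_extension is made up for illustration, and it assumes the old layout with a _fitz file inside the fitz folder), you can compare the files in the deployed package against the suffixes the running interpreter accepts:

```python
import importlib.machinery
import os


def check_extension(package_dir, module_name="_fitz"):
    """Map each compiled file for module_name to whether this interpreter can load it."""
    # A file is importable only if it is exactly module_name + an accepted suffix.
    loadable = {
        module_name + suffix for suffix in importlib.machinery.EXTENSION_SUFFIXES
    }
    report = {}
    for filename in sorted(os.listdir(package_dir)):
        if filename.startswith(module_name + "."):
            report[filename] = filename in loadable
    return report
```

Calling check_extension("/var/task/fitz") from inside the handler and printing the result should show False next to any file built for a different Python version or OS, which would be consistent with the import error in this thread.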
I had a similar problem when I was trying to import some tensorflow probability modules like below:
import tensorflow_probability as tfp
tfp = tfp.substrates.numpy
tfd = tfp.distributions
At least for me, I realized that the problem was not related to Lambda but was a Python circular import error. Have a look at this: https://stackabuse.com/python-circular-imports
Changing the position of the imports solved my issue. Basically I switched the tfp and tfd lines:
import tensorflow_probability as tfp
tfd = tfp.distributions
tfp = tfp.substrates.numpy
I'm stuck on the same problem with a Python SFTP package that requires paramiko. I compiled a package and tested it on a Windows EC2 instance without issue. When I then tried to make it a Lambda to see if I could accomplish the task serverlessly, I got the same error about non-native packages. I'm trying to recreate the CodeCommit/CodeBuild solution, but I'm getting an error with the buildspec.yaml:
Phase context status code: YAML_FILE_ERROR Message: mapping values are not allowed in this context at line 2
My buildspec.yaml is:
version: 0.1

phases:
  install:
    runtime-versions:
      python: 3.8
  pre_build:
    commands:
  build:
    commands:
I'm not certain if the problem is the yaml (it passed a yaml parser) or the contents of my CodeCommit. All I have there is my python script and the yaml document. Does a download of the package need to be there?
TIA, VtR
I'm getting this exact error as well:
Runtime.ImportModuleError: Unable to import module 'lambda': cannot import name '_fitz' from partially initialized module 'fitz' (most likely due to a circular import) (/var/task/fitz/__init__.py)
I'm already using KLayers to generate the Layer.
This is what my zip file contains:
Anyone else who made it work with KLayers, give us more details please.
If you use pip to install a python package locally, which contains compiled code, the package (wheel) that is downloaded may not be compatible with AWS Lambda (in my case it was mac rather than linux). So if you deploy this locally installed file to your lambda this will cause the error, fitz not found, when you run your lambda, even if your code works locally. This will be the case with any binary package, not just fitz.
Lambdas need a Linux-compatible binary. As noted in some of the answers above, the solution is to package the binary and load it as a Lambda layer. This is easy to do in three steps that worked for me:
1) Download the relevant binary (some AWS advice here). Basically you need to unzip the relevant wheel file.
2) Re-package the binary in a zip file with the right structure. See this Stack Overflow answer.
3) Create a layer and upload the zip file using this AWS page.
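The re-packaging step can be done with the standard library. This is a minimal sketch, assuming the wheel has already been unpacked into a local folder; the helper name build_layer_zip and the folder names in the usage line are hypothetical. It places everything under a top-level python/ folder, which is one of the locations Lambda adds to the import path for Python layers.

```python
import os
import zipfile


def build_layer_zip(source_dir, zip_path, prefix="python"):
    """Zip everything under source_dir so it unpacks beneath prefix/ in the layer.

    zip_path should live outside source_dir so the archive does not include itself.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                relative = os.path.relpath(full_path, source_dir)
                # Zip entries use forward slashes, whatever the host OS.
                arcname = "/".join([prefix] + relative.split(os.sep))
                archive.write(full_path, arcname)
    return zip_path


# Usage (placeholder paths):
#   build_layer_zip("unpacked_wheel", "pymupdf_layer.zip")
```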
For anyone else that ends up here with the same problem:
You can use lambda layers to provide the pymupdf dependency, instead of building it into the deployment package. There's already a maintained layer for it, see here: https://github.com/keithrozario/Klayers
That way you don't need to bother with building on amazon linux, just reference the arn of the layer you need in your lambda function creation (region specific) and import as normal.
This answer seems to be the best one so far, and it currently works. To anyone still receiving the error
[ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_function': cannot import name '_fitz' from partially initialized module 'fitz' (most likely due to a circular import) (/var/task/fitz/__init__.py)
while using this solution: the fix is to use the python3.8 Lambda environment. My function definition is as follows:
parse_pdf:
  runtime: python3.8
  handler: pdfparse/pdfparse/handler.pdf_parse
  layers:
    - arn:aws:lambda:${self:provider.region}:770693421928:layer:Klayers-p38-PyMUPDF:1
It works just fine after defining it this way
If you use pip to install a python package locally, which contains compiled code, the package (wheel) that is downloaded may not be compatible with AWS Lambda (in my case it was mac rather than linux). So if you deploy this locally installed file to your lambda this will cause the error, fitz not found, when you run your lambda, even if your code works locally. This will be the case with any binary package, not just fitz.
Lambdas need a linux compatible binary. As noted in some of the answers above the solution is to package the binary and load it as a lambda layers. This is easy to do in three steps that worked for me:
I'm working on mac and it works perfectly. For those who need an example of how to install the package you can try this:
pip install pyMUPDF --upgrade --only-binary=:all: --platform manylinux_2_17_x86_64 --python-version 38
Hey all, @anai-s answer is right on point. I ran this command in the terminal:
pip install \
    --platform manylinux2014_x86_64 \
    --target=/Users/schlank/Documents/Code/pythonlayers/upload/python \
    --implementation cp \
    --python-version 3.9 \
    --only-binary=:all: --upgrade \
    --ignore-installed \
    PyMuPDF
Some additional detail. Make sure you:
Then you're good to go.
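If the install is part of a build script, the same pip flags can be assembled in Python. This is just a sketch: the helper name cross_install_command and the target folder in the usage line are hypothetical, and the flags are the ones shown in the commands above.

```python
import sys


def cross_install_command(package, target, python_version,
                          platform="manylinux2014_x86_64"):
    """Build a pip command that fetches Linux wheels regardless of the host OS."""
    return [
        sys.executable, "-m", "pip", "install",
        "--platform", platform,
        "--target", target,
        "--implementation", "cp",
        "--python-version", python_version,
        "--only-binary=:all:",  # refuse source builds, which would target the host
        "--upgrade",
        package,
    ]


# Usage (placeholder target folder):
#   import subprocess
#   subprocess.run(cross_install_command("PyMuPDF", "layer/python", "3.9"), check=True)
```

Keeping the Python version a parameter makes it harder to forget to match it to the Lambda runtime.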
For those trying to get fitz working for Lambda using the python3.9 runtime, and you're on an M1 Mac... Try installing them with docker. This worked for me:
I put them in a folder named requirements/python. Then I zip that up for the layer.
mkdir -p requirements/python;
docker run \
-v "$(pwd)":/var/task "public.ecr.aws/sam/build-python3.9" \
/bin/sh -c "yum install -y mysql-devel && \
pip install -r requirements.txt --only-binary=:all: --platform manylinux_2_17_x86_64 -t requirements/python; \
exit";
I'm encountering an issue while attempting to add the Fitz library (PyMuPDF) to a Lambda layer. The error message I'm getting is:
{
  "errorMessage": "Unable to import module 'lambda_function': cannot import name '_fitz' from partially initialized module 'fitz' (most likely due to a circular import) (/opt/python/lib/python3.11/site-packages/fitz/__init__.py)",
  "errorType": "Runtime.ImportModuleError",
}
Function Logs
[ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_function': cannot import name '_fitz' from partially initialized module 'fitz' (most likely due to a circular import) (/opt/python/lib/python3.11/site-packages/fitz/__init__.py)
I'm seeking guidance on successfully utilizing the Fitz library within a Lambda function. Below is the Lambda code snippet:
import fitz


def lambda_handler(event, context):
    try:
        print("Hello World")
    except Exception as e:
        print(f"Error in fileToTextract: {str(e)}")
Is there anyone who has successfully integrated this library into their Lambda and can offer advice on resolving this issue?
Not sure it will help everyone, but this works for me with the latest PyMuPDF (1.3.23) and Python 3.12:
import fitz_old as fitz
I tracked my problems to the recent rebase in PyMuPDF, and using the pre-rebased version worked like a charm for me.
It's not the best solution but works for now.
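If the same code has to run against PyMuPDF releases with and without fitz_old, the import can be wrapped in a fallback. This is a sketch only: the helper name import_first_available is made up for illustration, and it assumes the failure surfaces as an ImportError, as in the error messages quoted in this thread.

```python
import importlib


def import_first_available(module_names):
    """Return the first module in module_names that imports cleanly."""
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Remember why each candidate failed, for the final message.
            errors.append(f"{name}: {exc}")
    raise ImportError(
        "none of the candidates could be imported: " + "; ".join(errors)
    )


# Usage: try the regular module first, then the pre-rebase one.
#   fitz = import_first_available(["fitz", "fitz_old"])
```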
Worked for me when I added it as a layer. As mentioned before, the key is to install a compatible version and use the correct path in your .zip file.
pip install \
--platform manylinux2014_x86_64 \
--target=./python/lib/python3.12/site-packages \
--implementation cp \
--python-version 3.12 \
--only-binary=:all: --upgrade \
pymupdf
Then zip the code to upload as a layer. The file structure will look like:
my_layer.zip
└── python/
    └── lib/
        └── python3.12/
            └── site-packages/
                └── fitz/
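Before uploading, it can help to confirm that the zip really has this layout. A small sketch using only the standard library; the helper name find_package_roots is hypothetical:

```python
import zipfile


def find_package_roots(zip_path, package="fitz"):
    """Return the folders inside the zip that directly contain package/."""
    roots = set()
    with zipfile.ZipFile(zip_path) as archive:
        for entry in archive.namelist():
            parts = entry.split("/")
            # Only look at folder components, not the final file name.
            if package in parts[:-1]:
                roots.add("/".join(parts[: parts.index(package)]))
    return sorted(roots)
```

For the structure above, the expected result is a single root, python/lib/python3.12/site-packages. An empty string in the result would mean fitz sits at the top level of the zip, which is the layout for a deployment package rather than a layer.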
No need for a layer, no need for CodeBuild! This problem exists because fitz is written in C/C++ with Python bindings, so its binaries are built for a specific architecture and OS. We therefore need to build our deployment_package.zip specifying all of these. Here is an article about it. I am sure it will solve all the pain that I've seen above 😄 https://medium.com/@jayshwor.khadka/lambda-deployment-package-with-dependencies-and-local-built-distribution-wheels-with-different-affe82b982fa
Thanks, it works for me :)
@jacksonkasi1 Glad the solution proposed worked for you :D !! 👍
I am trying to develop a PDF-to-image serverless function on AWS Lambda. The import statement is
import fitz
However, I got the following error when triggering the Lambda function.
[ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_function': cannot import name '_fitz' from partially initialized module 'fitz' (most likely due to a circular import) (/var/task/fitz/__init__.py)
Thank you very much! I am using Python 3.8 and PyMuPDF-1.16.10.