How is your dependencies.zip file built? Does it contain the native code extension modules?
I directly bundled the files in /home/hadoop/.local/lib/python3.7/site-packages.
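(For reference, a bundle like that is typically produced with something along these lines; a minimal sketch using shutil and the path quoted above. Note that it sweeps the compiled *.so files into the archive along with the .py files.)

import shutil

# Pack the user site-packages directory into dependencies.zip.
# This includes compiled extension modules (*.so), not just pure-Python files.
shutil.make_archive(
    "dependencies",
    "zip",
    "/home/hadoop/.local/lib/python3.7/site-packages",
)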
Looking closely at the exception message, I see that the path contains /dependencies.zip/cryptography/hazmat/primitives/padding.py. This tells me that Spark loads Python files directly out of the .zip. Unfortunately extension modules cannot be loaded out of a .zip file, so this can't work. You'll need to find out if Spark has some other way of specifying files that supports extension modules.
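(This limitation can be reproduced outside of Spark. A minimal sketch; the extension-module name is an example from cryptography's bindings package:)

import sys

# Pure-Python modules inside the zip import fine via zipimport...
sys.path.insert(0, "dependencies.zip")
import cryptography  # works: __init__.py is read out of the archive

# ...but a compiled extension module cannot be executed from inside a
# zip archive, so this raises ImportError.
import cryptography.hazmat.bindings._openssl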
Spark will unzip the zip file and then load the Python modules, just like this:
[hadoop@ip-172-31-21-108 ~]$ pyspark --py-files dependencies.zip
Python 3.7.10 (default, Jun 3 2021, 00:02:01)
[GCC 7.3.1 20180712 (Red Hat 7.3.1-13)] on linux
Type "help", "copyright", "credits" or "license" for more information.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/spark/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/share/aws/emr/emrfs/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/share/aws/redshift/jdbc/redshift-jdbc42-1.2.37.1061.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/03/16 11:39:11 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /__ / .__/\_,_/_/ /_/\_\   version 3.1.2-amzn-1
      /_/
Using Python version 3.7.10 (default, Jun 3 2021 00:02:01)
Spark context Web UI available at http://ip-172-31-21-108.cn-north-1.compute.internal:4040
Spark context available as 'sc' (master = yarn, app id = application_1647323952412_0014).
SparkSession available as 'spark'.
>>> from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
>>>
That import is pure Python; pure Python files can be imported directly from .zip files, but extension modules cannot.
Are Cipher, algorithms, and modes pure Python code? They don't call other extension modules, do they?
Those imports are pure Python; actually using them to do encryption/decryption will call into extension modules.
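(To make the distinction concrete, a minimal sketch; the key and IV are throwaway values:)

import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(32)
iv = os.urandom(16)

# The import succeeds from the zip because these names are pure Python.
# Constructing and using the cipher is what calls into the compiled
# OpenSSL bindings, and that is the step that fails under --py-files.
encryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
ciphertext = encryptor.update(b"sixteen byte msg") + encryptor.finalize()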
Thank you for your detailed analysis. cryptography is excellent; I use it to mask sensitive data.
Python: 3.7.10, pip: 20.2.2, cryptography: 36.0.2, Spark: 3.1.2-amzn-1
If I load cryptography via --py-files, the error above is raised.
The dependencies.zip was bundled from /home/hadoop/.local/lib/python3.7/site-packages/.
But this method was able to load other submodules.
If I install cryptography directly on the EMR platform, it works well.
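For anyone who lands here later: Spark's "Python Package Management" documentation describes shipping a whole virtualenv as an archive (e.g. built with venv-pack) instead of using --py-files, and that route does work for packages with extension modules. A sketch, with the archive name and alias being illustrative:

import os
from pyspark.sql import SparkSession

# The packed venv is distributed to the executors and unpacked under
# the "#environment" alias; PYSPARK_PYTHON points at its interpreter.
os.environ["PYSPARK_PYTHON"] = "./environment/bin/python"
spark = (
    SparkSession.builder
    .config("spark.archives", "pyspark_venv.tar.gz#environment")
    .getOrCreate()
)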