Le-Zheng opened 2 years ago
Sample spark submit :
/opt/spark/bin/spark-submit \
--master ${RUNTIME_SPARK_MASTER} \
--deploy-mode cluster \
--name simplequery \
--conf spark.driver.memory=20g \
--conf spark.executor.cores=16 \
--conf spark.executor.memory=20g \
--conf spark.executor.instances=1 \
--conf spark.cores.max=16 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.container.image=${RUNTIME_K8S_SPARK_IMAGE} \
--conf spark.kubernetes.executor.deleteOnTermination=false \
--conf spark.network.timeout=10000000 \
--conf spark.executor.heartbeatInterval=10000000 \
--conf spark.python.use.daemon=false \
--conf spark.python.worker.reuse=false \
--conf spark.kubernetes.executor.podTemplateFile=/ppml/trusted-big-data-ml/spark-executor-template.yaml \
--conf spark.kubernetes.driver.podTemplateFile=/ppml/trusted-big-data-ml/spark-executor-template.yaml \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.nfsvolumeclaim.options.claimName=nfsvolumeclaim \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.nfsvolumeclaim.mount.path=/bigdl2.0/data \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.nfsvolumeclaim.options.claimName=nfsvolumeclaim \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.nfsvolumeclaim.mount.path=/bigdl2.0/data \
--conf spark.authenticate=true \
--conf spark.authenticate.secret=intel@123 \
--conf spark.kubernetes.executor.secretKeyRef.SPARK_AUTHENTICATE_SECRET="spark-secret:secret" \
--conf spark.kubernetes.driver.secretKeyRef.SPARK_AUTHENTICATE_SECRET="spark-secret:secret" \
--conf spark.authenticate.enableSaslEncryption=true \
--conf spark.network.crypto.enabled=true \
--conf spark.network.crypto.keyLength=128 \
--conf spark.network.crypto.keyFactoryAlgorithm=PBKDF2WithHmacSHA1 \
--conf spark.io.encryption.enabled=true \
--conf spark.io.encryption.keySizeBits=128 \
--conf spark.io.encryption.keygen.algorithm=HmacSHA1 \
--conf spark.ssl.enabled=true \
--conf spark.ssl.port=8043 \
--conf spark.ssl.keyPassword=$secure_password \
--conf spark.ssl.keyStore=/bigdl2.0/data/keystore.jks \
--conf spark.ssl.keyStorePassword=$secure_password \
--conf spark.ssl.keyStoreType=JKS \
--conf spark.ssl.trustStore=/bigdl2.0/data/keystore.jks \
--conf spark.ssl.trustStorePassword=intel@123 \
--conf spark.ssl.trustStoreType=JKS \
--class com.intel.analytics.bigdl.ppml.examples.SimpleQuerySparkExample \
--conf spark.driver.extraClassPath=local:///bigdl2.0/data/ppml/bigdl-ppml-spark_3.1.2-2.1.0-20220612.193825-116-jar-with-dependencies.jar \
--conf spark.executor.extraClassPath=local:///bigdl2.0/data/ppml/bigdl-ppml-spark_3.1.2-2.1.0-20220612.193825-116-jar-with-dependencies.jar \
--jars local:///bigdl2.0/data/ppml/bigdl-ppml-spark_3.1.2-2.1.0-20220612.193825-116-jar-with-dependencies.jar \
local:///bigdl2.0/data/ppml/bigdl-ppml-spark_3.1.2-2.1.0-20220612.193825-116-jar-with-dependencies.jar \
--inputPath /bigdl2.0/data/ppml/people/encrypted \
--outputPath /bigdl2.0/data/ppml/people/people_encrypted_output \
--inputPartitionNum 16 \
--outputPartitionNum 16 \
--inputEncryptModeValue AES/CBC/PKCS5Padding \
--outputEncryptModeValue AES/CBC/PKCS5Padding \
--primaryKeyPath /bigdl2.0/data/ppml/20line_data_keys/primaryKey \
--dataKeyPath /bigdl2.0/data/ppml/20line_data_keys/dataKey \
--kmsType SimpleKeyManagementService \
--simpleAPPID 165172133285
When we replace bigdl-ppml-spark_3.1.2-2.1.0-20220612.193825-116-jar-with-dependencies.jar
with bigdl-ppml-spark_3.1.2-2.1.0-20220907.120744-222-jar-with-dependencies.jar,
the above issue occurs.
I just got the same error. Here is my script:
rm -rf /ppml/trusted-big-data-ml/work/data/shansimu/simplequery/people_encrypted_output && \
export mode=client && \
secure_password=`openssl rsautl -inkey /ppml/trusted-big-data-ml/work/password/key.txt -decrypt </ppml/trusted-big-data-ml/work/password/output.bin` && \
export TF_MKL_ALLOC_MAX_BYTES=10737418240 && \
export SPARK_LOCAL_IP=$LOCAL_IP && \
./clean.sh
gramine-argv-serializer bash -c "/opt/jdk8/bin/java \
-cp '/ppml/trusted-big-data-ml/work/data/shansimu/ppml-e2e-examples/spark-encrypt-io/target/spark-encrypt-io-0.3.0-SNAPSHOT.jar:/ppml/trusted-big-data-ml/work/spark-3.1.2/examples/jars/scopt_2.12-3.7.1.jar:/ppml/trusted-big-data-ml/work/spark-3.1.2/conf/:/ppml/trusted-big-data-ml/work/spark-3.1.2/jars/*:/ppml/trusted-big-data-ml/work/bigdl-2.1.0-SNAPSHOT/jars/*' \
-Xmx8g \
org.apache.spark.deploy.SparkSubmit \
--master $RUNTIME_SPARK_MASTER \
--deploy-mode cluster \
--name spark-simplequery-sgx \
--conf spark.driver.host=$LOCAL_IP \
--conf spark.driver.port=54321 \
--conf spark.driver.memory=32g \
--conf spark.executor.cores=8 \
--conf spark.executor.memory=32g \
--conf spark.executor.instances=2 \
--conf spark.cores.max=32 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.container.image=$RUNTIME_K8S_SPARK_IMAGE \
--conf spark.kubernetes.driver.podTemplateFile=/ppml/trusted-big-data-ml/spark-driver-template.yaml \
--conf spark.kubernetes.executor.podTemplateFile=/ppml/trusted-big-data-ml/spark-executor-template.yaml \
--conf spark.kubernetes.executor.deleteOnTermination=false \
--conf spark.network.timeout=10000000 \
--conf spark.executor.heartbeatInterval=10000000 \
--conf spark.python.use.daemon=false \
--conf spark.python.worker.reuse=false \
--conf spark.kubernetes.sgx.enabled=true \
--conf spark.kubernetes.sgx.driver.mem=64g \
--conf spark.kubernetes.sgx.driver.jvm.mem=12g \
--conf spark.kubernetes.sgx.executor.mem=64g \
--conf spark.kubernetes.sgx.executor.jvm.mem=12g \
--conf spark.kubernetes.sgx.log.level=error \
--conf spark.authenticate=true \
--conf spark.authenticate.secret=$secure_password \
--conf spark.kubernetes.executor.secretKeyRef.SPARK_AUTHENTICATE_SECRET="spark-secret:secret" \
--conf spark.kubernetes.driver.secretKeyRef.SPARK_AUTHENTICATE_SECRET="spark-secret:secret" \
--conf spark.authenticate.enableSaslEncryption=true \
--conf spark.network.crypto.enabled=true \
--conf spark.network.crypto.keyLength=128 \
--conf spark.network.crypto.keyFactoryAlgorithm=PBKDF2WithHmacSHA1 \
--conf spark.io.encryption.enabled=true \
--conf spark.io.encryption.keySizeBits=128 \
--conf spark.io.encryption.keygen.algorithm=HmacSHA1 \
--conf spark.ssl.enabled=true \
--conf spark.ssl.port=8043 \
--conf spark.ssl.keyPassword=$secure_password \
--conf spark.ssl.keyStore=/ppml/trusted-big-data-ml/work/keys/keystore.jks \
--conf spark.ssl.keyStorePassword=$secure_password \
--conf spark.ssl.keyStoreType=JKS \
--conf spark.ssl.trustStore=/ppml/trusted-big-data-ml/work/keys/keystore.jks \
--conf spark.ssl.trustStorePassword=$secure_password \
--conf spark.ssl.trustStoreType=JKS \
--conf spark.driver.extraClassPath=/ppml/trusted-big-data-ml/work/bigdl-2.1.0-SNAPSHOT/jars/*:/ppml/trusted-big-data-ml/work/spark-3.1.2/examples/jars/* \
--conf spark.executor.extraClassPath=/ppml/trusted-big-data-ml/work/bigdl-2.1.0-SNAPSHOT/jars/*:/ppml/trusted-big-data-ml/work/spark-3.1.2/examples/jars/* \
--class com.intel.analytics.bigdl.ppml.examples.SimpleQuerySparkExample \
--verbose \
--jars local:///ppml/trusted-big-data-ml/work/bigdl-2.1.0-SNAPSHOT/jars/bigdl-ppml-spark_3.1.2-2.1.0-SNAPSHOT.jar \
local:///ppml/trusted-big-data-ml/work/bigdl-2.1.0-SNAPSHOT/jars/bigdl-ppml-spark_3.1.2-2.1.0-SNAPSHOT.jar \
--inputPath /ppml/trusted-big-data-ml/work/data/shansimu/simplequery/people_encrypted \
--outputPath /ppml/trusted-big-data-ml/work/data/shansimu/simplequery/people_encrypted_output \
--inputPartitionNum 8 \
--outputPartitionNum 8 \
--inputEncryptModeValue AES/CBC/PKCS5Padding \
--outputEncryptModeValue AES/CBC/PKCS5Padding \
--primaryKeyPath /ppml/trusted-big-data-ml/work/data/shansimu/simplequery/keys/primaryKey \
--dataKeyPath /ppml/trusted-big-data-ml/work/data/shansimu/simplequery/keys/dataKey \
--kmsType SimpleKeyManagementService \
--simpleAPPID 947536384638 \
--simpleAPPKEY 884926981201" > /ppml/trusted-big-data-ml/secured_argvs
./init.sh
gramine-sgx bash 2>&1 | tee query-client-simple.log
This may be due to the jar package path. Here is my earlier script, which did not hit this error:
rm -rf /ppml/trusted-big-data-ml/work/data/shansimu/simplequery/people_encrypted_output && \
export mode=client && \
secure_password=`openssl rsautl -inkey /ppml/trusted-big-data-ml/work/password/key.txt -decrypt </ppml/trusted-big-data-ml/work/password/output.bin` && \
export TF_MKL_ALLOC_MAX_BYTES=10737418240 && \
export SPARK_LOCAL_IP=$LOCAL_IP && \
./clean.sh
gramine-argv-serializer bash -c "/opt/jdk8/bin/java \
-cp '/ppml/trusted-big-data-ml/work/data/shansimu/ppml-e2e-examples/spark-encrypt-io/target/spark-encrypt-io-0.3.0-SNAPSHOT.jar:/ppml/trusted-big-data-ml/work/spark-3.1.2/examples/jars/scopt_2.12-3.7.1.jar:/ppml/trusted-big-data-ml/work/spark-3.1.2/conf/:/ppml/trusted-big-data-ml/work/spark-3.1.2/jars/*:/ppml/trusted-big-data-ml/work/bigdl-2.1.0-SNAPSHOT/jars/*' \
-Xmx8g \
org.apache.spark.deploy.SparkSubmit \
--master $RUNTIME_SPARK_MASTER \
--deploy-mode client \
--name spark-simplequery-sgx \
--conf spark.driver.host=$LOCAL_IP \
--conf spark.driver.port=54321 \
--conf spark.driver.memory=32g \
--conf spark.executor.cores=8 \
--conf spark.executor.memory=32g \
--conf spark.executor.instances=2 \
--conf spark.cores.max=32 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.container.image=$RUNTIME_K8S_SPARK_IMAGE \
--conf spark.kubernetes.driver.podTemplateFile=/ppml/trusted-big-data-ml/spark-driver-template.yaml \
--conf spark.kubernetes.executor.podTemplateFile=/ppml/trusted-big-data-ml/spark-executor-template.yaml \
--conf spark.kubernetes.executor.deleteOnTermination=false \
--conf spark.network.timeout=10000000 \
--conf spark.executor.heartbeatInterval=10000000 \
--conf spark.python.use.daemon=false \
--conf spark.python.worker.reuse=false \
--conf spark.kubernetes.sgx.enabled=true \
--conf spark.kubernetes.sgx.executor.mem=64g \
--conf spark.kubernetes.sgx.executor.jvm.mem=12g \
--conf spark.kubernetes.sgx.log.level=error \
--conf spark.authenticate=true \
--conf spark.authenticate.secret=$secure_password \
--conf spark.kubernetes.executor.secretKeyRef.SPARK_AUTHENTICATE_SECRET="spark-secret:secret" \
--conf spark.kubernetes.driver.secretKeyRef.SPARK_AUTHENTICATE_SECRET="spark-secret:secret" \
--conf spark.authenticate.enableSaslEncryption=true \
--conf spark.network.crypto.enabled=true \
--conf spark.network.crypto.keyLength=128 \
--conf spark.network.crypto.keyFactoryAlgorithm=PBKDF2WithHmacSHA1 \
--conf spark.io.encryption.enabled=true \
--conf spark.io.encryption.keySizeBits=128 \
--conf spark.io.encryption.keygen.algorithm=HmacSHA1 \
--conf spark.ssl.enabled=true \
--conf spark.ssl.port=8043 \
--conf spark.ssl.keyPassword=$secure_password \
--conf spark.ssl.keyStore=/ppml/trusted-big-data-ml/work/keys/keystore.jks \
--conf spark.ssl.keyStorePassword=$secure_password \
--conf spark.ssl.keyStoreType=JKS \
--conf spark.ssl.trustStore=/ppml/trusted-big-data-ml/work/keys/keystore.jks \
--conf spark.ssl.trustStorePassword=$secure_password \
--conf spark.ssl.trustStoreType=JKS \
--class com.intel.analytics.bigdl.ppml.examples.SimpleQuerySparkExample \
--verbose \
--jars local:///ppml/trusted-big-data-ml/work/data/shansimu/ppml-e2e-examples/spark-encrypt-io/target/spark-encrypt-io-0.3.0-SNAPSHOT.jar \
local:///ppml/trusted-big-data-ml/work/data/shansimu/ppml-e2e-examples/spark-encrypt-io/target/spark-encrypt-io-0.3.0-SNAPSHOT.jar \
--inputPath /ppml/trusted-big-data-ml/work/data/shansimu/simplequery/people_encrypted \
--outputPath /ppml/trusted-big-data-ml/work/data/shansimu/simplequery/people_encrypted_output \
--inputPartitionNum 8 \
--outputPartitionNum 8 \
--inputEncryptModeValue AES/CBC/PKCS5Padding \
--outputEncryptModeValue AES/CBC/PKCS5Padding \
--primaryKeyPath /ppml/trusted-big-data-ml/work/data/shansimu/simplequery/keys/primaryKey \
--dataKeyPath /ppml/trusted-big-data-ml/work/data/shansimu/simplequery/keys/dataKey \
--kmsType SimpleKeyManagementService \
--simpleAPPID 947536384638 \
--simpleAPPKEY 884926981201" > /ppml/trusted-big-data-ml/secured_argvs
./init.sh
gramine-sgx bash 2>&1 | tee spark-simplequery-sgx-driver-on-sgx.log
It seems like the encrypted file and the encryption keys do not match; please try to generate a new encrypted file with your current primaryKey and dataKey.
Thanks @PatrickkZ. It turned out to be a problem with my script.
This error occurs because the encrypted file does not get decrypted. The encrypted file name now has to end with .cbc;
this extension triggers the decryption process, so rename your encrypted file, for example from people.csv
to people.csv.cbc.
Note that the required input file name in SimpleQuerySparkExample is hard-coded to people.csv,
so you also need to modify SimpleQuerySparkExample's code.
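The rename step above can be sketched as a small shell loop. This is a hedged example: DATA_DIR is a placeholder for your own encrypted-input directory (the paths in the scripts above would be something like .../simplequery/people_encrypted), and the single touched file only stands in for real encrypted data.

```shell
# Sketch: append the .cbc suffix that triggers BigDL PPML's decryption path.
# DATA_DIR is a placeholder; point it at your encrypted input directory.
DATA_DIR="${DATA_DIR:-/tmp/people_encrypted_demo}"
mkdir -p "$DATA_DIR"
touch "$DATA_DIR/people.csv"   # stand-in for an encrypted file

# Rename every *.csv to *.csv.cbc so the decrypt process is applied on read.
for f in "$DATA_DIR"/*.csv; do
  mv "$f" "$f.cbc"
done

ls "$DATA_DIR"
```

Remember to update the hard-coded people.csv name in SimpleQuerySparkExample accordingly before resubmitting the job.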
Error of running SimpleQuery example with bigdl-ppml-spark_3.1.2-2.1.0-20220907.120744-222-jar-with-dependencies.jar