intel-analytics / analytics-zoo

Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray
https://analytics-zoo.readthedocs.io/
Apache License 2.0

PPML document needs to be refined and corrected #115

Open jenniew opened 3 years ago

jenniew commented 3 years ago

The PPML user guide is not clear and is missing some essential information, and there are errors in the scripts. For example, section 2.1 runs ./ppml/scripts/generate-keys.sh, which invokes keytool, but nothing explains how to install keytool or run that command. When running ./ppml/scripts/generate-keys.sh, it also prompts for several passwords, but it is not clear what these passwords are for or which of them must match. Finally, ./ppml/scripts/generate-keys.sh fails with:

base64: ./keys/keystore.jks: No such file or directory
base64: ./keys/keystore.pkcs12: No such file or directory
base64: ./keys/server.pem: No such file or directory
base64: ./keys/server.crt: No such file or directory
base64: ./keys/server.csr: No such file or directory
base64: ./keys/server.key: No such file or directory
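On the keytool point, the doc could say something like the following (an illustrative sketch for an Ubuntu/Debian host; keytool ships with the JDK, so installing any JDK makes it available):

```bash
# keytool is bundled with the JDK; install one if 'keytool' is not on PATH
# (illustrative for Ubuntu/Debian; adjust the package name for other distros)
sudo apt-get update
sudo apt-get install -y openjdk-8-jdk
keytool -help                      # verify keytool is now available
./ppml/scripts/generate-keys.sh    # then generate the keys
```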

2.3.1. Run ./build-docker-image.sh: if no proxy setting is needed, the script fails. The doc should describe how to run it when no proxy is needed; see the sketch below.
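For example, the doc (or the script itself) could guard the proxy arguments along these lines (the variable names here are only illustrative, not necessarily what build-docker-image.sh actually uses):

```bash
# Illustrative guard: only pass proxy build-args when a proxy is configured.
PROXY_ARGS=""
if [ -n "${HTTP_PROXY:-}" ]; then
  PROXY_ARGS="--build-arg http_proxy=${HTTP_PROXY} --build-arg https_proxy=${HTTPS_PROXY:-}"
fi
docker build ${PROXY_ARGS} \
  -t intelanalytics/analytics-zoo-ppml-trusted-big-data-ml-python-graphene:0.11-SNAPSHOT .
```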

2.3.2.1. cp -r ../keys . cannot be executed because the directory is not correct; it should be cp -r ../../../../keys .

To start the container, the doc says to first modify the paths in deploy-local-spark-sgx.sh, but there is no information about what these environment variables should be set to:

export ENCLAVE_KEY_PATH=YOUR_LOCAL_ENCLAVE_KEY_PATH
export DATA_PATH=YOUR_LOCAL_DATA_PATH
export KEYS_PATH=YOUR_LOCAL_KEYS_PATH
export LOCAL_IP=YOUR_LOCAL_IP
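My guess (not confirmed anywhere in the doc) is that these point to the enclave signing key, a host data directory to mount, the keys directory generated earlier, and the host IP, e.g.:

```bash
# Guessed example values; the doc should state what each variable really means.
export ENCLAVE_KEY_PATH=/home/user/ppml/enclave-key.pem   # enclave signing key (assumed)
export DATA_PATH=/home/user/ppml/data                     # host data directory mounted into the container (assumed)
export KEYS_PATH=/home/user/ppml/keys                     # keys produced by generate-keys.sh (assumed)
export LOCAL_IP=192.168.0.112                             # this host's IP address
```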

Running ./deploy-local-spark-sgx.sh fails with this error: Unable to find image '10.239.45.10/arda/intelanalytics/analytics-zoo-ppml-trusted-big-data-ml-python-graphene:0.11-SNAPSHOT' locally

I haven't finished all the steps yet, but there are already so many unclear things and errors that the guide is hard to follow. Please check all the steps and fix them.

glorysdj commented 3 years ago

@ManfeiBai please help to fix the issues.

ManfeiBai commented 3 years ago

OK, I will fix it now.

ManfeiBai commented 3 years ago

The following error has been fixed in the new PR (https://github.com/intel-analytics/analytics-zoo/pull/4617) and the path has been updated; the other errors are still being worked on: "When running ./ppml/scripts/generate-keys.sh, it gets these errors: base64: ./keys/keystore.jks: No such file or directory, base64: ./keys/keystore.pkcs12: No such file or directory, base64: ./keys/server.pem: No such file or directory, base64: ./keys/server.crt: No such file or directory, base64: ./keys/server.csr: No such file or directory, base64: ./keys/server.key: No such file or directory."

jenniew commented 3 years ago

When run "bash work/start-scripts/start-spark-local-sql-sgx.sh", get this error: py4j.protocol.Py4JJavaError: An error occurred while calling o32.json. : org.apache.spark.sql.AnalysisException: Path does not exist: file:/examples/src/main/resources/people.json; at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary$1.apply(DataSource.scala:558) at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary$1.apply(DataSource.scala:545) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) at scala.collection.immutable.List.foreach(List.scala:392) at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) at scala.collection.immutable.List.flatMap(List.scala:355) at org.apache.spark.sql.execution.datasources.DataSource.org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary(DataSource.scala:545) at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:359) at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211) at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:392) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:282) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:238) at java.lang.Thread.run(Thread.java:748)

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/ppml/trusted-big-data-ml/work/spark-2.4.6/examples/src/main/python/sql/basic.py", line 211, in basic_df_example(spark) File "/ppml/trusted-big-data-ml/work/spark-2.4.6/examples/src/main/python/sql/basic.py", line 42, in basic_df_example df = spark.read.json("examples/src/main/resources/people.json") File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 274, in json File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line

jenniew commented 3 years ago

When run "bash work/start-scripts/start-spark-local-sql-sgx.sh", also get this error: 21/09/01 22:24:33 INFO DAGScheduler: Job 13 failed: runJob at PythonRDD.scala:153, took 453.136422 s Traceback (most recent call last): File "/ppml/trusted-big-data-ml/work/spark-2.4.6/examples/src/main/python/sql/basic.py", line 212, in schema_inference_example(spark) File "/ppml/trusted-big-data-ml/work/spark-2.4.6/examples/src/main/python/sql/basic.py", line 152, in schema_inference_example schemaPeople = spark.createDataFrame(people) File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/session.py", line 746, in createDataFrame File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/session.py", line 390, in _createFromRDD File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/session.py", line 361, in _inferSchema File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/rdd.py", line 1390, in first File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/rdd.py", line 1372, in take File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/context.py", line 1069, in runJob File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in call File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 18.0 failed 1 times, most recent failure: Lost task 0.0 in stage 18.0 (TID 209, localhost, executor driver): java.net.SocketException: Broken pipe (Write failed) at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111) at java.net.SocketOutputStream.write(SocketOutputStream.java:134) at java.io.DataOutputStream.writeInt(DataOutputStream.java:198) at org.apache.spark.security.SocketAuthHelper.writeUtf8(SocketAuthHelper.scala:112) at org.apache.spark.security.SocketAuthHelper.authToServer(SocketAuthHelper.scala:86) at org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:115) at org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:133) at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:125) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:95) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:109) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346) at org.apache.spark.rdd.RDD.iterator(RDD.scala:310) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:123) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)

glorysdj commented 3 years ago

When run "bash work/start-scripts/start-spark-local-sql-sgx.sh", get this error: py4j.protocol.Py4JJavaError: An error occurred while calling o32.json. : org.apache.spark.sql.AnalysisException: Path does not exist: file:/examples/src/main/resources/people.json; at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary$1.apply(DataSource.scala:558) at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary$1.apply(DataSource.scala:545) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) at scala.collection.immutable.List.foreach(List.scala:392) at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) at scala.collection.immutable.List.flatMap(List.scala:355) at org.apache.spark.sql.execution.datasources.DataSource.org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary(DataSource.scala:545) at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:359) at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211) at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:392) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:282) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:238) at java.lang.Thread.run(Thread.java:748)

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/ppml/trusted-big-data-ml/work/spark-2.4.6/examples/src/main/python/sql/basic.py", line 211, in basic_df_example(spark) File "/ppml/trusted-big-data-ml/work/spark-2.4.6/examples/src/main/python/sql/basic.py", line 42, in basic_df_example df = spark.read.json("examples/src/main/resources/people.json") File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 274, in json File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line

The path does not exist; please correct the file path. See the example below.
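For example (a guess based on the paths in the trace), inside the container you could point the example at the absolute path of people.json:

```bash
# Possible workaround (a guess): replace the relative path in the SQL example with
# the absolute path of people.json inside the container, then rerun the script.
sed -i 's#"examples/src/main/resources/people.json"#"/ppml/trusted-big-data-ml/work/spark-2.4.6/examples/src/main/resources/people.json"#' \
  /ppml/trusted-big-data-ml/work/spark-2.4.6/examples/src/main/python/sql/basic.py
bash work/start-scripts/start-spark-local-sql-sgx.sh
```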

glorysdj commented 3 years ago

When run "bash work/start-scripts/start-spark-local-sql-sgx.sh", also get this error: 21/09/01 22:24:33 INFO DAGScheduler: Job 13 failed: runJob at PythonRDD.scala:153, took 453.136422 s Traceback (most recent call last): File "/ppml/trusted-big-data-ml/work/spark-2.4.6/examples/src/main/python/sql/basic.py", line 212, in schema_inference_example(spark) File "/ppml/trusted-big-data-ml/work/spark-2.4.6/examples/src/main/python/sql/basic.py", line 152, in schema_inference_example schemaPeople = spark.createDataFrame(people) File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/session.py", line 746, in createDataFrame File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/session.py", line 390, in _createFromRDD File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/session.py", line 361, in _inferSchema File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/rdd.py", line 1390, in first File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/rdd.py", line 1372, in take File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/context.py", line 1069, in runJob File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in call File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 18.0 failed 1 times, most recent failure: Lost task 0.0 in stage 18.0 (TID 209, localhost, executor driver): java.net.SocketException: Broken pipe (Write failed) at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111) at java.net.SocketOutputStream.write(SocketOutputStream.java:134) at java.io.DataOutputStream.writeInt(DataOutputStream.java:198) at org.apache.spark.security.SocketAuthHelper.writeUtf8(SocketAuthHelper.scala:112) at org.apache.spark.security.SocketAuthHelper.authToServer(SocketAuthHelper.scala:86) at org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:115) at org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:133) at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:125) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:95) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:109) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346) at org.apache.spark.rdd.RDD.iterator(RDD.scala:310) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:123) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)

Please fix the previous issue first, then try again. Also, did you pass the java -version test on SGX?

jenniew commented 3 years ago

When run "bash work/start-scripts/start-spark-local-sql-sgx.sh", also get this error: 21/09/01 22:24:33 INFO DAGScheduler: Job 13 failed: runJob at PythonRDD.scala:153, took 453.136422 s Traceback (most recent call last): File "/ppml/trusted-big-data-ml/work/spark-2.4.6/examples/src/main/python/sql/basic.py", line 212, in schema_inference_example(spark) File "/ppml/trusted-big-data-ml/work/spark-2.4.6/examples/src/main/python/sql/basic.py", line 152, in schema_inference_example schemaPeople = spark.createDataFrame(people) File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/session.py", line 746, in createDataFrame File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/session.py", line 390, in _createFromRDD File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/session.py", line 361, in _inferSchema File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/rdd.py", line 1390, in first File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/rdd.py", line 1372, in take File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/context.py", line 1069, in runJob File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in call File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco File "/ppml/trusted-big-data-ml/work/spark-2.4.6/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 18.0 failed 1 times, most recent failure: Lost task 0.0 in stage 18.0 (TID 209, localhost, executor driver): java.net.SocketException: Broken pipe (Write failed) at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111) at java.net.SocketOutputStream.write(SocketOutputStream.java:134) at java.io.DataOutputStream.writeInt(DataOutputStream.java:198) at org.apache.spark.security.SocketAuthHelper.writeUtf8(SocketAuthHelper.scala:112) at org.apache.spark.security.SocketAuthHelper.authToServer(SocketAuthHelper.scala:86) at org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:115) at org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:133) at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:125) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:95) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:109) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346) at org.apache.spark.rdd.RDD.iterator(RDD.scala:310) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:123) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)

please fix the previous one? then try it again and did you pass the java -version test on SGX?

I fixed the previous path issue manually and tried the script again; that is when I got this error. The first test, "basic_df_example", passed; the second test, "schema_inference_example", failed with the error above. And yes, the java -version test passed.

ManfeiBai commented 3 years ago

This issue has been fixed in the new PR (https://github.com/intel-analytics/analytics-zoo/pull/4627) by adding interactive prompts and hints for the proxy settings: "2.3.1. Run ./build-docker-image.sh: if no proxy setting is needed, the script fails. The doc should describe how to run it when no proxy is needed."

This issue has been fixed in the new PR (https://github.com/intel-analytics/analytics-zoo/pull/4627) and the updated doc: "2.3.2.1. cp -r ../keys . cannot be executed because the directory is not correct; it should be cp -r ../../../../keys ."

This issue has been fixed in the new PR (https://github.com/intel-analytics/analytics-zoo/pull/4627) and the updated doc: "To start the container, first modify the paths in deploy-local-spark-sgx.sh. No information about what should be set for ENCLAVE_KEY_PATH, DATA_PATH, KEYS_PATH, and LOCAL_IP."

In the newest analytics-zoo version, the images are published under "intelanalytics/" rather than "10.239.45.10/", which is why ./deploy-local-spark-sgx.sh fails with: Unable to find image '10.239.45.10/arda/intelanalytics/analytics-zoo-ppml-trusted-big-data-ml-python-graphene:0.11-SNAPSHOT' locally.
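So pulling the image from the public "intelanalytics" registry and pointing deploy-local-spark-sgx.sh at it should work, for example (the image name and tag are assumed from the error message; adjust the tag to your analytics-zoo version):

```bash
# Pull the public image instead of the internal-registry one (tag assumed from the error above).
docker pull intelanalytics/analytics-zoo-ppml-trusted-big-data-ml-python-graphene:0.11-SNAPSHOT
```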

The other errors are still being worked on.

ManfeiBai commented 3 years ago

New fix of "keytool: command not found" has been added in the PR intel-analytics/analytics-zoo#4627, please follow this doc to do the Prerequisite and create "keys" and "password": https://github.com/ManfeiBai/analytics-zoo/blob/patch-12/docs/readthedocs/source/doc/PPML/Overview/ppml.md#21-prerequisite

jenniew commented 3 years ago

When running ./deploy-distributed-standalone-spark.sh, it uses the root user. But on an Azure VM the root user cannot be used. Can we provide a deploy script that works with a non-root sudo user?

ManfeiBai commented 3 years ago

> deploy-distributed-standalone-spark.sh

Could we run the script on the Azure VM with "sudo ./deploy-distributed-standalone-spark.sh" rather than plain "./deploy-distributed-standalone-spark.sh"? See the sketch below.

jenniew commented 3 years ago

distributed-check-status.sh also needs to support a non-root user.

jenniew commented 3 years ago

When running the workload on a cluster, we need to replace --master 'local[4]' with lines like these:

--master 'spark://your_master_url' \
--conf spark.authenticate=true \
--conf spark.authenticate.secret=your_secret_key \

What should your_secret_key be set to?

ManfeiBai commented 3 years ago

Thanks, I'm working on it now.
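In the meantime, my understanding is that spark.authenticate.secret only needs to be a random string shared by the driver and all executors, so something like the following should work (illustrative only; the application name is a placeholder, please double-check against the updated doc):

```bash
# Illustrative only: generate a random shared secret and pass the same value
# to both the driver and the executors via spark-submit.
SECRET=$(openssl rand -hex 32)

${SPARK_HOME}/bin/spark-submit \
  --master 'spark://your_master_url' \
  --conf spark.authenticate=true \
  --conf spark.authenticate.secret=${SECRET} \
  your_application.py
```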