microsoft / SynapseML

Simple and Distributed Machine Learning
http://aka.ms/spark
MIT License

Error installing synapseml #1824

Open shazriz opened 1 year ago

shazriz commented 1 year ago

SynapseML version

Installation issue

System information

Describe the problem

:: loading settings :: url = jar:file:/app/ide/anaconda/envs/telematics_env_202211/lib/python3.7/site-packages/pyspark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
Ivy Default Cache set to: /home/p105327_t1/.ivy2/cache
The jars for the packages stored in: /home/p105327_t1/.ivy2/jars
com.microsoft.azure#synapseml_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-0a41b358-7270-4203-b8f7-6917e3a1de6c;1.0
    confs: [default]
:: resolution report :: resolve 509261ms :: artifacts dl 1ms
    :: modules in use:

|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
---------------------------------------------------------------------

:: problems summary ::
:::: WARNINGS
    module not found: com.microsoft.azure#synapseml_2.12;0.10.2
    ==== local-m2-cache: tried
      file:/home/p105327_t1/.m2/repository/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.pom
      -- artifact com.microsoft.azure#synapseml_2.12;0.10.2!synapseml_2.12.jar:
      file:/home/p105327_t1/.m2/repository/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.jar
    ==== local-ivy-cache: tried
      /home/p105327_t1/.ivy2/local/com.microsoft.azure/synapseml_2.12/0.10.2/ivys/ivy.xml
      -- artifact com.microsoft.azure#synapseml_2.12;0.10.2!synapseml_2.12.jar:
      /home/p105327_t1/.ivy2/local/com.microsoft.azure/synapseml_2.12/0.10.2/jars/synapseml_2.12.jar
    ==== central: tried
      https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.pom
      -- artifact com.microsoft.azure#synapseml_2.12;0.10.2!synapseml_2.12.jar:
      https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.jar
    ==== spark-packages: tried
      https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.pom
      -- artifact com.microsoft.azure#synapseml_2.12;0.10.2!synapseml_2.12.jar:
      https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.jar
    ::::::::::::::::::::::::::::::::::::::::::::::
    :: UNRESOLVED DEPENDENCIES ::
    ::::::::::::::::::::::::::::::::::::::::::::::
    :: com.microsoft.azure#synapseml_2.12;0.10.2: not found
    ::::::::::::::::::::::::::::::::::::::::::::::
:::: ERRORS
    Server access error at url https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.pom (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.jar (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.pom (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.jar (java.net.ConnectException: Connection timed out (Connection timed out))
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: com.microsoft.azure#synapseml_2.12;0.10.2: not found]
    at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1429)
    at org.apache.spark.deploy.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:54)
    at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:308)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Traceback (most recent call last):
  File "/app/ide/anaconda/envs/telematics_env_202211/lib/python3.7/code.py", line 90, in runcode
    exec(code, self.locals)
  File "", line 2, in
  File "/app/ide/anaconda/envs/telematics_env_202211/lib/python3.7/site-packages/pyspark/sql/session.py", line 228, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/lib/python3.7/site-packages/pyspark/context.py", line 384, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/lib/python3.7/site-packages/pyspark/context.py", line 144, in init
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/lib/python3.7/site-packages/pyspark/context.py", line 331, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/lib/python3.7/site-packages/pyspark/java_gateway.py", line 108, in launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number

Code to reproduce issue

import pyspark
spark = pyspark.sql.SparkSession.builder.appName("MyApp").config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:0.10.2").getOrCreate()

Other info / logs

No response

What component(s) does this bug affect?

What language(s) does this bug affect?

What integration(s) does this bug affect?

github-actions[bot] commented 1 year ago

Hey @shazriz :wave:! Thank you so much for reporting the issue/feature request :rotating_light:. Someone from SynapseML Team will be looking to triage this issue soon. We appreciate your patience.

serena-ruan commented 1 year ago

Hi @shazriz, if you're using Spark 3.1.2, please use version 0.9.5-13-d1b51517-SNAPSHOT. The installation guidance is here: https://microsoft.github.io/SynapseML/docs/getting_started/installation/

shazriz commented 1 year ago

Hey @serena-ruan I have also tried the version you mentioned above and I get a similar error.

:: loading settings :: url = jar:file:/app/ide/anaconda/envs/telematics_env_202211/lib/python3.7/site-packages/pyspark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
Ivy Default Cache set to: /home/p105327_t1/.ivy2/cache
The jars for the packages stored in: /home/p105327_t1/.ivy2/jars
com.microsoft.azure#synapseml_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-31d7ecd5-add7-408e-b30c-7c20846a1a6e;1.0
    confs: [default]
:: resolution report :: resolve 763889ms :: artifacts dl 0ms
    :: modules in use:

|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
---------------------------------------------------------------------

:: problems summary ::
:::: WARNINGS
    module not found: com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT
    ==== local-m2-cache: tried
      file:/home/p105327_t1/.m2/repository/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.pom
      -- artifact com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT!synapseml_2.12.jar:
      file:/home/p105327_t1/.m2/repository/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.jar
    ==== local-ivy-cache: tried
      /home/p105327_t1/.ivy2/local/com.microsoft.azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/ivys/ivy.xml
      -- artifact com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT!synapseml_2.12.jar:
      /home/p105327_t1/.ivy2/local/com.microsoft.azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/jars/synapseml_2.12.jar
    ==== central: tried
      https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.pom
      -- artifact com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT!synapseml_2.12.jar:
      https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.jar
    ==== spark-packages: tried
      https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.pom
      -- artifact com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT!synapseml_2.12.jar:
      https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.jar
    ::::::::::::::::::::::::::::::::::::::::::::::
    :: UNRESOLVED DEPENDENCIES ::
    ::::::::::::::::::::::::::::::::::::::::::::::
    :: com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT: not found
    ::::::::::::::::::::::::::::::::::::::::::::::
:::: ERRORS
    Server access error at url https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/maven-metadata.xml (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.pom (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.jar (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/maven-metadata.xml (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.pom (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.jar (java.net.ConnectException: Connection timed out (Connection timed out))
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT: not found]
    at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1429)
    at org.apache.spark.deploy.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:54)
    at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:308)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Traceback (most recent call last):
  File "/lib/python3.7/code.py", line 90, in runcode
    exec(code, self.locals)
  File "", line 2, in
  File "/lib/python3.7/site-packages/pyspark/sql/session.py", line 228, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/lib/python3.7/site-packages/pyspark/context.py", line 384, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/lib/python3.7/site-packages/pyspark/context.py", line 144, in init
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/app/ide/anaconda/envs/telematics_env_202211/lib/python3.7/site-packages/pyspark/context.py", line 331, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/lib/python3.7/site-packages/pyspark/java_gateway.py", line 108, in launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number

serena-ruan commented 1 year ago

Hi @shazriz, did you add our custom Maven resolver? "spark.jars.repositories": "https://mmlspark.azureedge.net/maven" Also, from your logs it looks like an issue with your network: (java.net.ConnectException: Connection timed out (Connection timed out))
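To make the two suggestions above concrete, here is a minimal sketch. It assumes the repositories use the standard Maven directory layout (which matches the URLs in the logs above) and that a plain HTTP HEAD request from the same machine is a fair proxy for what Ivy will experience. The names `SYNAPSEML_CONF`, `artifact_urls`, `is_reachable`, and `build_session` are hypothetical helpers, not part of SynapseML or PySpark.

```python
import urllib.error
import urllib.request

# Settings quoted in this thread; the version is the one from the original report.
SYNAPSEML_CONF = {
    "spark.jars.packages": "com.microsoft.azure:synapseml_2.12:0.10.2",
    "spark.jars.repositories": "https://mmlspark.azureedge.net/maven",
}


def artifact_urls(coordinate, repo_root):
    """Return the .pom and .jar URLs for 'group:artifact:version' under repo_root.

    Assumes the standard Maven layout: group dots become path separators.
    """
    group, artifact, version = coordinate.split(":")
    base = "/".join([repo_root.rstrip("/"), group.replace(".", "/"), artifact, version])
    stem = f"{artifact}-{version}"
    return [f"{base}/{stem}.pom", f"{base}/{stem}.jar"]


def is_reachable(url, timeout=10.0):
    """Return True if the server answers at all, False on timeout or connection failure."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied (for example with 404), so the network path works.
        return True
    except (urllib.error.URLError, OSError):
        return False


def build_session(app_name="MyApp", conf=SYNAPSEML_CONF):
    """Create a SparkSession with the package and the extra resolver configured."""
    from pyspark.sql import SparkSession  # imported lazily; requires pyspark

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Running `[is_reachable(u) for u in artifact_urls(SYNAPSEML_CONF["spark.jars.packages"], "https://repo1.maven.org/maven2")]` before calling `build_session()` shows quickly whether the machine can reach the repository at all. If every probe returns `False`, the timeouts in the logs are more likely a firewall or proxy problem than a SynapseML one.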

shazriz commented 1 year ago

Hey, @serena-ruan let me check and get back to you. Thanks for your response.

rbeldagarcia commented 1 year ago

Hello, I have a similar issue. I tried to install SynapseML by following the code on the official website, in a conda environment with Python 3.10 and PySpark 3.2:

`import pyspark
# Use 0.11.2-spark3.3 version for Spark3.3 and 0.11.2 version for Spark3.2
spark = pyspark.sql.SparkSession.builder.appName("MyApp") \
        .config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:0.11.2") \
        .config("spark.jars.repositories", "https://mmlspark.azureedge.net/maven") \
        .getOrCreate()

import synapse.ml`

It reports the error below. I have also tried different Maven repositories, using local jar files, and excluding the packages. I know this is not an issue caused by SynapseML, but I would appreciate some help.

Thanks,

:::: WARNINGS
    [NOT FOUND  ] org.apache.httpcomponents#httpcore;4.4.13!httpcore.jar (3ms)
==== local-m2-cache: tried
  file:/home/rbelda/.m2/repository/org/apache/httpcomponents/httpcore/4.4.13/httpcore-4.4.13.jar
    [NOT FOUND  ] net.sourceforge.f2j#arpack_combined_all;0.1!arpack_combined_all.jar (2ms)
==== local-m2-cache: tried
  file:/home/rbelda/.m2/repository/net/sourceforge/f2j/arpack_combined_all/0.1/arpack_combined_all-0.1-javadoc.jar
    ::::::::::::::::::::::::::::::::::::::::::::::
    ::              FAILED DOWNLOADS            ::
    :: ^ see resolution messages for details  ^ ::
    ::::::::::::::::::::::::::::::::::::::::::::::
    :: org.apache.httpcomponents#httpcore;4.4.13!httpcore.jar
    :: net.sourceforge.f2j#arpack_combined_all;0.1!arpack_combined_all.jar
    ::::::::::::::::::::::::::::::::::::::::::::::
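The log above shows Ivy looking for these jars in the local-m2-cache (`~/.m2/repository`) and not finding them. A small sketch like the following can confirm which of the reported artifacts are actually absent on disk. It assumes the standard local Maven repository layout seen in the `file:` paths above and checks only the main `.jar` (the log also mentions a `-javadoc.jar` for arpack, which is not checked here). The helpers `local_m2_jar` and `missing_jars` are hypothetical names, not part of any library.

```python
from pathlib import Path


def local_m2_jar(coordinate, m2_root=None):
    """Return the expected path of the main jar for 'group:artifact:version'.

    m2_root defaults to ~/.m2/repository, the location shown in the log.
    """
    group, artifact, version = coordinate.split(":")
    root = Path(m2_root) if m2_root is not None else Path.home() / ".m2" / "repository"
    return root.joinpath(*group.split("."), artifact, version, f"{artifact}-{version}.jar")


def missing_jars(coordinates, m2_root=None):
    """Return the coordinates whose main jar is not present in the local repository."""
    return [c for c in coordinates if not local_m2_jar(c, m2_root).is_file()]


# The two artifacts listed under FAILED DOWNLOADS in the log above.
FAILED = [
    "org.apache.httpcomponents:httpcore:4.4.13",
    "net.sourceforge.f2j:arpack_combined_all:0.1",
]
```

Calling `missing_jars(FAILED)` on the affected machine lists the jars that are not in the local repository. If a version directory exists but holds no jar, one possible explanation (an assumption, not confirmed in this thread) is a partially populated local Maven repository that Ivy consults before the remote ones.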