microsoft / SynapseML

Simple and Distributed Machine Learning
http://aka.ms/spark
MIT License

[BUG] Installation Troubles with Spark Submit #1827

Open DrVajiha opened 1 year ago

DrVajiha commented 1 year ago

SynapseML version

com.microsoft.azure:synapseml_2.12:0.10.2

System information

Describe the problem

I tried to use the SynapseML package from PySpark code in order to use the LightGBM module. The package loaded and worked properly in December 2022. It no longer works: on Ubuntu, running the script through spark-submit fails with the error "No module named 'synapse.ml'", as shown below.

When I run the same code on Windows from the Python IDLE shell, it works. When I run it on Windows through spark-submit, it fails with the same module-not-found error.

Why does the same package work in Python IDLE on Windows but not through spark-submit, on either Windows or Ubuntu?

Kindly help me get this package working through spark-submit on Windows and Ubuntu.
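One way to narrow down why the import works in IDLE but not under spark-submit is to print which Python interpreter runs the script and whether it can locate the `synapse` package. The sketch below uses only the standard library. The helper name `describe_module` is hypothetical, and it assumes (this is not confirmed anywhere in the thread) that the two launch methods may be using different interpreters or module search paths.

```python
import importlib.util
import sys


def describe_module(name):
    """Return where `name` would be imported from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # find_spec raises if a parent package (e.g. "synapse") is itself missing.
        return None
    if spec is None:
        return None
    # Namespace packages have no single origin file.
    return spec.origin or "namespace package"


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("synapse.ml :", describe_module("synapse.ml"))
    print("pyspark    :", describe_module("pyspark"))
```

Running this once from IDLE and once through spark-submit and comparing the output would show whether both launches see the same environment.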

ERROR MESSAGE:

12:03:52 :Package Loaded Finished
Traceback (most recent call last):
  File "/home/mw-user/lgbmdemo.py", line 3, in <module>
    import synapse.ml
ModuleNotFoundError: No module named 'synapse.ml'

Code to reproduce issue

import pyspark
spark = SparkSession.builder.appName("sample").config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:0.10.2")
import synapse.ml
from synapse.ml.lightgbm import *
from synapse.ml.train import ComputeModelStatistics

Other info / logs

12:03:52 :Package Loaded Finished
Traceback (most recent call last):
  File "/home/mw-user/lgbmdemo.py", line 3, in <module>
    import synapse.ml
ModuleNotFoundError: No module named 'synapse.ml'

What component(s) does this bug affect?

What language(s) does this bug affect?

What integration(s) does this bug affect?

github-actions[bot] commented 1 year ago

Hey @DrVajiha :wave:! Thank you so much for reporting the issue/feature request :rotating_light:. Someone from SynapseML Team will be looking to triage this issue soon. We appreciate your patience.

serena-ruan commented 1 year ago

Hi @DrVajiha You need to run

spark = SparkSession.builder.appName("sample").config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:0.10.2").getOrCreate()

The final `getOrCreate()` call is what actually starts the Spark session. Without it, the package is never downloaded.
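A minimal sketch of the ordering described above, assuming the script is launched with plain `python` so that PySpark starts the JVM itself: the session is created first, and `synapse.ml` is imported only afterwards. The function name `start_session_with_synapseml` is hypothetical; the package coordinate is the one quoted in this thread.

```python
SYNAPSEML_COORDINATE = "com.microsoft.azure:synapseml_2.12:0.10.2"


def start_session_with_synapseml(app_name="sample"):
    """Create the Spark session first, then import synapse.ml."""
    # Imported inside the function so this file still loads where PySpark is absent.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName(app_name)
        .config("spark.jars.packages", SYNAPSEML_COORDINATE)
        .getOrCreate()  # the builder is lazy; nothing starts before this call
    )

    # Import only after the session exists, once the package has been fetched.
    import synapse.ml  # noqa: F401

    return spark
```

Calling `start_session_with_synapseml()` at the top of a script keeps the import from running before the session has been started.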

DrVajiha commented 1 year ago

Hi @serena-ruan, my actual code does call getOrCreate; I left it out when typing up this report. Kindly help me use the SynapseML package with spark-submit.

My actual code:

import pyspark
spark = SparkSession.builder.appName("sample").config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:0.10.2").config("spark.executor.memory","4g").config("spark.executor.cores","2").config("num-executors","4").config("spark.driver.memory","4g").getOrCreate()
import synapse.ml
from synapse.ml.lightgbm import *
from synapse.ml.train import ComputeModelStatistics
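A side note on the configuration keys in the code above, probably unrelated to the import error: `num-executors` is a spark-submit command-line flag, while the matching configuration property is `spark.executor.instances`. The sketch below flags keys that do not follow the `spark.` naming scheme. The helper name `non_spark_keys` is hypothetical.

```python
def non_spark_keys(settings):
    """Return the configuration keys that do not start with 'spark.'."""
    return [key for key in settings if not key.startswith("spark.")]


# Settings taken from the code posted in this thread.
posted_settings = {
    "spark.jars.packages": "com.microsoft.azure:synapseml_2.12:0.10.2",
    "spark.executor.memory": "4g",
    "spark.executor.cores": "2",
    "num-executors": "4",
    "spark.driver.memory": "4g",
}

print(non_spark_keys(posted_settings))
```

Replacing the flagged key with `spark.executor.instances` would express the same intent as a configuration property.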

serena-ruan commented 1 year ago

Could you paste your logs here (especially the logs from starting the Spark session) so I can help debug this? I was able to install v0.10.2 successfully on my Ubuntu machine.

serena-ruan commented 1 year ago

Also, please make sure you use Spark 3.2 when installing v0.10.2.

DrVajiha commented 1 year ago

LOG:

23/02/06 12:03:47 WARN Utils: Your hostname, mwuser-Inspiron-15-3511 resolves to a loopback address: 127.0.1.1; using ...** instead (on interface wlp1s0)
23/02/06 12:03:47 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Loading 12:03:49 :Package Loaded
23/02/06 12:03:49 INFO SparkContext: Running Spark version 3.3.1
23/02/06 12:03:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/02/06 12:03:50 INFO ResourceUtils: ==============================================================
23/02/06 12:03:50 INFO ResourceUtils: No custom resources configured for spark.driver.
23/02/06 12:03:50 INFO ResourceUtils: ==============================================================
23/02/06 12:03:50 INFO SparkContext: Submitted application: MyApp
23/02/06 12:03:50 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 4, script: , vendor: , memory -> name: memory, amount: 3072, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
23/02/06 12:03:50 INFO ResourceProfile: Limiting resource is cpus at 4 tasks per executor
23/02/06 12:03:50 INFO ResourceProfileManager: Added ResourceProfile id: 0
23/02/06 12:03:50 INFO SecurityManager: Changing view acls to: mw-user
23/02/06 12:03:50 INFO SecurityManager: Changing modify acls to: mw-user
23/02/06 12:03:50 INFO SecurityManager: Changing view acls groups to:
23/02/06 12:03:50 INFO SecurityManager: Changing modify acls groups to:
23/02/06 12:03:50 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(mw-user); groups with view permissions: Set(); users with modify permissions: Set(mw-user); groups with modify permissions: Set()
23/02/06 12:03:50 INFO Utils: Successfully started service 'sparkDriver' on port 34941.
23/02/06 12:03:51 INFO SparkEnv: Registering MapOutputTracker
23/02/06 12:03:51 INFO SparkEnv: Registering BlockManagerMaster
23/02/06 12:03:51 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
23/02/06 12:03:51 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
23/02/06 12:03:51 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
23/02/06 12:03:51 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-e4667682-c82b-4b2a-937d-16c984cf2e7e
23/02/06 12:03:51 INFO MemoryStore: MemoryStore started with capacity 2004.6 MiB
23/02/06 12:03:51 INFO SparkEnv: Registering OutputCommitCoordinator
23/02/06 12:03:51 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
23/02/06 12:03:51 INFO Utils: Successfully started service 'SparkUI' on port 4041.
23/02/06 12:03:51 INFO Executor: Starting executor ID driver on host .*.*.
23/02/06 12:03:51 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): ''
23/02/06 12:03:51 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 45567.
23/02/06 12:03:51 INFO NettyBlockTransferService: Server created on ...**:45567
23/02/06 12:03:51 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
23/02/06 12:03:51 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.10.114, 45567, None)
23/02/06 12:03:51 INFO BlockManagerMasterEndpoint: Registering block manager .*.*.:45567 with 2004.6 MiB RAM, BlockManagerId(driver, ...**,45567, None)
23/02/06 12:03:51 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, .*.*., 45567, None)
23/02/06 12:03:51 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, ...**, 45567, None)
12:03:52 :Package Loaded Finished
Traceback (most recent call last):
  File "/home/mw-user/ldbmdemo.py", line 26, in <module>
    import synapse.ml
ModuleNotFoundError: No module named 'synapse.ml'
23/02/06 12:03:53 INFO SparkContext: Invoking stop() from shutdown hook
23/02/06 12:03:53 INFO SparkUI: Stopped Spark web UI at http:/.*.*.:4041
23/02/06 12:03:53 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/02/06 12:03:53 INFO MemoryStore: MemoryStore cleared
23/02/06 12:03:53 INFO BlockManager: BlockManager stopped
23/02/06 12:03:53 INFO BlockManagerMaster: BlockManagerMaster stopped
23/02/06 12:03:53 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/02/06 12:03:53 INFO SparkContext: Successfully stopped SparkContext
23/02/06 12:03:53 INFO ShutdownHookManager: Shutdown hook called
23/02/06 12:03:53 INFO ShutdownHookManager: Deleting directory /tmp/spark-7c99ba75-d1c1-44f5-b033-f6818d4d0a35
23/02/06 12:03:53 INFO ShutdownHookManager: Deleting directory /tmp/spark-36be6708-b798-4fb7-97a1-f8e1a571f64e
23/02/06 12:03:53 INFO ShutdownHookManager: Deleting directory /tmp/spark-7c99ba75-d1c1-44f5-b033-f6818d4d0a35/pyspark-5b32c82d-ab0b-47ad-a47d-e3fd5d4cdf7f

serena-ruan commented 1 year ago

Hi @DrVajiha, I see that you're using Spark 3.3, but v0.10.2 is built for Spark 3.2. Could you try installing Spark 3.2 instead?
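The version mismatch can be caught early with a small check. This sketch compares a Spark version string against the 3.2 series that, per this thread, v0.10.2 targets. The helper names are hypothetical, and the requirement is taken from the comments here rather than from a compatibility table.

```python
REQUIRED_SPARK_SERIES = (3, 2)  # per this thread: SynapseML v0.10.2 targets Spark 3.2


def spark_series(version):
    """Turn a version string such as '3.3.1' into a (major, minor) tuple."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def matches_required_series(version, required=REQUIRED_SPARK_SERIES):
    """True when the major.minor part of `version` equals the required series."""
    return spark_series(version) == required
```

In a running session, the string to check is available as `spark.version`; the log above reports 3.3.1, which this check would reject.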

mhamilton723 commented 1 year ago

@DrVajiha I also don't see any installation logs in the output you sent. If you are using spark-submit, please consider following the instructions here: https://github.com/microsoft/SynapseML#spark-submit
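As a rough illustration, the package can be passed on the spark-submit command line through its `--packages` option instead of being set inside the script. The sketch below only builds and prints such a command; it does not run it. The script name `lgbmdemo.py` comes from the traceback in this thread, and the helper name `spark_submit_command` is hypothetical.

```python
import shlex

SYNAPSEML_COORDINATE = "com.microsoft.azure:synapseml_2.12:0.10.2"


def spark_submit_command(script, coordinate=SYNAPSEML_COORDINATE):
    """Build the argument list for a spark-submit call that requests the package."""
    return ["spark-submit", "--packages", coordinate, script]


command = spark_submit_command("lgbmdemo.py")
print(shlex.join(command))
```

The same list could be handed to `subprocess.run(command, check=True)` on a machine where spark-submit is on the PATH.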

DrVajiha commented 1 year ago

@serena-ruan, @mhamilton723 I have switched to Spark 3.2.2 and tried installing the SynapseML LightGBM package using spark-submit. I'm still facing the same ModuleNotFoundError:

import synapse.ml
ModuleNotFoundError: No module named 'synapse.ml'

ppruthi commented 1 year ago

@DrVajiha, I do not see the jar download happening in the logs you shared. It seems the jars are not being downloaded in your environment, which is why you're seeing the module-not-found error. The logs should print something like:

com.microsoft.azure#synapseml_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-8d72dfbf-69d4-4ca9-b101-d0f871f39a16;1.0
        confs: [default]
        found com.microsoft.azure#synapseml_2.12;0.10.2 in central <<<<----- PACKAGE FOUND
        found com.microsoft.azure#synapseml-core_2.12;0.10.2 in central <<<<----- PACKAGE FOUND
        found org.scalactic#scalactic_2.12;3.2.14 in central
        found org.scala-lang#scala-reflect;2.12.15 in central
        found io.spray#spray-json_2.12;1.3.5 in central
        found com.jcraft#jsch;0.1.54 in central
        found org.apache.httpcomponents.client5#httpclient5;5.1.3 in central
        found org.apache.httpcomponents.core5#httpcore5;5.1.3 in central
        found org.apache.httpcomponents.core5#httpcore5-h2;5.1.3 in central
        found org.slf4j#slf4j-api;1.7.25 in central
        found commons-codec#commons-codec;1.15 in central
        found org.apache.httpcomponents#httpmime;4.5.13 in central
        found org.apache.httpcomponents#httpclient;4.5.13 in central
        found org.apache.httpcomponents#httpcore;4.4.13 in central
        found commons-logging#commons-logging;1.2 in central
        found com.linkedin.isolation-forest#isolation-forest_3.2.0_2.12;2.0.8 in central
        found com.chuusai#shapeless_2.12;2.3.2 in central
        found org.typelevel#macro-compat_2.12;1.1.1 in central
        found org.apache.spark#spark-avro_2.12;3.2.0 in central
        found org.tukaani#xz;1.8 in central
        found org.spark-project.spark#unused;1.0.0 in central
        found org.testng#testng;6.8.8 in central
        found org.beanshell#bsh;2.0b4 in central
        found com.beust#jcommander;1.27 in central
        found com.microsoft.azure#synapseml-deep-learning_2.12;0.10.2 in central
        found com.microsoft.azure#synapseml-opencv_2.12;0.10.2 in central
        found org.openpnp#opencv;3.2.0-1 in central
        found com.microsoft.azure#onnx-protobuf_2.12;0.9.1 in central
        found com.microsoft.cntk#cntk;2.4 in central
        found com.microsoft.onnxruntime#onnxruntime_gpu;1.8.1 in central
        found com.microsoft.azure#synapseml-cognitive_2.12;0.10.2 in central
        found com.microsoft.cognitiveservices.speech#client-jar-sdk;1.14.0 in central
        found com.microsoft.azure#synapseml-vw_2.12;0.10.2 in central
        found com.github.vowpalwabbit#vw-jni;8.9.1 in central
downloading https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.jar ...
        [SUCCESSFUL ] com.microsoft.azure#synapseml_2.12;0.10.2!synapseml_2.12.jar (18ms) <<<<----- PACKAGE INSTALLED
downloading https://repo1.maven.org/maven2/com/microsoft/azure/synapseml-core_2.12/0.10.2/synapseml-core_2.12-0.10.2.jar ...
        [SUCCESSFUL ] com.microsoft.azure#synapseml-core_2.12;0.10.2!synapseml-core_2.12.jar (490ms)
downloading https://repo1.maven.org/maven2/com/microsoft/azure/synapseml-deep-learning_2.12/0.10.2/synapseml-deep-learning_2.12-0.10.2.jar ...
        [SUCCESSFUL ] com.microsoft.azure#synapseml-deep-learning_2.12;0.10.2!synapseml-deep-learning_2.12.jar (86ms) <<<<----- PACKAGE INSTALLED
downloading https://repo1.maven.org/maven2/com/microsoft/azure/synapseml-cognitive_2.12/0.10.2/synapseml-cognitive_2.12-0.10.2.jar ...
        [SUCCESSFUL ] com.microsoft.azure#synapseml-cognitive_2.12;0.10.2!synapseml-cognitive_2.12.jar (487ms)
downloading https://repo1.maven.org/maven2/com/microsoft/azure/synapseml-vw_2.12/0.10.2/synapseml-vw_2.12-0.10.2.jar ...
        [SUCCESSFUL ] com.microsoft.azure#synapseml-vw_2.12;0.10.2!synapseml-vw_2.12.jar (84ms)
downloading https://repo1.maven.org/maven2/com/microsoft/azure/synapseml-lightgbm_2.12/0.10.2/synapseml-lightgbm_2.12-0.10.2.jar ...
        [SUCCESSFUL ] com.microsoft.azure#synapseml-lightgbm_2.12;0.10.2!synapseml-lightgbm_2.12.jar (111ms)
downloading https://repo1.maven.org/maven2/com/microsoft/azure/synapseml-opencv_2.12/0.10.2/synapseml-opencv_2.12-0.10.2.jar ...
        [SUCCESSFUL ] com.microsoft.azure#synapseml-opencv_2.12;0.10.2!synapseml-opencv_2.12.jar (28ms)
downloading https://repo1.maven.org/maven2/com/microsoft/azure/onnx-protobuf_2.12/0.9.1/onnx-protobuf_2.12-0.9.1-assembly.jar ...
        [SUCCESSFUL ] com.microsoft.azure#onnx-protobuf_2.12;0.9.1!onnx-protobuf_2.12.jar (392ms)
downloading https://repo1.maven.org/maven2/com/microsoft/cntk/cntk/2.4/cntk-2.4.jar ...
        [SUCCESSFUL ] com.microsoft.cntk#cntk;2.4!cntk.jar (53122ms)
downloading https://repo1.maven.org/maven2/com/microsoft/onnxruntime/onnxruntime_gpu/1.8.1/onnxruntime_gpu-1.8.1.jar ...
        [SUCCESSFUL ] com.microsoft.onnxruntime#onnxruntime_gpu;1.8.1!onnxruntime_gpu.jar (32420ms)
downloading https://repo1.maven.org/maven2/org/openpnp/opencv/3.2.0-1/opencv-3.2.0-1.jar ...
        [SUCCESSFUL ] org.openpnp#opencv;3.2.0-1!opencv.jar(bundle) (17325ms)
downloading https://repo1.maven.org/maven2/com/microsoft/cognitiveservices/speech/client-jar-sdk/1.14.0/client-jar-sdk-1.14.0.jar ...
        [SUCCESSFUL ] com.microsoft.cognitiveservices.speech#client-jar-sdk;1.14.0!client-jar-sdk.jar (2804ms)
downloading https://repo1.maven.org/maven2/com/github/vowpalwabbit/vw-jni/8.9.1/vw-jni-8.9.1.jar ...
        [SUCCESSFUL ] com.github.vowpalwabbit#vw-jni;8.9.1!vw-jni.jar (990ms)
downloading https://repo1.maven.org/maven2/com/microsoft/ml/lightgbm/lightgbmlib/3.2.110/lightgbmlib-3.2.110.jar ...
        [SUCCESSFUL ] com.microsoft.ml.lightgbm#lightgbmlib;3.2.110!lightgbmlib.jar (682ms)
:: resolution report :: resolve 2718ms :: artifacts dl 109050ms
        :: modules in use:
        com.beust#jcommander;1.27 from central in [default]
        com.chuusai#shapeless_2.12;2.3.2 from central in [default]
        com.github.vowpalwabbit#vw-jni;8.9.1 from central in [default]
        com.jcraft#jsch;0.1.54 from central in [default]
        com.linkedin.isolation-forest#isolation-forest_3.2.0_2.12;2.0.8 from central in [default]
        com.microsoft.azure#onnx-protobuf_2.12;0.9.1 from central in [default]
        com.microsoft.azure#synapseml-cognitive_2.12;0.10.2 from central in [default]
        com.microsoft.azure#synapseml-core_2.12;0.10.2 from central in [default]
        com.microsoft.azure#synapseml-deep-learning_2.12;0.10.2 from central in [default]
        com.microsoft.azure#synapseml-lightgbm_2.12;0.10.2 from central in [default]
        com.microsoft.azure#synapseml-opencv_2.12;0.10.2 from central in [default]
        com.microsoft.azure#synapseml-vw_2.12;0.10.2 from central in [default]
        com.microsoft.azure#synapseml_2.12;0.10.2 from central in [default]
        com.microsoft.cntk#cntk;2.4 from central in [default]
        com.microsoft.cognitiveservices.speech#client-jar-sdk;1.14.0 from central in [default]
        com.microsoft.ml.lightgbm#lightgbmlib;3.2.110 from central in [default]
        com.microsoft.onnxruntime#onnxruntime_gpu;1.8.1 from central in [default]
        commons-codec#commons-codec;1.15 from central in [default]
        commons-logging#commons-logging;1.2 from central in [default]
        io.spray#spray-json_2.12;1.3.5 from central in [default]
        org.apache.httpcomponents#httpclient;4.5.13 from central in [default]
        org.apache.httpcomponents#httpcore;4.4.13 from central in [default]
        org.apache.httpcomponents#httpmime;4.5.13 from central in [default]
        org.apache.httpcomponents.client5#httpclient5;5.1.3 from central in [default]
        org.apache.httpcomponents.core5#httpcore5;5.1.3 from central in [default]
        org.apache.httpcomponents.core5#httpcore5-h2;5.1.3 from central in [default]
        org.apache.spark#spark-avro_2.12;3.2.0 from central in [default]
        org.beanshell#bsh;2.0b4 from central in [default]
        org.openpnp#opencv;3.2.0-1 from central in [default]
        org.scala-lang#scala-reflect;2.12.15 from central in [default]
        org.scalactic#scalactic_2.12;3.2.14 from central in [default]
        org.slf4j#slf4j-api;1.7.25 from central in [default]
        org.spark-project.spark#unused;1.0.0 from central in [default]
        org.testng#testng;6.8.8 from central in [default]
        org.tukaani#xz;1.8 from central in [default]
        org.typelevel#macro-compat_2.12;1.1.1 from central in [default]
        :: evicted modules:
        commons-codec#commons-codec;1.11 by [commons-codec#commons-codec;1.15] in [default]
        ---------------------------------------------------------------------
        |                  |            modules            ||   artifacts   |
        |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
        ---------------------------------------------------------------------
        |      default     |   37  |   14  |   14  |   1   ||   36  |   14  |
        ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-8d72dfbf-69d4-4ca9-b101-d0f871f39a16
        confs: [default]
        14 artifacts copied, 22 already retrieved (404832kB/473ms)

Could you paste the entire logs from running the PySpark code that @mhamilton723 mentioned, so we can see whether the jar package was downloaded?
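When the logs are long, a quick scan for the resolution markers shown in the sample output above can tell whether the package was resolved at all. This is a minimal sketch; the function name `synapseml_resolution_lines` is hypothetical, and the marker strings are copied from the sample log in this thread.

```python
def synapseml_resolution_lines(log_text, artifact="com.microsoft.azure#synapseml_2.12"):
    """Return log lines showing that `artifact` was requested, found, or downloaded."""
    markers = ("added as a dependency", "found ", "[SUCCESSFUL ]")
    return [
        line.strip()
        for line in log_text.splitlines()
        if artifact in line and any(marker in line for marker in markers)
    ]
```

An empty result for a full spark-submit log would suggest that dependency resolution never ran for this package.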

ZeCosta commented 9 months ago

I'm sorry to intrude, but I have somewhat the same problem. Some jars are downloaded, but never the ones I specify in the SparkSession config, no matter the jars I specify. Do you know what the problem might be?
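One possible cause, not confirmed in this thread: package resolution happens when the JVM is launched, so a `spark.jars.packages` value set through `SparkSession.builder` may be ignored if the JVM or a session already exists (for example under spark-submit, or in a notebook that has already created a session). When a script is launched with plain `python`, a workaround that is sometimes used is to set the `PYSPARK_SUBMIT_ARGS` environment variable before PySpark is imported. The helper name `submit_args_for_packages` below is hypothetical.

```python
import os


def submit_args_for_packages(coordinates):
    """Build a PYSPARK_SUBMIT_ARGS value that requests the given Maven coordinates."""
    # The value has to end with "pyspark-shell" for PySpark to accept it.
    return "--packages {} pyspark-shell".format(",".join(coordinates))


# Must be set before the first `import pyspark` in the process.
os.environ["PYSPARK_SUBMIT_ARGS"] = submit_args_for_packages(
    ["com.microsoft.azure:synapseml_2.12:0.10.2"]
)
print(os.environ["PYSPARK_SUBMIT_ARGS"])
```

Under spark-submit itself this variable is not the relevant mechanism; there the packages would go on the command line via `--packages`.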