Mike-Soukup opened this issue 1 year ago
Hey @Mike-Soukup :wave:! Thank you so much for reporting the issue/feature request :rotating_light:. Someone from SynapseML Team will be looking to triage this issue soon. We appreciate your patience.
Hi @Mike-Soukup , thanks for raising this issue.
Can you install other packages on your system in your current way?
You are on Spark 3.0, and there might be a dependency issue with SynapseML 0.10.2.
[deleted, solution is not correct]
If it still doesn't work, can you try manually renaming the jars in their directory? This discussion might be useful as well: https://github.com/microsoft/SynapseML/issues/1374
@JessicaXYWang I ran into the same issues with the 0.9.5-13-d1b51517-SNAPSHOT package.
I can try to manually rename the files. Is there any specific content that should be in them that provides functionality? Or do they just need to be present for whatever reason?
@Mike-Soukup It seems the jar is not installed. Can you help to check if you can install any other packages on your system?
@JessicaXYWang I was able to rename the .jar files on my system and got around the initial bug. Now when I try to train my model, I get the following error. Is SynapseML not compatible with Spark 3.0.0? There seem to be dependency issues with the Java installation... Here are my cluster's jars...
@Mike-Soukup Thank you for the feedback. The compatible versions are not correct on the website. Sorry for the confusion. The documentation will be fixed soon with this PR
I can't find a version that's compatible with Spark 3.0.0, @serena-ruan , do you have more information on it?
I would recommend using Spark 3.2 with the latest SynapseML 0.10.2, or Spark 3.1 with synapseml_2.12:0.9.5-13-d1b51517-SNAPSHOT
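To summarize the pairings recommended in this thread, here is a hypothetical helper (the mapping and coordinates are compiled only from the versions mentioned in this discussion and may be superseded by newer releases):

```python
# Spark version -> SynapseML Maven coordinate, as recommended in this thread.
# These pairings reflect the state of the discussion and may be outdated.
RECOMMENDED = {
    "3.1": "com.microsoft.azure:synapseml_2.12:0.9.5-13-d1b51517-SNAPSHOT",
    "3.2": "com.microsoft.azure:synapseml_2.12:0.10.2",
    "3.3": "com.microsoft.azure:synapseml_2.12:0.10.1-69-84f5b579-SNAPSHOT",
}

def synapseml_coordinate(spark_version: str) -> str:
    """Return the Maven coordinate suggested in this thread for a given
    Spark version, or raise if no pairing was discussed."""
    major_minor = ".".join(spark_version.split(".")[:2])
    try:
        return RECOMMENDED[major_minor]
    except KeyError:
        raise ValueError(f"No SynapseML build discussed for Spark {spark_version}")

print(synapseml_coordinate("3.2.1"))
# com.microsoft.azure:synapseml_2.12:0.10.2
```

Always double-check the installation page of the SynapseML docs before relying on this table, since (as this thread shows) the published compatibility matrix has been wrong before.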
@JessicaXYWang Ok. That's unfortunate. But thank you for bringing clarity and guidance to this issue.
@JessicaXYWang I am running into this same issue attempting to use SynapseML 0.10.2 on Spark 3.3.1, so it seems like your suggestion to use a more recent Spark version will not/does not fix this issue.
@Mike-Soukup did you attempt to upgrade your Spark version per the suggestion, and if so, did it work?
Hi @flavajava, thanks for raising this question.
We are aware of the package-management issue on Spark 3.3. @KeerthiYandaOS, can you share more information on this?
@flavajava The 0.10.2 version is for Spark 3.2 clusters. Can you please use version 0.10.1-69-84f5b579-SNAPSHOT for Spark 3.3.1? Support for Spark 3.3(.x) is still in progress, so the website is not updated yet.
@KeerthiYandaOS I attempted to install the snapshot you indicated on a cluster running Spark 3.3.1, and it failed for the same reason that 0.10.2 failed (on Spark 3.3.1).
I have downgraded my Spark to 3.2.1 and was able to get 0.10.2 to install on it.
I will continue to track this issue to see when 0.10.2 is able to work on Spark 3.3.1.
In the meantime, I think it would be helpful if you made it unambiguously clear on the Installation page of your documentation site that Synapse will NOT work on Spark 3.3.x until it actually does work. This would have saved me a few days of chasing and would potentially save others time as well.
Thanks so much for your responsiveness to my post.
@flavajava Can you please share when and on which platform you are seeing the error for Spark 3.3?
This was attempting to install SynapseML 0.10.2 on the Databricks ML Runtime 12.1 (which includes Python 3.9.5, Spark 3.3.1)
I believe this is the same issue as #1817
@flavajava We are seeing this issue because of a Maven-resolution change introduced starting with DBR 11.0. Can you please set spark.databricks.libraries.enableMavenResolution false in your cluster's Spark configuration and see if that helps?
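For reference, in Databricks this property is entered in the cluster's Spark config (cluster settings > Advanced Options > Spark), one key/value pair per line:

```
spark.databricks.libraries.enableMavenResolution false
```

The cluster must be restarted for the setting to take effect before retrying the library installation.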
@KeerthiYandaOS Not sure if something was changed either by y'all or if Databricks updated the runtime, but I just tried installing SynapseML 0.10.2 on the Databricks ML Runtime 12.1 (which includes Python 3.9.5 and Spark 3.3.1, the same runtime I was using a week or two ago) and the installation succeeded even without using spark.databricks.libraries.enableMavenResolution false. Not sure what changed where, but I'll take it! Thank you for helping me work through this issue.
@flavajava SynapseML 0.10.2 is for Spark 3.2, and the 0.10.1-69-84f5b579-SNAPSHOT version is for Spark 3.3.1 (please make sure you are using the appropriate SynapseML version). Not sure how it worked two weeks ago; we haven't changed or deployed anything to these versions. For DBR 11.0 and above (with Spark 3.3), you need the spark.databricks.libraries.enableMavenResolution false property to resolve the dependency issue.
Well, what I'm seeing is that "For DBR 11.0 and above (with spark3.3), you need spark.databricks.libraries.enableMavenResolution false property to resolve the dependency issue." is no longer actually the case, because I haven't set spark.databricks.libraries.enableMavenResolution to false and the installation of 0.10.2 is now working (on DBR 12.1 ML, which has Spark 3.3.1).
Thank you @flavajava and @Mike-Soukup. Like @flavajava mentioned, DBR 11.0 no longer requires spark.databricks.libraries.enableMavenResolution false; I could install the jar without that property as well. We also have a new SynapseML version for Spark 3.3 that removes the need for the Maven-resolution property: com.microsoft.azure:synapseml_2.12:0.11.0-32-6085190e-SNAPSHOT. Either way, we should be good with Spark 3.3 on DBR 11.
Closing this issue as the solution is posted. Please feel free to reopen it if you are still facing errors. Thank you.
@KeerthiYandaOS - Do you know if there is a SNAPSHOT that works with DBR 12.2 ML? I haven't been able to load any of those JARs successfully.
I have the same problem @HoagieFestDS, and I haven't been able to solve it. @KeerthiYandaOS, any ideas?
Thank you
@antonquintela: we have com.microsoft.azure:synapseml_2.12:0.11.2 loaded on DBR 10.4, which worked. It's a deprecated runtime, unfortunately, but it still works.
```
Library installation attempted on the driver node of cluster 0214-173402-dkcgo9cm and failed. Please refer to the following error message to fix the library or contact Databricks support. Error Code: DRIVER_LIBRARY_INSTALLATION_FAILURE. Error Message: Library resolution failed because com.linkedin.isolation-forest:isolation-forest_3.2.0_2.12 download failed.
```
SynapseML version
0.10.2
System information
Describe the problem
When trying to run the synapse.ml LightGBMClassifier, I receive the error:
Stack Trace:
I noticed this is an issue with having the correct .jar files, so I set up my configurations as such per the website documentation:
However, when I try to launch my Spark Cluster, I get the following error:
I checked my /home/notebook/.ivy2/jars directory and there is a com.microsoft.azure_onnx-protobuf_2.12-0.9.1-assembly.jar there.
I am not exactly sure where to go from here. I am not familiar with Java packages and dependencies, but my understanding is that the assembly.jar should have all the files I need... I also tried using the 0.9.5-13-d1b51517-SNAPSHOT and got a similar error, just with a different missing file name. Please advise on how I can get these .jar files into my Spark cluster so I can train my LightGBM model across my Spark executors instead of just on the Spark driver.
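For context, the installation approach being discussed is the spark.jars.packages route from the SynapseML installation docs. The exact configuration the reporter used was elided above; a minimal sketch of that style of Spark config (the coordinate is the 0.10.2 version mentioned in this thread, and the resolver URL is the public SynapseML Maven repository, which may have changed since) looks like:

```
spark.jars.packages com.microsoft.azure:synapseml_2.12:0.10.2
spark.jars.repositories https://mmlspark.azureedge.net/maven
```

With this style of installation, Spark resolves the package and its transitive dependencies via Ivy into the local cache (e.g. ~/.ivy2/jars) at startup, which is why a resolution failure surfaces as missing or misnamed jar files in that directory.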
Code to reproduce issue
Other info / logs
What component(s) does this bug affect?

- area/cognitive: Cognitive project
- area/core: Core project
- area/deep-learning: DeepLearning project
- area/lightgbm: Lightgbm project
- area/opencv: Opencv project
- area/vw: VW project
- area/website: Website
- area/build: Project build system
- area/notebooks: Samples under notebooks folder
- area/docker: Docker usage
- area/models: models related issue

What language(s) does this bug affect?

- language/scala: Scala source code
- language/python: Pyspark APIs
- language/r: R APIs
- language/csharp: .NET APIs
- language/new: Proposals for new client languages

What integration(s) does this bug affect?

- integrations/synapse: Azure Synapse integrations
- integrations/azureml: Azure ML integrations
- integrations/databricks: Databricks integrations