databrickslabs / mosaic

An extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.
https://databrickslabs.github.io/mosaic/

py4j.Py4JException: Constructor com.databricks.libraries.JavaJarId([class java.net.URI, class java.lang.String, class java.lang.String] #430

Open AbishekS01 opened 1 year ago

AbishekS01 commented 1 year ago

The project support section (https://databrickslabs.github.io/mosaic/#project-support) says that Mosaic should still work on "Photon enabled" clusters.

We are currently running several Mosaic jobs on Photon-enabled clusters, but after an upgrade to runtime 13.3, we got an error:

py4j.Py4JException: Constructor com.databricks.libraries.JavaJarId([class java.net.URI, class java.lang.String, class java.lang.String]) does not exist

I will be downgrading the cluster back to 13.2 to see if that fixes it in the short term, but I would like guidance on how to use the Mosaic geospatial functions (most importantly st_geomfromgeojson and st_intersects).

landlord-matt commented 1 year ago

We got the same error running mosaic==0.3.5 on runtime version 13.3 on a Personal Compute cluster. This used to work on 12.2 LTS. Upgrading to mosaic==0.3.11 does not make any difference. Not that I think it matters, but we're running Databricks on Azure.

enable_mosaic(spark, dbutils)

MosaicLibraryHandler(config.mosaic_spark)

     71 converters = self.sc._jvm.scala.collection.JavaConverters
     73 JarURI = JavaURI.create("file:" + self._jar_path)
---> 74 lib = JavaJarId(
     75     JarURI,
     76     ManagedLibraryId.defaultOrganization(),
     77     NoVersionModule.simpleString(),
     78 )
py4j.Py4JException: Constructor com.databricks.libraries.JavaJarId([class java.net.URI, class java.lang.String, class java.lang.String]) does not exist
    at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:203)
    at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:220)
    at py4j.Gateway.invoke(Gateway.java:255)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
    at java.lang.Thread.run(Thread.java:750)

landlord-matt commented 1 year ago

I realise now that the README says Databricks Runtime 13 isn't supported until Mosaic version 0.4. That note was added on June 13, almost 3 months ago. It seems the project has stalled, as 13.3 is now the current LTS?
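For anyone else hitting this, a guard like the one below could fail fast with a readable message instead of the Py4J constructor error. It is only a rough sketch: the helper names (parse_version, mosaic_supports_runtime, check_before_enable) are made up for illustration and are not part of Mosaic, and it assumes the cluster exposes its runtime version in the DATABRICKS_RUNTIME_VERSION environment variable. It encodes only the one README rule mentioned above (runtime 13 and later needs Mosaic 0.4 or later) and checks nothing else.

```python
import os
import re


def parse_version(text):
    """Turn '13.3', '0.3.11' or '13.3.x-scala2.12' into a tuple of ints."""
    parts = []
    for piece in text.strip().split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'x-scala2'
        parts.append(int(match.group()))
    return tuple(parts)


def mosaic_supports_runtime(mosaic_version, runtime_version):
    """Apply the single rule quoted from the README in this thread.

    Assumption: runtime 13+ requires Mosaic 0.4+. Other combinations are
    not checked here and are simply reported as supported.
    """
    mosaic = parse_version(mosaic_version)
    runtime = parse_version(runtime_version)
    if not mosaic or not runtime:
        raise ValueError("could not parse version strings")
    if runtime[0] >= 13:
        return mosaic >= (0, 4)
    return True


def check_before_enable(mosaic_version):
    """Raise a readable error before enable_mosaic() is called."""
    # Assumption: this variable is set on the cluster; skip the check if not.
    runtime_version = os.environ.get("DATABRICKS_RUNTIME_VERSION")
    if runtime_version is None:
        return
    if not mosaic_supports_runtime(mosaic_version, runtime_version):
        raise RuntimeError(
            f"mosaic=={mosaic_version} is not expected to work on "
            f"runtime {runtime_version}; runtime 13 needs mosaic 0.4 or later"
        )
```

In a notebook this would be called as check_before_enable("0.3.11") right before enable_mosaic(spark, dbutils), so the version mismatch shows up before the JVM call does.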

landlord-matt commented 1 year ago

They have now answered in this thread that they will fix this by the end of October 2023.