Closed: helenxl closed this issue 8 months ago
Hey @helenxl, this error means Python cannot find the Glow jars.
Did you install just the pypi package?
Glow also requires the jars that come from Maven coordinates.
In this case, io.projectglow:glow-spark3_2.12:1.1.1
What environment are you doing this in? Is it in Databricks or another Spark service or rolling your own Spark?
Thanks! I missed that requirement.
No sweat, I forgot too the first time I installed Glow via PyPI and Maven.
We also have Docker containers that contain all the jars and the PyPI package.
https://hub.docker.com/u/projectglow
On Databricks you can install via Databricks Container Services. For Glow v1.1.1, point to this Docker image URL:
projectglow/databricks-glow:1.1.1
Hi! May I ask what I should do in a Jupyter notebook? I ran into a similar problem...
import findspark
import pyspark
import glow
from pyspark.sql import SparkSession
findspark.init()
spark = SparkSession.builder.getOrCreate()
spark = glow.register(spark)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-15-079acb31ab5e> in <module>
1 import glow
----> 2 spark = glow.register(spark)
C:\anaconda\lib\site-packages\glow\glow.py in register(session, new_session)
78 sc = session._sc
79 return SparkSession(
---> 80 sc, session._jvm.io.projectglow.Glow.register(session._jsparkSession, new_session))
81
82
TypeError: 'JavaPackage' object is not callable
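For context on the traceback above: py4j returns a `JavaPackage` placeholder when it cannot resolve a name to a class on the JVM classpath, and calling a method on that placeholder raises this `TypeError`. A minimal sketch of a check that can be run before `glow.register(spark)`, assuming the same `_jvm` attribute the traceback shows Glow using; the helper name `glow_jars_visible` is hypothetical and not part of Glow.

```python
def glow_jars_visible(spark):
    """Return True if py4j resolves io.projectglow.Glow to a JVM class.

    When the Glow jars are missing, py4j hands back a JavaPackage
    placeholder instead of a JavaClass, so we inspect the type name.
    """
    # Same attribute path that glow.register uses in the traceback.
    handle = spark._jvm.io.projectglow.Glow
    return type(handle).__name__ == "JavaClass"
```

If this returns False for a running session, the jars are not on the classpath and `glow.register(spark)` should be expected to fail in the same way.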
@williambrandler
When specifying projectglow/databricks-glow:1.1.1, the Databricks cluster encountered an error pulling the image. I can pull the image fine using the docker CLI. Do you know what may be missing?
Thank you.
Cluster terminated. Reason: Docker image pull failure
Cannot launch the cluster because pulling the docker image failed. Please double check connectivity from workers to the container registry, as well as the credentials used to pull the image.
Internal error message: Container setup failed due to a docker image pull failure: Image doesn't exist or invalid credential to pull image from projectglow/databricks-glow:1.1.1 .
Stdout:
Stderr: time="2021-12-02T16:43:57Z" level=fatal msg="Error parsing image name \"docker://projectglow/databricks-glow:1.1.1 \": invalid reference format"
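One detail in the Stderr line above: the quoted image name has a space between `1.1.1` and the closing quote. Docker image references cannot contain whitespace, so a stray space in the cluster's image URL field may be what produces "invalid reference format", although the thread does not confirm this. A minimal sketch of a sanity check, where the function name `check_image_reference` is made up for illustration:

```python
def check_image_reference(image_url):
    """Return a list of whitespace problems found in an image reference."""
    problems = []
    stripped = image_url.strip()
    # Whitespace at either end, e.g. picked up when copying and pasting.
    if image_url != stripped:
        problems.append("leading or trailing whitespace")
    # Whitespace anywhere else in the reference is also invalid.
    if any(ch.isspace() for ch in stripped):
        problems.append("whitespace inside the reference")
    return problems
```

Running it on the value exactly as pasted into the cluster configuration would show whether the field needs to be retyped.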
@helenxl Did you get the issue resolved?
missed this, please share more information (such as a screenshot of cluster setup) @helenxl @Tabinda788
Yes, please go ahead and close this issue. I was able to use projectglow in Databricks.
@helenxl Can we make it work on local?
@Tabinda788 would Docker work for you? @edg1983 contributed a Dockerfile for running Glow outside of Databricks. We have put the image on the projectglow Docker Hub, and it can be run locally via Docker.
https://github.com/projectglow/glow/issues/494 https://github.com/projectglow/glow/pull/503 https://hub.docker.com/r/projectglow/open-source-glow
@Tabinda788 The fix is the same locally -- you need to install the maven library.
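For a local Jupyter session, one common way to supply the Maven library is Spark's `spark.jars.packages` setting, which asks Spark to fetch the listed coordinates when the JVM starts. A minimal sketch, assuming a Spark 3 build on Scala 2.12 and the same coordinate quoted earlier in this thread; the helper names `glow_spark_conf` and `build_glow_session` are hypothetical and only here for illustration.

```python
# Maven coordinate quoted earlier in this thread (Glow 1.1.1, Spark 3, Scala 2.12).
GLOW_MAVEN_COORDINATE = "io.projectglow:glow-spark3_2.12:1.1.1"


def glow_spark_conf(coordinate=GLOW_MAVEN_COORDINATE):
    """Return the Spark setting that pulls the Glow jars from Maven."""
    return {"spark.jars.packages": coordinate}


def build_glow_session(coordinate=GLOW_MAVEN_COORDINATE):
    """Start a SparkSession with the Glow jars, then register Glow."""
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark.sql import SparkSession
    import glow

    builder = SparkSession.builder
    for key, value in glow_spark_conf(coordinate).items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()
    return glow.register(spark)
```

Note that `getOrCreate()` reuses an already running session, so the setting only takes effect if it is applied before the first session starts. Restarting the notebook kernel before calling `build_glow_session()` is the safe route.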
I am running an example notebook from Databricks. I have installed Glow version 1.1.1 for this cluster. I am encountering an error with glow.register(spark). What am I missing?