Closed maxreis86 closed 2 years ago
Hello @maxreis86,
Thank you for reporting that. I was able to reproduce and analyse your issue. The cause is a somewhat complicated classpath problem that will need more time to resolve.
In the meantime I would suggest a workaround:
- In the `jars` folder you will find a jar named (depending on the version) like this: sparkling-water-assembly-scoring_2.12-3.36.0.4-1-3.1-all.jar
- Set the `Dependent JARs path` option to the location of that jar
- Keep the `additional-python-modules` parameter as is

Hi dear @krasinski,
I was able to run it successfully by following your steps!
s3://bucket_name/sparkling-water-assembly-scoring_2.12-3.36.1.1-1-3.1-all.jar
Thank you so much for your help!!
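For anyone setting this up programmatically rather than through the console, here is a minimal sketch of how the workaround could map onto Glue job parameters. It assumes that the console's `Dependent JARs path` field corresponds to the `--extra-jars` special job parameter. The helper name `build_glue_default_arguments` is hypothetical, and the S3 URI just combines the `bucket_name` placeholder with the jar name mentioned earlier in the thread.

```python
def build_glue_default_arguments(jar_s3_uri, python_modules):
    """Build Glue default arguments for the workaround (hypothetical helper).

    Assumption: "Dependent JARs path" in the console maps to --extra-jars.
    """
    if not jar_s3_uri.startswith("s3://"):
        raise ValueError("expected an S3 URI for the dependent jar")
    return {
        # The Sparkling Water scoring assembly jar uploaded to S3.
        "--extra-jars": jar_s3_uri,
        # Left as is: the pysparkling package is still installed via pip.
        "--additional-python-modules": ",".join(python_modules),
    }


args = build_glue_default_arguments(
    "s3://bucket_name/sparkling-water-assembly-scoring_2.12-3.36.0.4-1-3.1-all.jar",
    ["h2o-pysparkling-3.1==3.36.0.4.post1"],
)
print(args)
```

The resulting dictionary is the kind of value one would pass as `DefaultArguments` when creating or updating the job through the AWS SDK. Treat it as a starting point and adjust the versions to match your own setup.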
Hello everybody,
I am trying to use pysparkling.ml.H2OMOJOModel (h2o-pysparkling-3.1==3.36.0.4.post1) to score a Spark DataFrame with a MOJO model trained with h2o==3.32.0.2 in an AWS Glue job; however, I get the error: TypeError: 'JavaPackage' object is not callable.
I opened a ticket with AWS support and they confirmed that the Glue environment is OK and that the problem is probably with sparkling-water (pysparkling). It seems that some dependency library is missing, but I have no idea which one. The simple code below works perfectly when I run it on my local computer (I only need to change the MOJO path for GBM_grid__1_AutoML_20220323_233606_model_53.zip). The error occurs with any MOJO.zip file.
Has anyone ever run sparkling-water in Glue jobs successfully?
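As background on the error itself: py4j returns a `JavaPackage` placeholder for any dotted name it cannot resolve to a class loaded in the JVM, and calling that placeholder raises this `TypeError`. The sketch below is one way to check whether a given class name resolves. The helper names are hypothetical, and the usage in the comments relies on `spark.sparkContext._jvm`, which is a private PySpark attribute and may change between versions.

```python
from functools import reduce


def resolve_jvm_name(jvm, dotted_name):
    """Walk a dotted Java name on a py4j JVM view, one attribute at a time."""
    return reduce(getattr, dotted_name.split("."), jvm)


def is_loaded_class(jvm, dotted_name):
    """Return True if the name resolves to something other than a JavaPackage.

    The type is compared by name so this sketch does not need to import py4j.
    """
    return type(resolve_jvm_name(jvm, dotted_name)).__name__ != "JavaPackage"


# Possible usage inside a job (not executed here):
#   jvm = spark.sparkContext._jvm
#   is_loaded_class(jvm, "java.util.ArrayList")  # a JDK class, should resolve
#   is_loaded_class(jvm, "<fully qualified class from the missing jar>")
```

If a class that should come from a dependency jar does not resolve, the jar is most likely missing from the JVM classpath.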
Job Details:
- Glue version: 3.0
- --additional-python-modules: h2o-pysparkling-3.1==3.36.0.4.post1
- Worker type: G1.X
- Number of workers: 2
- Using script "createFromMojo.py"
createFromMojo.py: