Closed: Raidus closed this issue 2 years ago
This should be fixed now
I am having the same problem on Databricks.
from pyspark.sql import SparkSession, types
from pyspark.context import SparkConf, SparkContext

conf = SparkConf()
conf.set('spark.driver.extraClassPath', '/local_disk0/.ephemeral_nfs/envs/pythonEnv-f2bd0903-cb94-4620-9902-ef4520515c1b/lib/python3.8/site-packages/splink/jars/scala-udf-similarity-0.0.9.jar')  # Not needed in spark 3
conf.set('spark.jars', '/local_disk0/.ephemeral_nfs/envs/pythonEnv-f2bd0903-cb94-4620-9902-ef4520515c1b/lib/python3.8/site-packages/splink/jars/scala-udf-similarity-0.0.9.jar')
spark.udf.registerJavaFunction('jaro_winkler_sim', 'uk.gov.moj.dash.linkage.JaroWinklerSimilarity', types.DoubleType())
sc = SparkContext.getOrCreate(conf=conf)
spark = SparkSession(sc)
AnalysisException: Can not load class uk.gov.moj.dash.linkage.JaroWinklerSimilarity, please make sure it is on the classpath
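One thing that may be worth checking: in the snippet as posted, the jar settings are passed to SparkContext.getOrCreate only after registerJavaFunction has already been called. If a context is already running (as is usually the case in a Databricks notebook), getOrCreate returns the existing one, so the new jar settings may never take effect. Below is a minimal sketch of the other ordering, assuming a fresh Spark process where the jar can still be added at startup. The helper names jar_settings and build_session_with_udf are hypothetical, not part of splink or PySpark.

```python
def jar_settings(jar_path):
    """Spark config entries that point at the similarity jar (hypothetical helper)."""
    return {
        "spark.driver.extraClassPath": jar_path,  # Not needed in spark 3
        "spark.jars": jar_path,
    }


def build_session_with_udf(jar_path):
    """Create the session with the jar configured first, then register the UDF."""
    # Imported here so the helper above can be used without pyspark installed.
    from pyspark.sql import SparkSession, types

    builder = SparkSession.builder
    for key, value in jar_settings(jar_path).items():
        builder = builder.config(key, value)

    # Assumption: no session exists yet. If one does, getOrCreate() returns it
    # and startup-only settings such as spark.jars may be ignored.
    spark = builder.getOrCreate()

    spark.udf.registerJavaFunction(
        "jaro_winkler_sim",
        "uk.gov.moj.dash.linkage.JaroWinklerSimilarity",
        types.DoubleType(),
    )
    return spark
```

On a managed cluster where the session is created for you, this ordering alone may not help. Attaching the jar to the cluster itself before it starts is probably the route to try there.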
I have the same issue as above, also in Databricks.
Hi,
I am having problems running the example script quickstart_demo_link_and_dedupe.ipynb. It gives the error:
Can not load class uk.gov.moj.dash.linkage.JaroWinklerSimilarity, please make sure it is on the classpath;
The file exists, and I also tried giving the absolute path to the file scala-udf-similarity-0.0.6, but the behavior is still the same. Any ideas?
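A quick way to rule out one cause is to check whether the jar on disk actually contains the class named in the error. A jar is a zip archive, so the standard library is enough. This is only a diagnostic sketch: the function name jar_contains_class is hypothetical, and the entry path is inferred from the class name in the error message. Whether a given jar version ships that class is not something this thread confirms.

```python
import zipfile

# Entry path derived from the class name in the error message (an assumption).
CLASS_ENTRY = "uk/gov/moj/dash/linkage/JaroWinklerSimilarity.class"


def jar_contains_class(jar_path, class_entry=CLASS_ENTRY):
    """Return True if the jar (a zip archive) has an entry for the given class."""
    with zipfile.ZipFile(jar_path) as jar:
        return class_entry in jar.namelist()
```

If this returns True for the path you pass to Spark, the jar itself is probably fine and the problem is more likely that Spark never picked it up, for example because the session was already running when the path was set.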