moj-analytical-services / splink_demos

Interactive notebooks containing demonstration code of the splink library

AnalysisException: Can not load class uk.gov.moj.dash.linkage.JaroWinklerSimilarity, please make sure it is on the classpath #20

Closed EdiLucy closed 2 years ago

EdiLucy commented 2 years ago

I get the error below when using Splink in Databricks:

from pyspark import SparkConf
from pyspark.sql import SparkSession
import pyspark.sql.types as T

# Point Spark at the similarity JAR bundled with splink
conf = SparkConf()
conf.set('spark.jars', '/databricks/python/lib/python3.8/site-packages/splink/jars/scala-udf-similarity-0.0.9.jar')
spark = SparkSession.builder.config(conf=conf).getOrCreate()

# Register the Scala UDF as a Spark SQL function
spark.udf.registerJavaFunction('jaro_winkler_sim', 'uk.gov.moj.dash.linkage.JaroWinklerSimilarity', T.DoubleType())

Error Message: AnalysisException: Can not load class uk.gov.moj.dash.linkage.JaroWinklerSimilarity, please make sure it is on the classpath

RobinL commented 2 years ago

Please see: https://github.com/moj-analytical-services/splink/issues/257
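Before digging further, it may be worth ruling out a wrong JAR path, since the version in the file name changes between splink releases. The snippet below is a minimal sketch, not part of splink: `find_similarity_jars` is a hypothetical helper, and it assumes the JARs sit in a `jars` folder inside the installed package, as in the path quoted above.

```python
import glob
import os


def find_similarity_jars(package_dir):
    """Return the sorted paths of scala-udf-similarity JARs under <package_dir>/jars.

    Hypothetical helper: assumes the layout seen in the path from the report
    above (.../site-packages/splink/jars/scala-udf-similarity-<version>.jar).
    """
    pattern = os.path.join(package_dir, "jars", "scala-udf-similarity-*.jar")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Package directory taken from the path in the original report
    splink_dir = "/databricks/python/lib/python3.8/site-packages/splink"
    jars = find_similarity_jars(splink_dir)
    if jars:
        print("Found JAR(s):", jars)
    else:
        print("No scala-udf-similarity JAR found under", splink_dir)
```

If the JAR is present and the error persists, the path itself is probably not the problem. On Databricks the Spark session is typically already running when the notebook starts, so a `spark.jars` value set from inside the notebook may not be picked up; attaching the JAR to the cluster as a library is a commonly used alternative.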