moj-analytical-services / splink_demos

Interactive notebooks containing demonstration code of the splink library

Can not load class uk.gov.moj.dash.linkage.JaroWinklerSimilarity, please make sure it is on the classpath; #8

Closed Raidus closed 2 years ago

Raidus commented 3 years ago

Hi,

I am having problems running the example notebook quickstart_demo_link_and_dedupe.ipynb.

It gives the error: Can not load class uk.gov.moj.dash.linkage.JaroWinklerSimilarity, please make sure it is on the classpath;

The file exists, and I also tried giving the absolute path to the file scala-udf-similarity-0.0.6, but the behavior is still the same.

Any ideas?

RobinL commented 2 years ago

This should be fixed now

nsharkey commented 2 years ago

I am having the same problem on Databricks.

from pyspark.sql import SparkSession, types
from pyspark.context import SparkConf, SparkContext

conf = SparkConf()
conf.set('spark.driver.extraClassPath', '/local_disk0/.ephemeral_nfs/envs/pythonEnv-f2bd0903-cb94-4620-9902-ef4520515c1b/lib/python3.8/site-packages/splink/jars/scala-udf-similarity-0.0.9.jar')  # Not needed in spark 3
conf.set('spark.jars', '/local_disk0/.ephemeral_nfs/envs/pythonEnv-f2bd0903-cb94-4620-9902-ef4520515c1b/lib/python3.8/site-packages/splink/jars/scala-udf-similarity-0.0.9.jar')

spark.udf.registerJavaFunction('jaro_winkler_sim', 'uk.gov.moj.dash.linkage.JaroWinklerSimilarity', types.DoubleType())

sc = SparkContext.getOrCreate(conf=conf)
spark = SparkSession(sc)

AnalysisException: Can not load class uk.gov.moj.dash.linkage.JaroWinklerSimilarity, please make sure it is on the classpath

EdiLucy commented 2 years ago

I have the same issue as above, also in Databricks.
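
For anyone landing here with the same error, the following is a minimal sketch of the ordering the snippet above seems to be aiming for: find the JAR, set `spark.jars` while the session is being built, and only then register the UDF. The helper names `find_similarity_jar` and `build_session_with_udf` are hypothetical and are not part of splink. It also assumes that no Spark session is already running. `spark.jars` is read when the JVM starts, so on a platform that hands you a pre-created session (as Databricks notebooks do) `getOrCreate()` will likely return the existing session and the setting may have no effect; in that case the JAR probably has to be attached to the cluster itself.

```python
from pathlib import Path
from typing import Optional

UDF_NAME = "jaro_winkler_sim"
UDF_CLASS = "uk.gov.moj.dash.linkage.JaroWinklerSimilarity"


def find_similarity_jar(jars_dir: str) -> Optional[str]:
    """Return the path of a scala-udf-similarity JAR in jars_dir, or None.

    Hypothetical helper: if several versions are present, the
    lexicographically last file name is returned (not necessarily the newest).
    """
    matches = sorted(Path(jars_dir).glob("scala-udf-similarity-*.jar"))
    return str(matches[-1]) if matches else None


def build_session_with_udf(jar_path: str):
    """Create a SparkSession with the JAR configured, then register the UDF.

    Assumption: no session exists yet, so the spark.jars setting is applied
    when the session (and its JVM) is started.
    """
    # Imported lazily so find_similarity_jar works without pyspark installed.
    from pyspark.sql import SparkSession, types

    spark = (
        SparkSession.builder
        .config("spark.jars", jar_path)
        .getOrCreate()
    )
    # Register only after the session has been created with the JAR attached.
    spark.udf.registerJavaFunction(UDF_NAME, UDF_CLASS, types.DoubleType())
    return spark
```

Checking the result of `find_similarity_jar` for `None` before calling `build_session_with_udf` separates a "file not found" problem from a "file found but not on the JVM classpath" problem, which the error message alone does not distinguish.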