Closed karpanGit closed 1 year ago
I feel that the logistic regression and the ECM example code both need to be updated to reflect the latest API.
I managed to get the logistic regression working with:

    import numpy as np
    import pandas as pd
    import recordlinkage

    df_a = pd.DataFrame({'name': ['Panos', 'George', 'Maria', 'Panos'],
                         'age': [10, 20, 30, 40]},
                        index=['a1', 'a2', 'a3', 'a4'])
    df_b = pd.DataFrame({'name': ['Panoz', 'Georgi', 'Maria', 'Panos'],
                         'age': [11, 22, 33, 40]},
                        index=['b1', 'b2', 'b3', 'b4'])

    indexer = recordlinkage.Index()
    # indexer.block('name')
    indexer.full()
    # uniqueness of indexes is ensured
    candidate_links = indexer.index(df_a, df_b)

    compare = recordlinkage.Compare()
    compare.string('name', 'name', method='jarowinkler', threshold=0.85)
    compare.numeric('age', 'age')
    compare_vectors = compare.compute(candidate_links, df_a, df_b)

    # fit a logistic regression classifier
    true_linkage = pd.Series(
        np.where((compare_vectors[0] >= 1.) & (compare_vectors[1] <= 0.5),
                 'same', 'different'),
        index=compare_vectors.index)
    logrg = recordlinkage.LogisticRegressionClassifier()
    logrg.fit(compare_vectors, true_linkage[true_linkage == 'same'].index)
For the ECM example, the class BernoulliEMCClassifier does not seem to exist. Do you mean the ECM class?
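
One way to check which classifier classes an installed version actually exposes is to list them by introspection. This is only a minimal sketch: it assumes the package imports as `recordlinkage` and that its classifier classes have names ending in `Classifier`. The helpers `classifier_names` and `installed_classifiers` are names made up for this illustration and are not part of the library.

```python
import importlib
import inspect


def classifier_names(module):
    """Return the sorted names of public classes in `module` ending in 'Classifier'."""
    return sorted(
        name
        for name, obj in inspect.getmembers(module, inspect.isclass)
        if name.endswith("Classifier") and not name.startswith("_")
    )


def installed_classifiers(package="recordlinkage"):
    """List classifier classes of an installed package.

    Returns None when the package cannot be imported. The default
    package name is an assumption taken from the snippet above.
    """
    try:
        module = importlib.import_module(package)
    except ImportError:
        return None
    return classifier_names(module)
```

Calling `installed_classifiers()` in the environment where the examples fail should show whether a class with the name used in the documentation is present, or which similarly named class is available instead.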