Open devmcp opened 2 years ago
Pandas datatypes, such as pd.Int64Dtype (see here), do not seem to be supported:
```python
import pandas as pd
import recordlinkage
from recordlinkage.datasets import load_febrl4

dfA, dfB = load_febrl4()

# Convert column types to pandas nullable integer (Int64):
dfA.postcode = pd.to_numeric(dfA.postcode).convert_dtypes()
dfB.postcode = pd.to_numeric(dfB.postcode).convert_dtypes()

# Indexation step
indexer = recordlinkage.Index()
indexer.block("given_name")
candidate_links = indexer.index(dfA, dfB)

# Comparison step
compare_cl = recordlinkage.Compare()
compare_cl.numeric("postcode", "postcode", label="postcode")
features = compare_cl.compute(candidate_links, dfA, dfB)
```
gives the error:
TypeError: Cannot interpret 'Int64Dtype()' as a data type
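A possible workaround, which I have not verified against recordlinkage itself, is to cast the nullable column back to a plain NumPy float dtype before the comparison step. The sketch below uses only pandas; the helper name `to_numpy_float` and the sample postcode values are made up for illustration.

```python
import pandas as pd


def to_numpy_float(series: pd.Series) -> pd.Series:
    """Cast a nullable-integer column to float64; pd.NA becomes NaN."""
    return series.astype("float64")


# Small stand-in for the postcode column (hypothetical values).
postcode = pd.Series([2000, None, 3056], dtype="Int64")
converted = to_numpy_float(postcode)

print(postcode.dtype)   # nullable extension dtype
print(converted.dtype)  # plain NumPy dtype
```

Under that assumption, applying the same cast to `dfA.postcode` and `dfB.postcode` before calling `compare_cl.compute(...)` might avoid the error, at the cost of losing the nullable integer type.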