J535D165
/
recordlinkage
A powerful and modular toolkit for record linkage and duplicate detection in Python
http://recordlinkage.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
966
stars
152
forks
issues
How do I perform deduplication with the python record linkage toolkit with large data sets?
#207
sidhugithub1
opened
3 months ago
0
Changing norms of comparison functions
#205
JosephKuchar
opened
5 months ago
0
recordlinkage.NaiveBayesClassifier() fit returns multiindex of all feature pairs
#204
MWiggins
opened
6 months ago
0
Avoid np.log of zero in ECM
#203
emuccino
opened
9 months ago
0
Length mismatch at
#202
TongmengXie
opened
11 months ago
0
automatically check how many components are defined in rl.Compare()
#201
bergen288
opened
11 months ago
0
Duplicated matching columns with rl_comparer.compute while looping over zip code
#200
bergen288
closed
11 months ago
2
Add pre-commit hooks
#199
J535D165
closed
1 year ago
0
Update the docs CI pipeline
#198
J535D165
closed
1 year ago
0
Update CI docs generation and CI pipeline
#197
J535D165
closed
1 year ago
0
Lint with Ruff and format with Black
#196
J535D165
closed
1 year ago
0
Replace setup.py by pyproject.toml
#195
J535D165
closed
1 year ago
0
Address Matching Conditional on value of another column
#194
konsbn
opened
1 year ago
1
`ECMClassifier` returns almost all candidate pairs
#193
Evnsn
opened
1 year ago
2
Add support for pandas==2
#192
J535D165
closed
1 year ago
0
Fix usage examples
#190
martinhohoff
closed
1 year ago
2
add threshold None and label docstrings for String
#189
davidggphy
closed
1 year ago
0
Indexing - performance warning - full index can result in a large number of pairs
#187
gajghaten
opened
1 year ago
3
Fix links
#186
andyjessen
closed
1 year ago
0
update of the introduction
#185
karpanGit
closed
1 year ago
0
Fix typo
#184
havardox
closed
1 year ago
0
Candidate pairs issue
#183
Shivamkumar285
opened
2 years ago
0
For when support for packages like Dask or Ray (or Modin)?
#182
ialvata
opened
2 years ago
0
Possible bug with _dedup_index when df has only 1 row.
#181
IavTavares
opened
2 years ago
0
missing value is not working and it is default to 0 even if we change the value.
#180
selva221724
opened
2 years ago
1
Support for pandas datatypes
#179
devmcp
opened
2 years ago
0
How to utilize prob-related methods of ECM classifier
#178
Ramin1368
opened
2 years ago
0
AttributeError: module 'recordlinkage' has no attribute 'SortedNeighbourhoodIndex'
#176
naeemahaz
opened
2 years ago
1
Data Corruptors a la GeCO
#175
aflaxman
opened
2 years ago
0
Make use of nbsphinx for documentation and guides
#174
J535D165
closed
2 years ago
1
Remove deprecated recordlinkage classes
#173
J535D165
closed
2 years ago
0
fastparquet 0.8.1: writing dataframe to parquet file from a table data field with rtf doc content falls with TypeError exception
#172
PavelD0770
opened
2 years ago
0
Bump min Python version to 3.6, ideally 3.8+
#171
J535D165
closed
2 years ago
0
Fix various deprecation warnings and broken docs build
#170
J535D165
closed
2 years ago
0
fixing failed build-docs action
#169
twalen
closed
2 years ago
1
fixing broken build and removed some warnings
#168
twalen
closed
2 years ago
1
optimize Performance ?
#167
jigar-prajapati18
opened
2 years ago
0
What languages are supported by this toolkit? only English?
#166
yoeldk
opened
2 years ago
0
compare.date
#165
yishairasowsky
opened
2 years ago
0
Update ref-compare.rst
#164
hwong557
closed
2 years ago
1
Update ref-compare.rst
#163
hwong557
closed
2 years ago
1
missing values
#162
yishaistreamline
opened
2 years ago
4
threshold in at compere is broken
#161
skuam
opened
3 years ago
0
Option to return intersection of pairs returned from indexers rather than union
#160
chriskl
opened
3 years ago
0
Compare.compute return real score for each metric, not binary `0`/`1` after threshold.
#159
oyeromenko-ebsco
closed
3 years ago
3
Fix random indexer
#158
tteigman
closed
2 years ago
1
Recordlinkage, ValueError: index of DataFrame is not unique
#157
lsun907
opened
3 years ago
3
Network OnetoMany docs
#156
Davide-Bianchi
opened
3 years ago
0
ECM algorithm on large data sets
#155
gnatarajanmboard
opened
3 years ago
1
Update data_deduplication.rst
#154
hwong557
closed
2 years ago
0