NickCrews
/
mismo
The SQL/Ibis powered sklearn of record linkage
https://nickcrews.github.io/mismo/
GNU Lesser General Public License v3.0
14
stars
3
forks
issues
Expose FindResults publicly
#75
NickCrews
opened
2 weeks ago
0
Incorrect distance constraint using CoordinateBlocker
#74
jstammers
closed
1 month ago
6
Add train using pairs
#73
lmores
closed
3 weeks ago
2
Skip tests that require spacy when it is not installed
#72
lmores
closed
2 weeks ago
3
Add splink-like string similarity comparisons
#71
jstammers
opened
1 month ago
0
Add implementation for built-in jaccard similarity
#70
jstammers
opened
1 month ago
3
Update string similarity measures
#69
jstammers
closed
1 month ago
3
Enhance String Comparison Utilities
#66
jstammers
opened
1 month ago
1
fix: edge case when n_possible_pairs < max_pairs
#65
jstammers
closed
1 month ago
1
Unable to call UDF-based LevelComparer within pyspark UDF
#64
jstammers
closed
1 month ago
7
Unable to sample when max_pairs is greater than or equal to n_possible_pairs
#63
jstammers
closed
1 month ago
2
JSON-serialization of Blocking Rules
#62
jstammers
opened
1 month ago
2
Fix usage of ibis.range()
#61
lmores
closed
1 month ago
1
Add missing prefix for f-strings
#60
lmores
closed
2 months ago
1
Invert color brightness in compared dashboard
#59
lmores
closed
2 months ago
1
High odds in Fellegi-Sunter model after expectation maximization
#57
lmores
closed
1 month ago
8
feat: implement reproducible table sampling
#56
NickCrews
opened
3 months ago
0
Enable unit tests using pyspark backend
#55
jstammers
opened
3 months ago
4
Add active learning API for predicate-based blocking and matching
#54
jstammers
opened
3 months ago
2
Unable to use 'cluster.connected_components' on a pyspark dataframe
#53
jstammers
closed
3 months ago
2
Feat: Implement additional fuzzy string similarities
#51
jstammers
opened
3 months ago
0
Incorrect join on large tables for add_tfidf
#50
jstammers
opened
4 months ago
4
Perf: optimize postal_parse_address implementation
#49
lmores
opened
4 months ago
3
postal_parse_address() performance
#48
lmores
closed
4 months ago
8
Address parsing performance
#47
lmores
opened
4 months ago
4
Add Levenshtein ratio to mismo.text
#46
jstammers
closed
5 months ago
1
DuckDb ConversionException when running MinHashLSH example
#45
lmores
closed
3 months ago
6
Add ability to sample from blocked pairs when training an FS model
#44
jstammers
opened
5 months ago
3
[ImgBot] Optimize images
#43
imgbot[bot]
closed
4 months ago
0
Poor scaling of add_tfidf to larger datasets
#42
jstammers
closed
5 months ago
7
Expose Leipzig affiliations dataset
#41
NickCrews
opened
6 months ago
0
Add Address Parsing and Comparison with Postal
#40
jstammers
closed
4 months ago
26
Fix typo in doc
#39
lmores
closed
6 months ago
1
Incremental clustering
#36
lmores
closed
7 months ago
1
Inefficient Sampling From Known Labels
#35
jstammers
closed
5 months ago
17
Add Leipzig affiliations raw dataset
#34
OlivierBinette
closed
6 months ago
1
Add RLData datasets
#33
OlivierBinette
closed
6 months ago
1
benchmarks for array.filter(x -> x.isin(<column from other relation>))
#32
NickCrews
closed
8 months ago
1
Add TF-IDF comparer based on sklearn
#31
NickCrews
opened
8 months ago
0
chore(deps): bump the github-actions group with 1 update
#30
dependabot[bot]
closed
8 months ago
0
joining on arrays is slow
#29
NickCrews
closed
9 months ago
3
explore ipydatagrid for showing data
#28
NickCrews
opened
9 months ago
0
feat: test on spark using docker
#27
NickCrews
opened
9 months ago
1
Consider supporting latent-entity based algorithms
#26
NickCrews
opened
9 months ago
0
Add RLdata and Union Army datasets
#25
OlivierBinette
closed
7 months ago
8
Add datasets
#24
OlivierBinette
closed
10 months ago
2
Minor changes to documentation and contribution guide
#23
OlivierBinette
closed
10 months ago
3
feat: plot clusters
#22
NickCrews
closed
10 months ago
1
Testing: test_fs is too computationally and memory intensive
#21
OlivierBinette
closed
10 months ago
8
Why deal with left and right tables?
#20
OlivierBinette
closed
11 months ago
4