-
When the user enters a Property Name, check whether similar names already exist in the same ecosystem. (Not sure how to define "similar" or how to detect such names; maybe there's a standard text search functio…
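One way to make "similar" concrete is a normalized string-similarity score with a cutoff. The sketch below uses `difflib.get_close_matches` from the Python standard library; the function name `find_similar_names`, the `0.8` cutoff, and the case-insensitive comparison are assumptions for illustration, not part of any existing implementation.

```python
import difflib


def find_similar_names(candidate, existing_names, cutoff=0.8, limit=5):
    """Return up to `limit` existing names that look similar to `candidate`.

    Comparison is case-insensitive (an assumption); `cutoff` is the minimum
    difflib similarity ratio in [0, 1] for a name to count as similar.
    """
    # Map case-folded names back to their original spelling.
    by_folded = {}
    for name in existing_names:
        by_folded.setdefault(name.casefold(), name)

    matches = difflib.get_close_matches(
        candidate.casefold(), list(by_folded), n=limit, cutoff=cutoff
    )
    return [by_folded[m] for m in matches]


if __name__ == "__main__":
    # Hypothetical property names already present in an ecosystem.
    existing = ["color", "Colour", "height"]
    print(find_similar_names("colour", existing))
```

The cutoff would need tuning against real property names; a lower value catches more typos but also produces more false alarms.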
-
So I am going to make a larger pull request on this, but I noticed there were some optimization problems with the gamma*() functions.
**Avoidance of factors**
I notice you coerce the inputs into …
-
I am trialing a RegEx feature for openSquat.
`git clone https://github.com/atenreiro/regex_opensquat`
1. Make sure to install the dependencies from requirements.txt
2. Modify keywords.txt
3. In the regex_mult…
-
This library is described as fuzzy string matching with Levenshtein distance. However, it doesn't seem to use Levenshtein at all?
`fuzz.ratio("tide", "diet")` returns:
- 50 with python-Levenshtein i…
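To make the discrepancy easier to reason about, here is a minimal sketch comparing the classic Levenshtein distance (substitution costs 1) with an insert/delete-only variant (a substitution counts as a delete plus an insert, so it costs 2). The helper names `edit_distance` and `indel_ratio` are hypothetical, and the idea that a score of 50 comes from the second weighting is an assumption, not a confirmed description of either library's internals.

```python
def edit_distance(a: str, b: str, sub_cost: int = 1) -> int:
    """Dynamic-programming edit distance with a configurable substitution cost.

    sub_cost=1 gives the classic Levenshtein distance; sub_cost=2 makes a
    substitution as expensive as a delete plus an insert (indel distance).
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(
                min(
                    prev[j] + 1,  # delete ca
                    curr[j - 1] + 1,  # insert cb
                    prev[j - 1] + (0 if ca == cb else sub_cost),  # match/substitute
                )
            )
        prev = curr
    return prev[-1]


def indel_ratio(a: str, b: str) -> int:
    """Similarity in [0, 100], normalized by the combined length of both strings."""
    total = len(a) + len(b)
    if total == 0:
        return 100
    return round(100 * (1 - edit_distance(a, b, sub_cost=2) / total))


if __name__ == "__main__":
    print(edit_distance("tide", "diet"))  # classic Levenshtein distance
    print(indel_ratio("tide", "diet"))    # indel-based ratio
```

For "tide" and "diet" the classic distance is 3 and the indel-based ratio is 50, so the two weightings can give noticeably different scores for the same pair.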
-
Regression tests fail with PG12beta1 because floating point output is now more precise by default:
```
16:50:42 --- /tmp/autopkgtest.SxiwfA/tree/expected/test1.out 2019-05-21 14:50:21.000000000 +000…
-
The openinv command autocompletes to the nearest player if no exact match is found, which is fine, but on a server with hundreds of thousands of player data files this causes the command to take minutes t…
-
**Describe the bug**
Ran into an out-of-range error when using the tool with this CSV input:
```
"id","truth_value","family_name","given_name","gender","birth_date","phone","street_address","city","state"…
-
### Is your proposal related to a problem?
I would like to be able to use Splink with embedding-based similarity functions, specifically with duckdb and Athena backends.
For example, to evaluate…
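For reference, an embedding-based comparison usually reduces to a vector similarity such as cosine similarity. The sketch below is plain Python and only illustrates the computation; the function name `cosine_similarity` is hypothetical, it does not use Splink's API, and how such a function would be exposed on the duckdb or Athena backends is not shown here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors.

    Returns 0.0 when either vector has zero length, which is a convention
    chosen for this sketch rather than a universal rule.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")

    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy three-dimensional "embeddings" for two records.
    print(cosine_similarity([0.1, 0.3, 0.5], [0.2, 0.6, 1.0]))
```

A comparison level could then be defined by thresholding this score, with the thresholds chosen empirically for the embedding model in use.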
-
Add handling for poorly named files:
- [x] Synthesise poorly named files
- [x] Update classifier logic to factor in poorly named files
-
The new Rust backend appears to cause a pretty steep performance regression in the hamming implementation:
## Old
![hamming_old](https://user-images.githubusercontent.com/44199644/233227647-b342…