-
As we can't uniquely identify each space with some field or value, we have to develop a method or process to identify when data may be repeated in the database. I'm currently working on some schemes t…
-
**Is your feature request related to a problem? Please describe.**
- The only static libraries I have access to are a _slightly_ different version than my target executables.
- This causes them to b…
-
- Move away from the asdf-based download/install/list model
- Ports will be composed of
- **Resolver**: function that'll change the user provided `InstallConfig` into one that the port can work …
-
Currently, the fuzzy string matching is done by combining the similarity estimates from the Damerau-Levenshtein and Ratcliff-Obershelp algorithms in quadrature. This turns out to be far too slow in si…
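
For illustration, here is a minimal sketch of what combining two similarity estimates in quadrature could look like. It is not the project's actual code. It assumes both estimates are normalised to the range 0 to 1, uses the restricted (optimal string alignment) variant of Damerau-Levenshtein, relies on Python's `difflib.SequenceMatcher` as a Ratcliff-Obershelp-style matcher, and divides by 2 under the square root so the combined score also stays between 0 and 1. All function names are hypothetical.

```python
from difflib import SequenceMatcher
from math import sqrt


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def dl_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to a similarity in [0, 1] (assumed normalisation)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - osa_distance(a, b) / longest


def ro_similarity(a: str, b: str) -> float:
    """Ratcliff-Obershelp-style similarity via difflib, in [0, 1]."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def combined_similarity(a: str, b: str) -> float:
    """Combine both estimates in quadrature, rescaled to stay in [0, 1]."""
    dl = dl_similarity(a, b)
    ro = ro_similarity(a, b)
    return sqrt((dl * dl + ro * ro) / 2.0)
```

The distance table alone costs time proportional to the product of the two string lengths for every pair compared, and the `difflib` matcher adds its own cost on top, which gives a sense of why scoring every candidate pair this way can become slow.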
-
### Context
A recurring need for us at ANCT is being able to match values that are "close enough". Often when importing pre-existing Excel files into Grist, we end up doing a lot of data-cleaning an…
-
With current master, I see the following behaviour:
```python
>>> from obspy import UTCDateTime
>>> dt1 = UTCDateTime(0.001, precision=2)
>>> dt2 = UTCDateTime(0.004, precision=2)
>>> dt3 = UTC…
```
-
The m4 comparison tools are the most complex, and much of the time spent changing them goes into making sure we don't break existing cases.
A simple test suite, maybe just with `diff` on output, to make …
-
I would like to propose a new feature to enhance the duplicate detection capabilities of TwinTrim by introducing a fuzzy matching technique. Currently, the tool relies on strict hashing to identify dupl…
-
Hi, I found this project by way of [your Stack Exchange post](https://dba.stackexchange.com/questions/72134/fast-hamming-distance-queries-in-postgres). I was considering trying something like this mys…