Blazing fast library for fuzzy filtering, matching, and other fuzzy things!
Zadeh is a blazing fast library for fuzzy filtering, matching, and other fuzzy things. Zadeh is a multithreaded library written in C++ with the goal to search through a dataset with 1M entries in a few hundred milliseconds.
The name "Zadeh" refers to Lofti Zadeh, the creator of fuzzy logic and fuzzy systems.
StringArrayFilterer
)TreeFilterer
)`, hyphen
-, underline
_`)\
or //
)match
)score
)wrap
)StringArrayFilterer
and TreeFilterer
classes, and then perform filter
multiple times, which is much more efficient than calling the filter
or filterTree
functions directly every time.This is a header-only library. Include ./src/zadeh.h
and build it in your application.
examples/example1.cpp
:
#include "../src/zadeh.h" // include zadeh.h
#include <string>
#include <iostream>
using namespace std;
int main() {
// the data to fuzzy search on
auto data = vector<string>{"eye", "why", "bi"};
// setup StringArrayFilterer
auto strArrFilterer = zadeh::StringArrayFilterer<vector<string>, string>{};
strArrFilterer.set_candidates(data);
// filter the indices that match the query
auto filtered_indices = strArrFilterer.filter_indices("ye");
// print the filtered data
for (auto ind: filtered_indices) {
cout << data[ind] << '\n';
}
}
Cmake file:
cmake_minimum_required(VERSION 3.17)
project(example1 LANGUAGES CXX)
add_executable(example1 ./examples/example1.cpp)
target_compile_features(example1 PRIVATE cxx_std_17)
Build:
cmake -S . -B ./build && cmake --build ./build --config Debug
Installation:
npm install zadeh
To import all the functions:
import * as zadeh from "zadeh"
or
const zadeh = require("zadeh")
StringArrayFilterer
is a class that allows setting the candidates
only once and perform filtering on them multiple times. This is much more efficient than calling the filter
function directly.
Example:
const { StringArrayFilterer } = require("zadeh")
// create class
const strArrFilterer = new StringArrayFilterer()
// set the candidates
strArrFilterer.setCandidates(["Call", "Me", "Maybe"])
// call filter multiple times
strArrFilterer.filter("me")
strArrFilterer.filter("all")
ObjectArrayFilterer is a class that performs filtering on an array of objects based on a string stored in the given dataKey
for each object
Example:
const { ObjectArrayFilterer } = require("zadeh")
const candidates = [
{ name: "Call", id: 1 },
{ name: "Me", id: 2 },
{ name: "Maybe", id: 3 },
]
// create a class and set the candidates
const objArrFilterer = new ObjectArrayFilterer(candidates, "name") // filter based on their name
// call filter multiple times
objArrFilterer.filter("me") // [{ name: 'Me', id: 2 }, { name: 'Maybe', id: 3}] // finds two objects
objArrFilterer.filter("all") // [{ name: 'Call', id: 1 }]
TreeFilterer filters the given query in the nodes of the given array of trees and returns an array of filtered
trees (or the indices of the filter candidates). A tree object is an object in which each entry stores the data in its dataKey
, and it has (may have) some
children (with a similar structure) in its childrenKey
Example:
const { TreeFilterer } = require("zadeh")
const treeFilterer = new TreeFilterer()
const candidates = [
{ data: "bye1", children: [{ data: "hello" }] },
{ data: "Bye2", children: [{ data: "_bye4" }, { data: "hel" }] },
{ data: "eye" },
]
treeFilterer.setCandidates(candidates, "data", "children")
treeFilterer.filter("hel")
returns
;[
{ data: "Bye2", children: [{ data: "hel" }] },
{ data: "bye1", children: [{ data: "hello" }] },
]
treeFilterer.filter("bye")
returns
;[
{ data: "bye1", children: [] },
{ data: "Bye2", children: [{ data: "_bye4" }] },
{ data: "Bye2", children: [] },
]
treeFilterer.filterIndices("bye")
returns
;[
{ data: "bye1", index: 0, parent_indices: [] },
{ data: "_bye4", index: 0, parent_indices: [1] },
{ data: "Bye2", index: 1, parent_indices: [] },
]
score(string, query, options = {})
Score the given string against the given query.
string
- the string to score.query
- The query to score the string against.const { score } = require('zadeh')
score('Me', 'me') # 0.17099999999999999
score('Maybe', 'me') # 0.0693
match(string, query, options = {})
Gives an array of indices at which the query matches the given string
const { match } = require("zadeh")
match("Hello World", "he") // [0, 1]
match("Hello World", "wor") // [6, 7, 8]
match("Hello World", "elwor") // [1, 2, 6, 7, 8]
wrap (string, query, options = {})
Gives an HTML/Markdown string that highlights the range for which the match happens
wrap("helloworld", "he")
helloworld
wrap("Hello world", "he")
Hello world
In all the above functions, you can pass an optional object with the following keys
{
/** only for `filter` function */
/** The key to use when candidates is an object */
key?: T extends string ? never : keyof T
/** only for `filter` function */
maxResults?: number
/** @default false */
allowErrors?: boolean
/** @default true */
usePathScoring?: boolean
/** @default false */
useExtensionBonus?: boolean
pathSeparator?: '/' | '\\' | string
}
These deprecated functions are provided to support the API of fuzzaldrin
and fuzzaldrin-plus
.
However, you should replace their usage with StringArrayFilterer
or ObjectArrayFilterer
classes that allow setting the candidates only once and perform filtering on those candidates multiple times. This is much more efficient than filter
or filterTree
functions.
API is backward compatible with Fuzzaldrin and Fuzzaldrin-plus. Additional functions are provided to achieve better performance that could suit your needs
Zadeh achieves 10x-20x performance improvement over Fuzzaldrin plus for chromium project with 300K files. This high performance is achieved using the following techniques.
~4x
performance benefit.~4x
performance benefit.This project potentially solves the following Atom fuzzy-finder issues if used. https://github.com/atom/fuzzy-finder/issues/271 and https://github.com/atom/fuzzy-finder/issues/88