mmihaltz / pysettrie

python3 package supporting efficient storage and querying of sets of sets using the trie data structure. Supports finding all the supersets/subsets of a given set from a collection of sets. Also includes a trie-based mapping container where the keys are sets.
GNU Lesser General Public License v3.0
24 stars 6 forks source link

pysettrie

Build Status

https://github.com/mmihaltz/pysettrie

pysettrie is a python3 package that provides support for efficient storage and querying of sets of sets using the trie data structure, supporting operations like finding all the supersets/subsets of a given set from a collection of sets. The original motivation for this module was to provide efficient search for supersets of sets of feature-value pairs in our natural language parser project (e.g. matching nouns against verb argument positions).

The following classes are included:

For further information, please see documentation

Module test_settrie.py contains unittests for all the containers.

Author: Márton Miháltz https://sites.google.com/site/mmihaltz/

This package depends on the sortedcollection module. One recommended way to install (tested on Ubuntu):

sudo pip3 install sortedcontainers

If you don't have pip3:

sudo apt-get install python3-setuptools
sudo easy_install3 pip

pysettrie is partly based on: I.Savnik: Index data structure for fast subset and superset queries. CD-ARES, IFIP LNCS, 2013. http://osebje.famnit.upr.si/~savnik/papers/cdares13.pdf Remarks on paper:

See also:

Changes:

Licensed under the GNU LESSER GENERAL PUBLIC LICENSE, Version 3.