Semi-Indexing Semi-Structured Data in Tiny Space

http://www.di.unipi.it/~ottavian/files/semi_index_cikm.pdf

Imagine you have a collection of large JSON or XML documents, and you want to run queries that each extract only a small subset of the data. You don't want to fully parse every document for every query, so you'd like to index the documents somehow. But they all have different tree structures, so it's not clear what an index would look like. This paper shows how to augment each document with a small amount of extra data (a "semi-index") that makes it very fast to search inside that document.
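To make the idea concrete, here is a rough sketch in Python of what "a small amount of data per document" could buy you. It is only an illustration, not the paper's data structure: a plain dict of bracket offsets stands in for the compact encoding, and the names `build_semi_index` and `get_field` are made up for this example. It assumes well-formed JSON.

```python
import json

OPEN, CLOSE = "{[", "}]"

def build_semi_index(doc):
    """One pass over the text: map each '{' or '[' offset to its matching close.

    String contents are skipped so brackets inside strings are not
    mistaken for structure. (Hypothetical helper, assumes valid JSON.)
    """
    match, stack, in_string, i = {}, [], False, 0
    while i < len(doc):
        ch = doc[i]
        if in_string:
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in OPEN:
            stack.append(i)
        elif ch in CLOSE:
            match[stack.pop()] = i
        i += 1
    return match

def _skip_ws(doc, i):
    while doc[i] in " \t\r\n":
        i += 1
    return i

def _string_end(doc, i):
    """Given the offset of an opening quote, return the offset just past the closing quote."""
    i += 1
    while doc[i] != '"':
        i += 2 if doc[i] == "\\" else 1
    return i + 1

def _value_end(doc, match, i):
    if doc[i] in OPEN:
        return match[i] + 1  # jump over the whole subtree without scanning it
    if doc[i] == '"':
        return _string_end(doc, i)
    while doc[i] not in ",}] \t\r\n":  # number, true, false, or null
        i += 1
    return i

def get_field(doc, match, obj_start, key):
    """Return (start, end) offsets of the value under `key` in the object
    that opens at obj_start, or None if the key is absent."""
    i = _skip_ws(doc, obj_start + 1)
    while doc[i] != "}":
        key_end = _string_end(doc, i)
        name = json.loads(doc[i:key_end])  # decode only the key itself
        colon = _skip_ws(doc, key_end)
        start = _skip_ws(doc, colon + 1)
        end = _value_end(doc, match, start)
        if name == key:
            return start, end
        i = _skip_ws(doc, end)
        if doc[i] == ",":
            i = _skip_ws(doc, i + 1)
    return None

doc = '{"id": 7, "payload": {"big": [1, 2, {"x": "}"}]}, "user": {"name": "ada"}}'
index = build_semi_index(doc)                    # built once per document
user = get_field(doc, index, 0, "user")          # skips "payload" in one hop
name = get_field(doc, index, user[0], "name")
print(json.loads(doc[name[0]:name[1]]))
```

The index is built once per document and reused across queries; each query then touches only the keys on its path and hops over every sibling subtree using the stored offsets. The original text is still needed to read keys and values, which is why this is an index over structure only and can stay small.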
