flightaware / speedtables

Speed tables is a high-performance memory-resident database. The speed table compiler reads a table definition and generates a set of C access routines to create, manipulate and search tables containing millions of rows. Currently oriented towards Tcl.
https://flightaware.github.io/speedtables/
BSD 3-Clause "New" or "Revised" License

add support for deduplication on varstring column types #33

Open bovine opened 11 years ago

bovine commented 11 years ago

Boost provides a "flyweight" template that would allow easy implementation of deduplication of string values, which could significantly reduce memory use if values tend to be repeated a lot. (Basically, it would let your tables be denormalized without the full storage overhead.)

http://www.boost.org/doc/libs/1_51_0/libs/flyweight/doc/tutorial/basics.html

The column could be defined as something like: varstring filename notnull 1 default "" dedupe 1
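
To illustrate the idea behind a deduplicated column, here is a minimal sketch of string interning in standard C++. It is not Boost.Flyweight and not speedtables code: `StringPool`, `Row`, and `load_rows` are hypothetical names made up for this example. Each distinct value is stored once in the pool, and every row holds only a pointer to the shared copy.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical interning pool (illustration only, not the speedtables API).
// Each distinct string is stored exactly once.
class StringPool {
public:
    // Return a pointer to the pooled copy of s, inserting it on first sight.
    // Pointers to std::unordered_set elements stay valid across rehashing.
    const std::string* intern(const std::string& s) {
        return &*pool_.insert(s).first;
    }

    // Number of distinct strings currently stored.
    std::size_t distinct() const { return pool_.size(); }

private:
    std::unordered_set<std::string> pool_;
};

// Hypothetical row with one deduplicated column.
struct Row {
    const std::string* filename;  // points into the pool, never owned by the row
};

// Build rows from raw values, sharing storage for repeated strings.
std::vector<Row> load_rows(StringPool& pool,
                           const std::vector<std::string>& values) {
    std::vector<Row> rows;
    rows.reserve(values.size());
    for (const std::string& v : values) {
        rows.push_back(Row{pool.intern(v)});
    }
    return rows;
}
```

With this layout, a million rows that repeat a few thousand filenames cost one pointer per row plus one copy of each distinct string. The sketch never frees pooled values, so a real implementation would also need some form of reference tracking to reclaim strings that no row uses any more.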

resuna commented 11 years ago

Speedtables is not in C++, so this would be kind of tricky.

bovine commented 11 years ago

My new "cpp" branch of speedtables is C++ and uses boost :)

resuna commented 11 years ago

That sounds... exciting? Adventurous? :)

apnadkarni commented 11 years ago

Please please please keep C as an option if you move to C++. Deployment is much simpler, at least for binary distributions on Windows platforms.

As an aside, when working from Tcl, deduplication should be easy enough for the application to handle by defining the field as tclobj.