abdullin / cellar

Append-only storage in Go designed for the analytical workloads
https://abdullin.com/bitgn/real-time-analytics/
BSD 3-Clause "New" or "Revised" License
62 stars 6 forks source link
append-only compressed go lmdb storage

Cellar

Build Status

Cellar is the append-only storage backend in Go designed for the analytical workloads. It replaces geyser-net.

Core features:

This storage takes ideas from the Message Vault, which was based on the ideas of Kafka and append-only storage in Lokad.CQRS

Analytical pipeline on top of this library was deployed at HappyPancake to run real-time aggregation and long-term data analysis on the largest social website in Sweden. You can read more about it in Real-time Analytics with Go and LMDB.

Contributors

In the alphabetical order:

Don't hesitate to send a PR to include your profile.

Design

Cellar stores data in a very simple manner:

Writing

You can have only one writer at a time. This writer has two operations:

The store is optimized for throughput. You can efficiently execute thousands of appends followed by a single call to Checkpoint.

Whenever a buffer is about to overflow (exceed the predefined max size), it will be "sealed" into an immutable chunk (compressed, encrypted and added to the chunk table) and replaced by a new buffer.

See tests in writer_test.go for sample usage patters (for both writing and reading).

Reading

At any point in time multiple readers could be created via NewReader(folder, encryptionKey). You can optionally configure reader after creation by setting StartPos or EndPos to constrain reading to a part of the database.

Readers have following operations available:

Unit tests in writer_test.go feature use of readers as well.

Note, that the reader tries to help you in achieving maximum throughput. While reading events from the chunk, it will decrypt and unpack the entire file in one go, allocating a memory buffer. All individual event reads will be performed against this buffer.

Example: Incremental Reporting

This library was used as a building block for capturing millions and billions of events and then running reports on them. Consider a following example of building an incremental reporting pipeline.

There is an external append-only storage with billions of events and a few terabytes of data (events are compressed separately with an equivalent of Snappy). It is located on a remote storage (cloud or a NAS). It is required to run custom reports on this data, refreshing them every hour.

Cellar storage could be used to serve as a local cache on a dedicated reporting machine (e.g. you can find an instance with 32GB of RAM, Intel Xeon and 500GB of NNVMe SSD under 100 EUR per month). Since Cellar storage compresses events in chunks, high compression ratio could be achieved. For instance, protobuf messages tend to get compression of 2-10 in chunks.

A solution might include an equivalent of a cron job that will execute following apps in sequence:

All these steps usually execute fast even on large datasets, since (1) and (2) are incremental and operate only on the fresh data. (3) can require full DB, however it works with the optimized and compacted data, hence it will be fast as well. To get the most performance, you might need to structure your messages for very fast reads without unnecessary memory allocations or CPU work (e.g. using something like FlatBuffers instead of JSON or ProtoBuf).

Note, that the compaction job is optional. However, on fairly large datasets, it might make sense to optimize messages for very fast reads, while discarding all the unnecessary information. Should the job requirements change, you'll need to update the compaction logic, discard the compacted store and re-process all the raw data from the start.

License

3-clause BSD license.