phaistos-networks / TANK

A very high performance distributed log service
Apache License 2.0
938 stars 70 forks source link

Optimizations for consumer/tailing semantics #63

Closed markpapadakis closed 4 years ago

markpapadakis commented 6 years ago

There is a low-hanging fruit kind of optimization that’s relatively easy to implement that I expect will yield some great results. The vast majority of all partition consumers tail it, as opposed to consuming from some random sequence number in the sequence of the message numbers space of the partition. This means that the sequence number provided in the CONSUME request almost always is in the offsets range of the current/active segment. service.cpp#read_cur() is responsible for streaming from that current partition segment, and now it uses the skiplist to determine where to start reading from in the segment in order to get to the file offset for the sequence number requested.

That is to say, it will determine the offset in the log where the message with the highest seq.um that is <= target seq.num is found, and then use linear search and successive read()s to locate it. It will wind up invoking search_before_offset(), which is where we are incurring some sequential I/O overhead. That function will perform a linear search by reading 128-bytes at a time until it finds a message(set) with absolute sequence number > max absolute sequence number, in order to eventually determine the file offset of that message.

However, we should also use a small cache that fronts all requests to read_cur(). It should cache all results of read_cur() current implementation, and it should first look for a match in the cache(fast-path) before doing whatever read_cur() does not. That way, and because we can and should assume that the vast majority of CONSUME requests are, as stated, for the specific purpose of tailing the partition, and thus routed to read_cur(), We only need a small cache(maybe a few hundred values in size/capacity), We don’t even need to bother with selecting an optimal replacement policy; we can just use as simple std::unordered_map<> and whenever it reaches a certain size, clear() it.

markpapadakis commented 6 years ago

This should also hold for adjust_range_start()

markpapadakis commented 4 years ago

New design and implementation in TANK2