sociomantic-tsunami / dlsproto

Distributed Log Store protocol definition, client, fake node, and tests
Boost Software License 1.0

Request to get data from multiple ranges/channels? #9

Open gavin-norman-sociomantic opened 6 years ago

gavin-norman-sociomantic commented 6 years ago

A fairly common usage is for a client to request several chunks of data from different ranges / channels. How exactly they do this varies by application: some simply issue sequential GetRange requests, others start a certain number of GetRange requests in parallel, etc. Existing client code in fact second-guesses the DLS, attempting to optimise throughput through its own strategy for assigning GetRange requests.

It occurs to me that this is not ideal. The DLS nodes ought to be the ones deciding the most efficient way to serve requests, not the clients. The node implementation can change over time, meaning that clients' attempts at gaming the system might become outmoded or even horribly inefficient.

We should think about adding a request to get data from multiple ranges/channels, allowing the node to decide the best way to serve the data.
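
To make the proposal concrete, here is a minimal Python sketch of what a node-side multi-range request could look like. It is not the dlsproto API: `RangeSpec`, `serve_multi_range`, `make_in_memory_reader` and the `max_parallel` parameter are all hypothetical names, and the in-memory store stands in for real channel storage. The only point it illustrates is that the client hands over the full list of ranges and the node alone decides how many to serve at once.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class RangeSpec:
    """One (channel, key range) pair in a hypothetical multi-range request."""
    channel: str
    start: int  # inclusive lower bound of the key range
    end: int    # inclusive upper bound of the key range


Record = Tuple[str, int, bytes]  # (channel, key, value)


def make_in_memory_reader(
    store: Dict[str, List[Tuple[int, bytes]]]
) -> Callable[[RangeSpec], Iterator[Record]]:
    """Build a reader over a dict; a stand-in for the node's real storage."""
    def read_range(spec: RangeSpec) -> Iterator[Record]:
        for key, value in store.get(spec.channel, []):
            if spec.start <= key <= spec.end:
                yield (spec.channel, key, value)
    return read_range


def serve_multi_range(
    specs: Iterable[RangeSpec],
    read_range: Callable[[RangeSpec], Iterator[Record]],
    max_parallel: int,
) -> Iterator[Record]:
    """Serve all ranges, keeping at most max_parallel of them active.

    max_parallel is chosen by the node, not by the client. Active ranges
    are interleaved round-robin; a finished range frees a slot for the
    next pending one.
    """
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    pending = deque(specs)
    active: deque = deque()
    while pending or active:
        # Top up the active set from the pending queue.
        while pending and len(active) < max_parallel:
            active.append(iter(read_range(pending.popleft())))
        reader = active.popleft()
        try:
            record = next(reader)
        except StopIteration:
            continue  # range exhausted; its slot is refilled next pass
        active.append(reader)
        yield record
```

A client would then issue a single call such as `serve_multi_range([RangeSpec("a", 1, 2), RangeSpec("b", 1, 1)], read_range, max_parallel)` and never needs to know the value of `max_parallel`, so the node implementation can change it without any client update.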

gavin-norman-sociomantic commented 6 years ago

I imagine we could test the performance of the DLS with different numbers of GetRange requests in parallel and find the optimum point. From that, we could figure out how many ranges each multi-range request should handle in parallel.
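
The measurement described above could be organised roughly as follows. This is only a sketch under assumptions: `measure_throughput` and `find_best_parallelism` are hypothetical helpers, and `run` is a caller-supplied function that would start the given number of parallel GetRange requests against a test node and return how many records it received. Nothing here talks to a real DLS.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


def measure_throughput(run: Callable[[int], int], parallelism: int) -> float:
    """Time one workload run and return records per second.

    `run` is hypothetical: it should perform the workload with
    `parallelism` requests in flight and return the record count.
    """
    started = time.perf_counter()
    records = run(parallelism)
    elapsed = time.perf_counter() - started
    return records / elapsed if elapsed > 0 else float("inf")


def find_best_parallelism(
    levels: Iterable[int],
    throughput_at: Callable[[int], float],
) -> Tuple[int, Dict[int, float]]:
    """Evaluate each parallelism level and return the one with the
    highest throughput, together with all measurements."""
    results = {level: throughput_at(level) for level in levels}
    if not results:
        raise ValueError("at least one parallelism level is required")
    best = max(results, key=results.get)
    return best, results
```

In use, `throughput_at` would be something like `lambda n: measure_throughput(run, n)`, evaluated over levels such as 1, 2, 4 and 8. A single run per level is noisy, so in practice each level would likely need repeating and averaging before the result is used to fix the per-request parallelism.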