slatedb / slatedb

A cloud native embedded storage engine built on object storage.
https://slatedb.io
Apache License 2.0

Allow async fetching data from object store #176

Open flaneur2020 opened 2 months ago

flaneur2020 commented 2 months ago

a follow up of #9

from discord: https://discord.com/channels/1232385660460204122/1232385660946616433/1280048188514107449

criccomini commented 2 months ago

Adding a little color here:

Right now, the partition fetches are synchronous. So a single 4KiB block read can result in a 16MiB object storage read because we're reading the entire partition. A 16MiB read is going to have (much) higher latency than a 4KiB read. Some users will prefer to have the 4KiB latency on their 4KiB read, but they might still wish to have a read cache. We should let such users enable asynchronous object partition fetches. This will result in 2 reads to object storage (one for the 4KiB read that was requested, and one for the 16MiB object partition for caching purposes). The tradeoff here is the cost of two API calls (thus, increased cloud spend), a tradeoff users will choose to take in exchange for better latency.

criccomini commented 2 months ago

@flaneur2020 This was in the original issue title:

allow not blocking the normal get/put operations

Does my summary above account for this, or were you thinking of something else?

flaneur2020 commented 2 months ago

@criccomini I think your description in the comment is much clearer, thank you for the detailed explanation :+1:

flaneur2020 commented 1 month ago

came across a slide from Jeff Dean which might be related to this issue:

[slide image]

the slide calls this technique a "Backup Task", and it's used to hide latency.

i also found a similar term, "Backup Request", in the bRPC docs:

Sometimes in order to ensure availability, we need to visit two services at the same time and get the result coming back first.

https://brpc.apache.org/docs/client/backup-request/

criccomini commented 1 month ago

Ohh, interesting. I guess the idea is to work around cases where a given S3 GET is slow for a one-off reason: fire more than one parallel fetch of the same data, and just use whichever comes back first. The tradeoff is the API bill, right? Seems like a reasonable config, though. Would be good to test and see how it improves tail latency.
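The racing approach discussed here can be sketched as below. This is a hypothetical illustration in Python asyncio, not a proposed SlateDB implementation; `backup_request` and its `delay` parameter are made-up names. A common refinement (used by bRPC) is to delay the backup slightly so the second GET is only sent when the primary is slow, keeping the extra API spend to the tail.

```python
import asyncio

async def backup_request(fetch, delay: float):
    """Race a primary GET against a delayed backup GET; first result wins.
    `fetch` is any zero-arg coroutine factory (illustrative name)."""
    primary = asyncio.create_task(fetch())

    async def delayed_backup():
        # Only issue the backup request if the primary hasn't
        # returned within `delay` seconds.
        await asyncio.sleep(delay)
        return await fetch()

    backup = asyncio.create_task(delayed_backup())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # abandon the slower request
    return done.pop().result()
```

If `delay` is tuned near the primary's p99 latency, most reads never send the backup at all, so the extra cloud spend is paid only on the slow tail.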