HenryCaiHaiying opened 1 year ago
Thanks for the issue. We will think about this; however, right now the broker always requests data without specifying an end position, so we cannot really tell how much data has to be downloaded, let alone how much of it the broker will actually read. My understanding is that parallel fetching would therefore lead to unnecessary downloads, especially when the segments are big. IIRC there was a proposal from @jeqo to change the broker side to always request maxBytes from the plugin. That could help with this a little, but I'm not sure when/if it will be changed.
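For context, if I remember the KIP-405 interface correctly, the relevant fetch methods look roughly like the excerpt below (from memory, so please double-check against the Kafka source); the broker currently uses the overload without an end position, which is why the plugin can't size its downloads.

```java
import java.io.InputStream;
import org.apache.kafka.server.log.remote.storage.RemoteLogSegmentMetadata;
import org.apache.kafka.server.log.remote.storage.RemoteStorageException;

// Rough excerpt of the KIP-405 RemoteStorageManager fetch methods, not the full interface.
public interface RemoteStorageManagerExcerpt {

    // The overload the broker calls today: only a start position, no end position,
    // so the plugin cannot tell how many bytes will actually be consumed.
    InputStream fetchLogSegment(RemoteLogSegmentMetadata metadata,
                                int startPosition) throws RemoteStorageException;

    // The bounded overload; if the broker always passed an end position (or maxBytes),
    // the plugin could size parallel or prefetch reads without over-downloading.
    InputStream fetchLogSegment(RemoteLogSegmentMetadata metadata,
                                int startPosition,
                                int endPosition) throws RemoteStorageException;
}
```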
At the same time we have had some discussions about a slightly different approach: pre-fetching a (perhaps configurable, but limited) amount of data into the cache.
We are also trying to understand what the common read patterns will be in general (sequential read is obviously one of them) and to come up with optimisations around them.
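To make the pre-fetching idea a bit more concrete, here is a minimal sketch (all names hypothetical, not the plugin's actual classes): serve chunk N and asynchronously warm chunk N + 1 into a cache so a sequential reader rarely waits on S3.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: read chunk N from the cache and prefetch chunk N + 1 in the background.
class PrefetchingChunkReader {

    // Stand-in for whatever actually downloads a chunk from remote storage.
    interface ChunkStore {
        byte[] download(int chunkIndex);
    }

    private final ChunkStore store;
    private final ExecutorService prefetchPool = Executors.newFixedThreadPool(2);
    // Eviction and size limits are omitted for brevity; a real cache would need both.
    private final ConcurrentMap<Integer, CompletableFuture<byte[]>> cache = new ConcurrentHashMap<>();

    PrefetchingChunkReader(ChunkStore store) {
        this.store = store;
    }

    byte[] read(int chunkIndex) throws Exception {
        // Fetch the requested chunk, reusing an in-flight prefetch if one already started.
        byte[] chunk = cache
            .computeIfAbsent(chunkIndex,
                i -> CompletableFuture.supplyAsync(() -> store.download(i), prefetchPool))
            .get();
        // Kick off a prefetch of the next chunk for sequential readers.
        cache.computeIfAbsent(chunkIndex + 1,
            i -> CompletableFuture.supplyAsync(() -> store.download(i), prefetchPool));
        return chunk;
    }
}
```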
Just for reference, this was mentioned on the upstream PR introducing remote fetching: https://github.com/apache/kafka/pull/13535#discussion_r1166318900 but was mentioned to be fixed in a following PR -- not sure if this one: https://issues.apache.org/jira/browse/KAFKA-14915
I think the common use case on the consumer side is bootstrap or catch-up reads: reading from an old offset all the way to the tail of the queue until the consumer has caught up. Such consumers want all the data in remote storage for that topic partition. There might be a few diagnostic use cases where people only want a small segment of the partition data, but I don't think that's very common.
In either case, I think you can do some prefetching to make sure the next buffer of the stream is ready to serve to the client; you don't have to pre-read too much. And if you want to be fancy, you can use an adaptive prefetch algorithm: on the first read prefetch only 4K bytes, then when the client comes back for more, prefetch 8K, then 16K, and so on up to a cap (e.g. 10M), as sketched below. That would probably work well for both use cases.
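A minimal sketch of that doubling scheme (hypothetical class; the sizes and cap are just the example numbers above, nothing from the plugin):

```java
// Hypothetical adaptive prefetch sizing: start small and double on every subsequent
// read up to a cap, so one-off reads stay cheap while sequential catch-up reads
// quickly ramp up to large prefetches.
class AdaptivePrefetchSizer {
    private static final int INITIAL_BYTES = 4 * 1024;        // 4 KiB on the first read
    private static final int MAX_BYTES = 10 * 1024 * 1024;    // cap at ~10 MiB

    private int nextSize = INITIAL_BYTES;

    /** Returns how many bytes to prefetch for the current read and ramps up for the next one. */
    int nextPrefetchSize() {
        int size = nextSize;
        nextSize = Math.min(nextSize * 2, MAX_BYTES);
        return size;
    }
}
```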
As far as I remember, we briefly discussed exactly this (adaptive fetching) with @ivanyu at some point. I would say that as soon as https://github.com/aiven/tiered-storage-for-apache-kafka/pull/245 is merged, it should be easy to implement at least prefetching of a fixed amount of data, but I think we will consider both approaches. Thanks for bringing this up again and for the additional info about use cases. :+1:
@AnatolyPopov @ivanyu It seems https://github.com/aiven/tiered-storage-for-apache-kafka/pull/245 is already merged. Are you guys going to come back and add prefetching or caching support to speed up the S3 download?
Yeah, we're working on it
Thanks, that's a very useful feature.
@ivanyu Looping back to check the progress on this issue.
@HenryCaiHaiying, #403 and #411 should help to improve fetching performance. Have a look!
Thanks for the contribution, left a small comment in one of the PRs.
@jeqo with this PR: https://github.com/Aiven-Open/tiered-storage-for-apache-kafka/pull/429, do we still need #403 ?
@HenryCaiHaiying The former supersedes the latter.
Currently the S3 download speed is low (about 20 Mbit/s). The reason is that we fetch chunks in sequential order; can we do that in parallel (like how S3 multi-part upload is done in parallel)?
More details: https://the-asf.slack.com/archives/C05A1NF5SFM/p1687829723272589?thread_ts=1687245261.417759&cid=C05A1NF5SFM
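Roughly what I have in mind, as an illustration only (a hypothetical sketch using AWS SDK v2 ranged GETs, not the plugin's actual code): split the object into ranges and download them concurrently, similar in spirit to parallel multi-part upload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;

// Hypothetical sketch: download a segment's chunks with concurrent ranged GETs
// instead of one sequential GET.
class ParallelSegmentFetcher {
    private final S3Client s3;
    private final ExecutorService pool = Executors.newFixedThreadPool(8);

    ParallelSegmentFetcher(S3Client s3) {
        this.s3 = s3;
    }

    List<byte[]> fetch(String bucket, String key, long objectSize, int chunkSize) {
        List<CompletableFuture<byte[]>> parts = new ArrayList<>();
        for (long start = 0; start < objectSize; start += chunkSize) {
            long end = Math.min(start + chunkSize, objectSize) - 1;  // Range header is inclusive
            GetObjectRequest request = GetObjectRequest.builder()
                .bucket(bucket)
                .key(key)
                .range("bytes=" + start + "-" + end)
                .build();
            parts.add(CompletableFuture.supplyAsync(
                () -> s3.getObjectAsBytes(request).asByteArray(), pool));
        }
        // Wait for all ranges and return them in order.
        List<byte[]> chunks = new ArrayList<>(parts.size());
        for (CompletableFuture<byte[]> part : parts) {
            chunks.add(part.join());
        }
        return chunks;
    }
}
```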