Open jasperjiaguo opened 1 year ago
This can potentially be a beginner task
@jasperjiaguo - if not already, can you please add oss user docs for the new enhancement of stream untar + rate limiting. Include a link here in this issue as well. Might also come handy for whoever picks this up to understand background etc.
@jasperjiaguo I can get started on this, if it's not yet been picked up.
Hey @heatclub it's not yet picked up, do you want to do both? You can take a look at files changed in https://github.com/apache/pinot/pull/8753. 2) is more exploratory as you probably want to find the block cipher scheme that has streamed interface (e.g. https://docs.oracle.com/javase/7/docs/api/javax/crypto/CipherInputStream.html) in java and generalize them (PinotCrypter is a noop rn so we need to foresee what cyphers can be used).
Yes I can pick up both the issues, will get back with queries if any
I wanted to better understand the goals of this issue which seem two in this case: add support for decryption (item 2) and I am not sure about item 1 above: are we suggesting we make the default code path retry and eliminate the need for configuration? if this is not the case, can we explain better what is what we want to achieve with that first item?.
Thanks for the clarification question @jugomezv. For 1) I'm thinking to add retry downloading on untar failed for the old (not streamed) segment download code path. As currently the new code path retries everything while old doesn't. The option for switching two code paths will still be there. Is this what you are asking?
Oh I think I get it, your fix is was only for streaming, ok I get it. Yes for sure we need that: will it also be an option in that case?
Oh I think I get it, your fix is was only for streaming, ok I get it. Yes for sure we need that: will it also be an option in that case?
Probably behind a flag... set it when the user deems deep store download could be unreliable?
@jasperjiaguo yes, makes sense.
Hi @jasperjiaguo Sorry I was inactive here owing to some personal commitments.
I went through the code and have below findings for 1
Existing flow
BaseTableDataManager.downloadSegment() -> downloadSegmentFromDeepStore() -> downloadAndDecrypt() -> downloadFromPeersWithoutStreaming calls these in SegmentFetcherFactory -> fetchAndDecryptSegmentToLocal() -> fetchAndDecryptSegmentToLocalInternal -> fetchSegmentToLocal -> fetchSegmentToLocalInternal -> fetchSegmentToLocal -> RetryPolicies.exponentialBackoffRetryPolicy with fetchSegmentToLocalWithoutRetry() -> **BaseTableDataManager.untarAndMoveSegment()**
All download functions have retry policy
New proposed flow
BaseTableDataManager.downloadSegment() -> downloadSegmentFromDeepStore() -> downloadAndDecrypt() -> { same function calls as above for SegmentFetcherFactory } -> **BaseTableDataManager.untarAndMoveSegmentWithRetries()**
New function BaseTableDataManager.untarAndMoveSegmentWithRetries() retries the download in case of failure in untar process.
Above proposal is based on the fact that we need to retry upon untar failures and not on download failures as all download failures are already retried in code, please feel free to correct me of that's not the case
@jasperjiaguo can you please take a look at this PR for issue [1]: https://github.com/apache/pinot/pull/10411 I don't seem to have permission to link this issue to PR or add labels.
Hey @heatclub sorry for the delayed reply, I was in smth. Yes your understanding is basically correct, we need to retry download upon download failures (which we already do), and retry download-untar upon untar failures. Will review the code this week.
Thanks for the revert @jasperjiaguo , above PR has received some feedback comments from team members and I am working on the refactoring.
Currently if we use the default code path for server segment download we won't retry download on untar failure. In situations as downloading partial tar file, this would not be helpful. There are two solutions:
BaseTableDataManager
pinot.server.instance.segment.stream.download.untar
The new stream download untar code path currently doesn't support stream decryption, as should be theoretically possible with block cipher. This potential improvement to stream segment download can be explored.
User doc for stream download untar: https://docs.pinot.apache.org/operators/tutorials/performance-optimization-configurations PR: https://github.com/apache/pinot/pull/8753