data-preservation-programs / singularity

Tool for onboarding data to the Filecoin Network

Optimize retrieval from Filecoin #404

Closed gammazero closed 11 months ago

gammazero commented 11 months ago

Optimize retrieval so that when requested retrieval ranges do not align with singularity file ranges, only the minimal number of retrieval requests are made.

This is accomplished by creating a separate reader for each singularity file range. For reads that are larger than a range, multiple ranges are read until the read request is satisfied or all data is read. For reads smaller than the amount of data remaining in the range, the range reader is kept open so that subsequent reads can continue from it.

This approach associates a reader with each singularity file range, not with the ranges requested via the API (in the HTTP Range header). That avoids having to parse the Range header in order to create readers that each cover some number of singularity ranges. Instead, as arbitrary requested ranges are read, an existing reader for the corresponding singularity range(s) is reused whenever a requested range falls on a singularity range from a previous read. It also means there is only a single retrieval for each singularity range; if readers were instead associated with requested ranges, multiple readers could overlap the same singularity range and require retrieving the same range more than once.

Fixes https://github.com/data-preservation-programs/singularity/issues/366
Fixes https://github.com/filecoin-project/motion/issues/143

As an optimization, only one singularity range reader is maintained at a time. This works because once a new singularity range is selected by a requested read, it is highly unlikely that a subsequent read will fall on a singularity range that was already read from, prior to the new one. A minimal sketch of the scheme is shown below.
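For illustration only, here is a rough sketch of this single-open-reader scheme. All identifiers (`fileRange`, `rangeReader`, `openRangeRetrieval`) are hypothetical stand-ins, not the PR's actual API; `openRangeRetrieval` represents issuing one Filecoin retrieval for one range:

```go
package retrievesketch

import "io"

// fileRange describes one independently retrievable singularity file range.
type fileRange struct {
	offset int64 // position of the range within the file
	length int64 // number of bytes in the range
}

// openRangeRetrieval stands in for issuing a single Filecoin retrieval that
// streams range fr starting at file offset off (hypothetical; the real code
// wires this to the retriever).
var openRangeRetrieval func(fr fileRange, off int64) (io.ReadCloser, error)

// rangeReader keeps at most one retrieval open at a time and reuses it across
// consecutive Reads that fall within the same singularity range.
type rangeReader struct {
	ranges  []fileRange
	size    int64         // total file size
	pos     int64         // current file offset
	current io.ReadCloser // open retrieval for the range containing pos
	remain  int64         // bytes left to read from the current retrieval
}

// rangeAt finds the singularity range containing file offset off.
func (r *rangeReader) rangeAt(off int64) (fileRange, bool) {
	for _, fr := range r.ranges {
		if off >= fr.offset && off < fr.offset+fr.length {
			return fr, true
		}
	}
	return fileRange{}, false
}

func (r *rangeReader) Read(p []byte) (int, error) {
	if r.remain == 0 {
		// The current range is exhausted (or nothing is open yet): close it
		// and open one retrieval for the range containing pos. A read larger
		// than a range returns short; callers such as io.CopyN just call
		// Read again, which moves on to the next range.
		if r.current != nil {
			r.current.Close()
			r.current = nil
		}
		fr, ok := r.rangeAt(r.pos)
		if !ok {
			return 0, io.EOF
		}
		rc, err := openRangeRetrieval(fr, r.pos)
		if err != nil {
			return 0, err
		}
		r.current = rc
		r.remain = fr.offset + fr.length - r.pos
	}
	if int64(len(p)) > r.remain {
		p = p[:r.remain]
	}
	n, err := r.current.Read(p)
	r.pos += int64(n)
	r.remain -= int64(n)
	return n, err
}
```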

Additional changes:

Benchmark

Added a benchmark of filecoin retrievals to compare before and after optimization.

The benchmark uses a file composed of four sections, each 16 MiB in size. The entire file is retrieved by requesting 1 MiB chunks. The 1 MiB reads are done through io.CopyN, which copies the data through a 32 KiB buffer.

The non-optimized version does a retrieval for each buffer copy of the file data. The optimized version does only as many retrievals as there are independently retrievable sections of the file. The read pattern is sketched below.
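For reference, the access pattern the benchmark exercises looks roughly like this (a sketch, not the actual test code; `nopWriter` is a stand-in sink):

```go
package main

import (
	"io"
	"log"
	"strings"
)

// nopWriter deliberately does not implement io.ReaderFrom, so io.CopyN falls
// back to its internal 32 KiB copy buffer, matching the pattern described above.
type nopWriter struct{}

func (nopWriter) Write(p []byte) (int, error) { return len(p), nil }

func main() {
	// Stand-in for the retrieval reader under test; the real benchmark reads
	// from the Filecoin retrieval handler, not from an in-memory string.
	src := strings.NewReader(strings.Repeat("x", 4<<20))

	const chunk = 1 << 20 // 1 MiB per request, as in the benchmark
	for {
		if _, err := io.CopyN(nopWriter{}, src, chunk); err != nil {
			if err == io.EOF {
				break
			}
			log.Fatal(err)
		}
	}
}
```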

Before optimization:

```
goos: darwin
goarch: arm64
pkg: github.com/data-preservation-programs/singularity/handler/file
BenchmarkFilecoinRetrieve
    retrieve_test.go:403: Number of retrieve requests: 2048
    retrieve_test.go:403: Number of retrieve requests: 2048
    retrieve_test.go:403: Number of retrieve requests: 2048
BenchmarkFilecoinRetrieve-10                   4         328260781 ns/op
```

After optimization (PR https://github.com/data-preservation-programs/singularity/pull/404):

```
goos: darwin
goarch: arm64
pkg: github.com/data-preservation-programs/singularity/handler/file
BenchmarkFilecoinRetrieve
    retrieve_test.go:692: Number of retrieve requests: 4
    retrieve_test.go:692: Number of retrieve requests: 4
    retrieve_test.go:692: Number of retrieve requests: 4
BenchmarkFilecoinRetrieve-10                  27          37292077 ns/op
```
codecov[bot] commented 11 months ago

Codecov Report

Attention: 60 lines in your changes are missing coverage. Please review.

Comparison is base (5742fa8) 73.94% compared to head (0bd67a0) 73.83%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #404      +/-   ##
==========================================
- Coverage   73.94%   73.83%   -0.12%
==========================================
  Files         148      149       +1
  Lines        9669     9813     +144
==========================================
+ Hits         7150     7245      +95
- Misses       1776     1815      +39
- Partials      743      753      +10
```

| [Files](https://app.codecov.io/gh/data-preservation-programs/singularity/pull/404?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=data-preservation-programs) | Coverage Δ | |
|---|---|---|
| [handler/file/interface.go](https://app.codecov.io/gh/data-preservation-programs/singularity/pull/404?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=data-preservation-programs#diff-aGFuZGxlci9maWxlL2ludGVyZmFjZS5nbw==) | `100.00% <ø> (ø)` | |
| [handler/file/range\_reader.go](https://app.codecov.io/gh/data-preservation-programs/singularity/pull/404?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=data-preservation-programs#diff-aGFuZGxlci9maWxlL3JhbmdlX3JlYWRlci5nbw==) | `78.37% <78.37%> (ø)` | |
| [storagesystem/util.go](https://app.codecov.io/gh/data-preservation-programs/singularity/pull/404?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=data-preservation-programs#diff-c3RvcmFnZXN5c3RlbS91dGlsLmdv) | `79.22% <61.29%> (+0.27%)` | :arrow_up: |
| [retriever/retriever.go](https://app.codecov.io/gh/data-preservation-programs/singularity/pull/404?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=data-preservation-programs#diff-cmV0cmlldmVyL3JldHJpZXZlci5nbw==) | `55.71% <0.00%> (-12.71%)` | :arrow_down: |
| [handler/file/retrieve.go](https://app.codecov.io/gh/data-preservation-programs/singularity/pull/404?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=data-preservation-programs#diff-aGFuZGxlci9maWxlL3JldHJpZXZlLmdv) | `62.68% <68.23%> (+7.21%)` | :arrow_up: |

... and [2 files with indirect coverage changes](https://app.codecov.io/gh/data-preservation-programs/singularity/pull/404/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=data-preservation-programs)


rvagg commented 11 months ago

I think this is OK and solves the bulk of the issues at the singularity level, but unless I'm not looking at it right, you're going to end up having to do basically the same style of thing in motion. Over there, the singularity blob reader still only implements a basic io.ReadSeeker, and motion uses http.ServeContent, so it'll still end up doing multiple small reads through io.CopyN for each range requested. Those reads will end up here, but each of them will instantiate a new version of all of this, and we're back to lots of smaller reads and no ability to take advantage of WriteTo.

So what to do about that? I think you're going to have to do something persistent in motion that hangs on to a RetrieveFile reference and is able to do a WriteTo over there as well. That could be possible: SingularityReader is held on to for the entirety of the operation, so if it gets a WriteTo and can do this same juggling with a handle into singularity, then we might solve this all.

But then after all that I guess you have the complexity question—is it better to solve this through the abstractions that let us continue to use http.ServeContent (and many other utilities and operations that assume standard io interfaces), or would it be simpler to parse the ranges and pass them down, returning a single reader that does it all? :shrug:
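For illustration, continuing the hypothetical rangeReader sketch earlier in this thread (same hypothetical identifiers, not actual code from the PR or motion), an io.WriterTo implementation along the lines suggested above could stream each remaining range with one retrieval apiece:

```go
// WriteTo lets io.Copy delegate to the reader, streaming each singularity
// range directly to w instead of looping through a 32 KiB buffer.
func (r *rangeReader) WriteTo(w io.Writer) (int64, error) {
	var total int64
	// First drain any retrieval already open from a previous Read.
	if r.current != nil {
		n, err := io.Copy(w, r.current)
		r.current.Close()
		r.current, r.remain = nil, 0
		total += n
		r.pos += n
		if err != nil {
			return total, err
		}
	}
	// Then stream each remaining range with a single retrieval apiece.
	for {
		fr, ok := r.rangeAt(r.pos)
		if !ok {
			return total, nil // all ranges written
		}
		rc, err := openRangeRetrieval(fr, r.pos)
		if err != nil {
			return total, err
		}
		n, err := io.Copy(w, rc)
		rc.Close()
		total += n
		r.pos += n
		if err != nil {
			return total, err
		}
	}
}
```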

gammazero commented 11 months ago

> I'm against having an actual read happening inside Seek(). If we really need to add this optimization, I'd recommend adding a field seekMode which allows the caller to choose the behavior, so that ...

I agree, as that seems like the more correct/expected behavior. I have removed that handling of a seek within a read, along with the corresponding handling here. The code no longer continues reading from the same stream (pipe reader) following a Seek.
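In terms of the hypothetical rangeReader sketch above, that behavior could look like this: Seek only records the new offset and closes any open retrieval, and no read happens inside Seek itself (a sketch; add "errors" to the earlier sketch's imports):

```go
func (r *rangeReader) Seek(offset int64, whence int) (int64, error) {
	var abs int64
	switch whence {
	case io.SeekStart:
		abs = offset
	case io.SeekCurrent:
		abs = r.pos + offset
	case io.SeekEnd:
		abs = r.size + offset
	default:
		return 0, errors.New("invalid whence")
	}
	if abs < 0 {
		return 0, errors.New("negative position")
	}
	if r.current != nil {
		// Drop the open stream; the next Read reopens at the new offset.
		r.current.Close()
		r.current, r.remain = nil, 0
	}
	r.pos = abs
	return abs, nil
}
```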

rvagg commented 11 months ago

Worth clarifying this point because I think it's kind of important to what's being achieved here:

> This looks inefficient, but given http.ServeContent will never seek again once Read() starts, this path will never be reached

This is unfortunately not true, because as long as we accept the HTTP spec's semantics for Range headers, we can get stacked ranges in the same request, and http.ServeContent will faithfully skip around the ranges as requested, seeking each time: https://cs.opensource.google/go/go/+/refs/tags/go1.21.3:src/net/http/fs.go;l=325-337

I hate this about the spec and would love to know what utility it has for general use, and specifically whether it's even going to be useful to Motion users. Perhaps we can't know this ahead of time and just have to accept it. My inclination would be to ditch the Go Range handling entirely and accept only a single range, rejecting requests with commas in the Range header, and then fix it later if we find a tool/user of motion that needs something more sophisticated.

But, we can also just go along with it, and I think the solution @gammazero has here gets at this problem.
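As an aside, the single-range fallback suggested above could look roughly like this (a hypothetical sketch, not motion's actual code; `serveSingleRange` is an invented helper name):

```go
package servesketch

import (
	"io"
	"net/http"
	"strings"
	"time"
)

// serveSingleRange rejects multi-range requests up front and lets
// http.ServeContent handle everything else, so only a single Range is
// ever passed down to the underlying reader.
func serveSingleRange(w http.ResponseWriter, req *http.Request, name string, modtime time.Time, content io.ReadSeeker) {
	if rng := req.Header.Get("Range"); strings.Contains(rng, ",") {
		// A comma means stacked ranges, e.g. "bytes=0-99,200-299".
		http.Error(w, "multiple ranges not supported", http.StatusRequestedRangeNotSatisfiable)
		return
	}
	http.ServeContent(w, req, name, modtime, content)
}
```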