versity / versitygw

versity s3 gateway
https://www.versity.com/products/versitygw/
Apache License 2.0
194 stars 24 forks source link

[Feature] - Improve GetObject allocation overhead #919

Open lunixbochs opened 2 weeks ago

lunixbochs commented 2 weeks ago

GetObject causes quite a lot of allocations due to io.Copy falling back to a generic implementation which allocates an internal buffer. The way to avoid this is to make sure the OS can use the sendfile syscall for zero copy transfers where possible.

To transfer using sendfile, you basically need to io.Copy between a net.Conn and an os.File, with no wrapper types around either object. The first change needed is to skip using a SectionReader when transferring a whole file (i.e. when there's no Range request).

@@ -2896,7 +2901,11 @@ func (p *Posix) GetObject(_ context.Context, input *s3.GetObjectInput) (*s3.GetO
                return nil, fmt.Errorf("open object: %w", err)
        }

-       rdr := io.NewSectionReader(f, startOffset, length)
+       var body io.ReadCloser = f
+       if startOffset != 0 || length != objSize {
+           rdr := io.NewSectionReader(f, startOffset, length)
+           body = &backend.FileSectionReadCloser{R: rdr, F: f}
+       }

        return &s3.GetObjectOutput{
                AcceptRanges:    &acceptRange,
@@ -2910,7 +2919,7 @@ func (p *Posix) GetObject(_ context.Context, input *s3.GetObjectInput) (*s3.GetO
                ContentRange:    &contentRange,
                StorageClass:    types.StorageClassStandard,
                VersionId:       &versionId,
-               Body:            &backend.FileSectionReadCloser{R: rdr, F: f},
+               Body:            body,
        }, nil
 }

The second issue is that fasthttp wraps the net.Conn in a bufio.Writer, in which case Go will never use sendfile, because the heuristic uses a very simple type assertion. See https://github.com/valyala/fasthttp/issues/1889. I did work around this and confirmed if you bypass the bufio, sendfile starts working with versitygw and the high allocator load when serving files completely disappears.

(NOTE: for ranges, despite the underlying sendfile syscall supporting offset/size, I don't think Go has a way to signal that you want to use a partial sendfile for an io.Copy - the fast path only happens when sending an unwrapped os.File object with no ability afaict to limit the sending size)

benmcclelland commented 2 weeks ago

one option here might be to write a modified io.Copy that allows the wrapped types for sendfile since we know what the types will be in these cases

lunixbochs commented 2 weeks ago

As is this needs to be fixed with a fasthttp refactor, because they're writing into a bufio which can't even be unwrapped correctly.

My workaround was as follows:

(Honestly I think this is a design flaw in Go. You can't even work around the extra allocations yourself with io.CopyBuffer in this case because io.CopyBuffer will call the underlying WriteTo implementation which falls back to io.Copy and allocates)

lunixbochs commented 2 weeks ago

Okay, so we get to copyZeroAlloc with writer being a bufio.Writer pointing at a net.Conn, and our reader is the os.File from GetObject

func copyZeroAlloc(w io.Writer, r io.Reader) (int64, error) {
    if wt, ok := r.(io.WriterTo); ok {
        return wt.WriteTo(w)
    }
    if rt, ok := w.(io.ReaderFrom); ok {
        return rt.ReadFrom(r)
    }
    vbuf := copyBufPool.Get()
    buf := vbuf.([]byte)
    n, err := io.CopyBuffer(w, r, buf)
    copyBufPool.Put(vbuf)
    return n, err
}

The problem at this point r.WriteTo() is on the unwrapped os.File, which will see bufio.Writer is not a legal sendfile target, and it will fall back to io.Copy

However, bufio.Writer.ReadFrom should skip the buffer and call ReadFrom on the underlying net.Conn

So it's very subtle, to fix this one case and break the opposite, we just need to flip the order of the checks in copyZeroAlloc

 func copyZeroAlloc(w io.Writer, r io.Reader) (int64, error) {
+    if rt, ok := w.(io.ReaderFrom); ok {
+        return rt.ReadFrom(r)
+    }
     if wt, ok := r.(io.WriterTo); ok {
         return wt.WriteTo(w)
     }
-    if rt, ok := w.(io.ReaderFrom); ok {
-        return rt.ReadFrom(r)
-    }
     vbuf := copyBufPool.Get()
     buf := vbuf.([]byte)
     n, err := io.CopyBuffer(w, r, buf)
     copyBufPool.Put(vbuf)
     return n, err
 }

However! This is only the correct order for server->client transfers, when you're copying to a bufio.Writer If you're copying client->server, fasthttp wraps that direction in a bufio.Reader, so we need to make sure it reaches bufio.Reader.WriteTo() on the reader in that case.

copyZeroAlloc (and io.CopyBuffer which it's based on) are currently both a trap in this case, because they will do the wrong thing when writing to a bufio.Writer, but accidentally do the right thing when reading from bufio.Reader due to the order of the checks.

It might be simplest to add this check to copyZeroAlloc to flip the order when you're writing into a bufio.Writer.

 func copyZeroAlloc(w io.Writer, r io.Reader) (int64, error) {
+    // prefer w.ReadFrom with bufio.Writer to allow sendfile
+    if bw, ok := w.(*bufio.Writer); ok {
+        return bw.ReadFrom(r)
+    }
     if wt, ok := r.(io.WriterTo); ok {
         return wt.WriteTo(w)
     }
     if rt, ok := w.(io.ReaderFrom); ok {
         return rt.ReadFrom(r)
     }
     vbuf := copyBufPool.Get()
     buf := vbuf.([]byte)
     n, err := io.CopyBuffer(w, r, buf)
     copyBufPool.Put(vbuf)
     return n, err
 }