versity / versitygw

versity s3 gateway
https://www.versity.com/products/versitygw/
Apache License 2.0
202 stars 25 forks source link

[Feature] - Reduce syscall overhead of serving files #920

Open lunixbochs opened 1 month ago

lunixbochs commented 1 month ago

Recorded with perf trace -p $(pgrep versitygw) and then fetching one file with GetObject

// AclParser middleware
( 0.025 ms): newfstatat(dfd: CWD, filename: 0x32c2a0, statbuf: 0xc0001c66b8)
( 0.031 ms): getxattr(pathname: 0x32c2b8, name: 0x32c2c0, value: 0xc000596000, size: 1024)

// _, err := os.Stat(bucket) in Posix.GetBucketAcl
( 0.009 ms): newfstatat(dfd: CWD, filename: 0x32c2e0, statbuf: 0xc0001c6858)

// fi, err := os.Stat(objPath) in Posix.GetObject
( 0.020 ms): newfstatat(dfd: CWD, filename: 0x1cade0, statbuf: 0xc0001c6928)

// Posix.loadUserMetaData, called from Posix.GetObject
( 0.016 ms): listxattr(pathname: 0x1cae40, list: 0x2036c40)

// b, _ := p.meta.RetrieveAttribute(nil, bucket, object, contentTypeHdr) in Posix.GetObject
( 0.017 ms): getxattr(pathname: 0x1caea0, name: 0x32c300, value: 0xc000596400, size: 1024)

// b, _ = p.meta.RetrieveAttribute(nil, bucket, object, contentEncHdr) in Posix.GetObject
( 0.016 ms): getxattr(pathname: 0x1caf00, name: 0x5400c0, value: 0xc000596800, size: 1024)

// os.Open(objPath) in Posix.GetObject
( 0.017 ms): openat(dfd: CWD, filename: 0x1caf30, flags: RDONLY|CLOEXEC)

4 of the 8 syscalls can be removed here with no downsides:

Impact:

- ( 0.025 ms): newfstatat(dfd: CWD, filename: 0x32c2a0, statbuf: 0xc0001c66b8)
  ( 0.031 ms): getxattr(pathname: 0x32c2b8, name: 0x32c2c0, value: 0xc000596000, size: 1024)
- ( 0.009 ms): newfstatat(dfd: CWD, filename: 0x32c2e0, statbuf: 0xc0001c6858)
  ( 0.020 ms): newfstatat(dfd: CWD, filename: 0x1cade0, statbuf: 0xc0001c6928)
  ( 0.016 ms): listxattr(pathname: 0x1cae40, list: 0x2036c40)
- ( 0.017 ms): getxattr(pathname: 0x1caea0, name: 0x32c300, value: 0xc000596400, size: 1024)
- ( 0.016 ms): getxattr(pathname: 0x1caf00, name: 0x5400c0, value: 0xc000596800, size: 1024)
  ( 0.017 ms): openat(dfd: CWD, filename: 0x1caf30, flags: RDONLY|CLOEXEC)

This is a vaguely 44% ~(151us -> 84us) syscall overhead reduction per GetObject. Could be worth shaving that further to ~53us (65% reduction) by caching the bucket ACL getxattr with like a 1s ttl, which will amortize it out completely under heavy load.

lunixbochs commented 1 month ago

In trying to reduce this I noticed os.Stat(bucket) is actually responsible for a lot of code across the posix endpoints - is there some way you'd feel comfortable removing almost all of the redundant calls to os.Stat(bucket) for bucket exists and finding another way to make sure it's hit in GetBucketAcl or via the middleware at an appropriate time for each backend? Almost 6% of the code in posix.go is redundant os.Stat(bucket) calls (because it was already called in the acl middleware)

benmcclelland commented 1 month ago

Wow! Thanks for diving into this and opening some PRs too. I have no problem removing the redundant calls. We will need to double check that there are functional tests coving all the cases of these redundant calls just in case something gets refactored in the future.

benmcclelland commented 1 month ago

Ignore the azurite failures in the tests, there is another PR outstanding to fix this.