versity / versitygw

versity s3 gateway
https://www.versity.com/products/versitygw/
Apache License 2.0
181 stars 23 forks source link

[Feature] - Reduce syscall overhead of serving files #920

Open lunixbochs opened 4 hours ago

lunixbochs commented 4 hours ago

Recorded with perf trace -p $(pgrep versitygw) and then fetching one file with GetObject

// AclParser middleware
( 0.025 ms): newfstatat(dfd: CWD, filename: 0x32c2a0, statbuf: 0xc0001c66b8)
( 0.031 ms): getxattr(pathname: 0x32c2b8, name: 0x32c2c0, value: 0xc000596000, size: 1024)

// _, err := os.Stat(bucket) in Posix.GetBucketAcl
( 0.009 ms): newfstatat(dfd: CWD, filename: 0x32c2e0, statbuf: 0xc0001c6858)

// fi, err := os.Stat(objPath) in Posix.GetObject
( 0.020 ms): newfstatat(dfd: CWD, filename: 0x1cade0, statbuf: 0xc0001c6928)

// Posix.loadUserMetaData, called from Posix.GetObject
( 0.016 ms): listxattr(pathname: 0x1cae40, list: 0x2036c40)

// b, _ := p.meta.RetrieveAttribute(nil, bucket, object, contentTypeHdr) in Posix.GetObject
( 0.017 ms): getxattr(pathname: 0x1caea0, name: 0x32c300, value: 0xc000596400, size: 1024)

// b, _ = p.meta.RetrieveAttribute(nil, bucket, object, contentEncHdr) in Posix.GetObject
( 0.016 ms): getxattr(pathname: 0x1caf00, name: 0x5400c0, value: 0xc000596800, size: 1024)

// os.Open(objPath) in Posix.GetObject
( 0.017 ms): openat(dfd: CWD, filename: 0x1caf30, flags: RDONLY|CLOEXEC)

4 of the 8 syscalls can be removed here with no downsides:

Impact:

- ( 0.025 ms): newfstatat(dfd: CWD, filename: 0x32c2a0, statbuf: 0xc0001c66b8)
  ( 0.031 ms): getxattr(pathname: 0x32c2b8, name: 0x32c2c0, value: 0xc000596000, size: 1024)
- ( 0.009 ms): newfstatat(dfd: CWD, filename: 0x32c2e0, statbuf: 0xc0001c6858)
  ( 0.020 ms): newfstatat(dfd: CWD, filename: 0x1cade0, statbuf: 0xc0001c6928)
  ( 0.016 ms): listxattr(pathname: 0x1cae40, list: 0x2036c40)
- ( 0.017 ms): getxattr(pathname: 0x1caea0, name: 0x32c300, value: 0xc000596400, size: 1024)
- ( 0.016 ms): getxattr(pathname: 0x1caf00, name: 0x5400c0, value: 0xc000596800, size: 1024)
  ( 0.017 ms): openat(dfd: CWD, filename: 0x1caf30, flags: RDONLY|CLOEXEC)

This is a vaguely 44% ~(151us -> 84us) syscall overhead reduction per GetObject. Could be worth shaving that further to ~53us (65% reduction) by caching the bucket ACL getxattr with like a 1s ttl, which will amortize it out completely under heavy load.

lunixbochs commented 2 hours ago

In trying to reduce this I noticed os.Stat(bucket) is actually responsible for a lot of code across the posix endpoints - is there some way you'd feel comfortable removing almost all of the redundant calls to os.Stat(bucket) for bucket exists and finding another way to make sure it's hit in GetBucketAcl or via the middleware at an appropriate time for each backend? Almost 6% of the code in posix.go is redundant os.Stat(bucket) calls (because it was already called in the acl middleware)