Open mac-chaffee opened 1 week ago
Hi! Wow, thanks for this in-depth feeback!
So I like all of these changes and can commit to getting this across the finish line.
There are a couple of pieces that require us to help:
_redirects
and _headers
5kb seems perfectly reasonable to me so let's go for it.
For caching, what do you think about the ability to wipe the in-memory cache? In particular:
Do you have any thoughts on supporting those features?
Finally, let me know what parts you'd like to work on and I can commit to whatever else.
Thanks again, this is awesome!
Hello! This weekend I was interested in learning how pgs worked so I looked through the code and wrote down any possible places that I thought could affect performance. I didn't do any runtime testing, so take this with a grain of salt.
To serve a single HTML file, the following must happen:
app_users.name
) (FindUserForName)projects.user_id && name
) (FindProjectByName)feature_flags.user_id && name
) (HasFeatureForUser)_redirects
, then parsed_redirects
(calcRoutes)_headers
, then parsed_headers
feature_flags.user_id && name
) (AnalyticsVisitFromRequest)analytics_visits
) (AnalyticsCollect)The following are some ideas for improving performance:
For (1), I predict the DNS lookup is the slowest operation in the list since (I think) all the other operations don't leave your single VM. Are you using Oracle Linux? If it's anything like RHEL, then local DNS caching is not enabled by default. If you enable systemd-resolved, it will cache 4096 responses for up to 2 hours and it will respect the TTL. Users should be encouraged to set high TTLs (>1 hour) to improve performance.
For the database queries (3, 5, 6, 13), an easy win would be to fetch the
feature_flags
in the same query where we fetch the user, but then we'd still be performing 2 queries per request. Possibly caching is a better solution, see below.For the GetBucket() call (4), that will send a BucketExists() request to Minio. Technically that's not necessary since you can create Bucket objects using just the name of the bucket.
For the GetObject() calls to read
_redirects
and_headers
, I think caching is our only hope. Caching these would also allow us to cache the compiled regexes.Caching
Since all of this data is small, I think we could use an in-process, in-memory cache like https://github.com/hashicorp/golang-lru
(the 2q implementation sounds smarter since it considers frequency in addition to recency)NVM, if we want to set ttls, we have to use the LRU version. The following work would be required:_redirects
and_headers
to something like 5KB so our cache size is bounded.AssetHandler
struct (plus routes parsed from _redirects and _headers) and save it to the cache, keyed on theuser-project
slug with a default TTL of something reasonable like 1 hour (so any caching bug we happen to introduce resolves itself in 1 hour).user-project
slug key that we need for reading from our own cache._redirects
or_headers
file uploaded (or I guess we could clear the cache on any upload as a user-controlled way of clearing their own cache)If we do all that, then we can serve assets with a single locally-cached DNS lookup, a single hash table lookup, and a single GetObject() minio call! 🚀
Thoughts? I can contribute for some of this. It's a pretty big change, so just wanted to run it by you before diving too deep.