structurizr / onpremises

Structurizr on-premises installation
https://docs.structurizr.com/onpremises
MIT License

Improve performance on Admin API GET workspaces endpoint #127

Closed iowaz closed 5 days ago

iowaz commented 1 week ago

Description

In my current usage of Structurizr, I have created a large number of workspaces (over 1500) and use the Admin API for various tasks. The issue I am facing is the performance of the GET workspaces endpoint of the Admin API.

The problem occurs because the code fetches each workspace's metadata synchronously. This is particularly slow when the workspace storage is S3, for example.

The suggested fix is in the following Pull Request: https://github.com/structurizr/onpremises/pull/126. This code has been running for a few months now and appears to be stable. It has significantly improved performance, reducing the endpoint's response time by approximately 70%.

Priority

I'm willing to add this feature myself and raise a PR (please confirm approach first)

More information

No response

iowaz commented 1 week ago

Hey @simonbrowndotje, can you take a look at the Pull Request above? (https://github.com/structurizr/onpremises/pull/126)

Note: I apologize for not aligning the approach to solving the problem first. I wanted to see if it worked initially (plus the problem was occurring in our production environment). If you would like me to change the approach, feel free to ask, and I will adjust and test it as soon as possible!

simonbrowndotje commented 1 week ago

Performance issues are the reason I recommend file system workspace storage over the S3 implementation. That said, there's a caching facility built into the on-premises installation: https://github.com/structurizr/onpremises/blob/main/structurizr-onpremises/src/main/java/com/structurizr/onpremises/component/workspace/WorkspaceComponentImpl.java#L83

This was specifically built for somebody, but they never provided any feedback (I think they just switched back to file system storage on a shared mounted volume), so it's never been documented. If you have a single server, add the following to your structurizr.properties file:

structurizr.cache=local

And if you have multiple servers, you'll need to use a distributed cache, such as Redis:

structurizr.cache=redis
structurizr.redis.host=localhost
structurizr.redis.port=6379
structurizr.redis.password=

The cache expiry can also be tweaked (the value is minutes):

structurizr.cache.expiry=10

You should see a huge performance improvement after the initial server startup.
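To illustrate why reads get faster once the cache is populated, here is a minimal sketch of a local cache with a time-based expiry. It is not the Structurizr implementation: the class name `ExpiringMetadataCache`, the `loader` callback (standing in for a slow storage read) and the injectable `clock` are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.LongFunction;
import java.util.function.LongSupplier;

// Hypothetical illustration only; not the code behind structurizr.cache=local.
class ExpiringMetadataCache<T> {

    // A cached value together with the time it was loaded.
    private static final class Entry<V> {
        final V value;
        final long loadedAtMillis;

        Entry(V value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<Long, Entry<T>> entries = new ConcurrentHashMap<>();
    private final long expiryMillis;
    private final LongFunction<T> loader; // stand-in for a slow read (e.g. from S3)
    private final LongSupplier clock;     // returns the current time in milliseconds

    ExpiringMetadataCache(long expiryMinutes, LongFunction<T> loader, LongSupplier clock) {
        this.expiryMillis = TimeUnit.MINUTES.toMillis(expiryMinutes);
        this.loader = loader;
        this.clock = clock;
    }

    // Returns the cached value, reloading it when it is missing or has expired.
    T get(long workspaceId) {
        long now = clock.getAsLong();
        Entry<T> entry = entries.get(workspaceId);
        if (entry == null || now - entry.loadedAtMillis >= expiryMillis) {
            entry = new Entry<>(loader.apply(workspaceId), now);
            entries.put(workspaceId, entry);
        }
        return entry.value;
    }
}
```

In this sketch only the first read of each workspace (and the first read after expiry) pays the storage cost, which matches the behaviour described above: slow until the cache is warm, fast afterwards. Passing `System::currentTimeMillis` as the clock would give wall-clock expiry.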

There's value in your change too, since it will allow the cache to be warmed more quickly, so thanks for the PR. It needs a couple of tweaks, though (e.g. creating a new ExecutorService each time getWorkspaces() is called may be an issue in high-traffic installations). I will take your PR as a basis and make a few changes.
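The ExecutorService concern can be sketched as follows: metadata is fetched in parallel, but on a single pool that is created once and reused for every call. This is an assumption about the general shape of such a fix, not the code from the PR or the final change. `ParallelMetadataLoader`, `loadAll` and the `fetchMetadata` callback are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.LongFunction;

// Hypothetical sketch; not the actual WorkspaceComponentImpl code.
class ParallelMetadataLoader<T> {

    // One pool for the lifetime of the component, instead of a new pool per call.
    private final ExecutorService executor;
    private final LongFunction<T> fetchMetadata; // stand-in for a slow storage read

    ParallelMetadataLoader(int threads, LongFunction<T> fetchMetadata) {
        this.executor = Executors.newFixedThreadPool(threads);
        this.fetchMetadata = fetchMetadata;
    }

    // Submits one task per workspace ID, then collects results in the original order.
    List<T> loadAll(List<Long> workspaceIds) throws InterruptedException, ExecutionException {
        List<Future<T>> futures = new ArrayList<>();
        for (long id : workspaceIds) {
            Callable<T> task = () -> fetchMetadata.apply(id);
            futures.add(executor.submit(task));
        }
        List<T> results = new ArrayList<>();
        for (Future<T> future : futures) {
            results.add(future.get()); // blocks until that workspace's metadata is ready
        }
        return results;
    }

    // Call once on application shutdown so the pool's threads are released.
    void shutdown() {
        executor.shutdown();
    }
}
```

Because the pool is bounded and shared, many concurrent requests queue their tasks on the same fixed set of threads rather than each spawning a fresh set, which is the risk raised above for busy installations.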

iowaz commented 1 week ago

We considered using a local file system, but while it's possible to share a mounted volume on Kubernetes, it's not something we prefer to do (there are many issues with this solution in Amazon EKS). That being said, I have to admit I missed the Redis caching option 😅.

By the way, that was fast! Thank you so much.