stashapp / stash

An organizer for your porn, written in Go. Documentation: https://docs.stashapp.cc
https://stashapp.cc/
GNU Affero General Public License v3.0

[Feature] Support Specifying SQLite Cache Size #5155

Open anon3e5f opened 3 weeks ago

anon3e5f commented 3 weeks ago

Is your feature request related to a problem? Please describe.

Stash can be slow, depending on library size (think data hoarders). Loading the dashboard can take upwards of 2-3 seconds on a cold start. I noticed that after loading the dashboard, refreshing it is quite a bit quicker (~500ms). My Stash SQLite database is currently ~110MiB. My current instance is using about 2.4GiB of RAM (with a limit set at 4GiB).

Describe the solution you'd like

Since SQLite does not cache queries, only data pages, the speedup after the first load tells me caching might help. I would propose adding support to Stash for setting the SQLite cache size when opening the database.

As documented, the default cache_size is -2000, i.e. about 2MiB. Ideally this would be a setting a user could change, but I don't think it would be reasonable to add full support to the UI.

I would propose adding an environment variable, e.g. STASH_SQLITE_CACHE_SIZE, which would be a simple pass-through to SQLite (the value can be negative or positive):

PRAGMA schema.cache_size = pages;
PRAGMA schema.cache_size = -kibibytes;

Care should be taken to avoid SQL injection, although, if set via an environment variable, I think it can be assumed safe.
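Parsing the variable as an integer before it ever touches SQL would remove the injection concern entirely. A minimal sketch of such a helper (cacheSizeFromEnv is a hypothetical name, not existing Stash code):

package sqlite

import (
	"fmt"
	"os"
	"strconv"
)

// cacheSizeFromEnv reads STASH_SQLITE_CACHE_SIZE and validates it as a
// plain integer, so the value can be interpolated into the PRAGMA safely.
func cacheSizeFromEnv() (int, bool, error) {
	v, ok := os.LookupEnv("STASH_SQLITE_CACHE_SIZE")
	if !ok {
		return 0, false, nil // unset: leave SQLite's default alone
	}
	n, err := strconv.Atoi(v) // rejects anything that isn't an integer
	if err != nil {
		return 0, false, fmt.Errorf("invalid STASH_SQLITE_CACHE_SIZE %q: %w", v, err)
	}
	return n, true, nil
}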

I'm guessing this could be added to https://github.com/stashapp/stash/blob/develop/pkg/sqlite/driver.go, similar to the existing optimize call:

https://github.com/stashapp/stash/blob/427c18be7de58d4d63edc69669b75bc08436f1d0/pkg/sqlite/driver.go#L66-L72
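If it were wired in via the driver's connect hook (assuming Stash keeps using mattn/go-sqlite3; the registration below is illustrative, not the project's actual code, and continues the hypothetical package above), the PRAGMA would be re-applied on every new connection, which matters once connection pooling enters the picture:

package sqlite

import (
	"database/sql"
	"fmt"

	sqlite3 "github.com/mattn/go-sqlite3"
)

func init() {
	// "sqlite3ex" is an illustrative driver name, not the one Stash registers.
	sql.Register("sqlite3ex", &sqlite3.SQLiteDriver{
		ConnectHook: func(conn *sqlite3.SQLiteConn) error {
			// Runs for every connection the pool opens, so the cache
			// size survives connection recycling.
			n, ok, err := cacheSizeFromEnv() // hypothetical helper from above
			if err != nil || !ok {
				return err
			}
			_, err = conn.Exec(fmt.Sprintf("PRAGMA cache_size = %d", n), nil)
			return err
		},
	})
}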

Additional context

Plex and Jellyfin have both added support for setting the SQLite database cache size, with Plex users noting performance improvements. Jellyfin's bottleneck tends to be lock contention, but I've noticed improvements nonetheless. Jellyfin uses an environment variable, while Plex added support for setting this in the UI.

This is my NSFW account, kind of sad I need to do this...

feederbox826 commented 3 weeks ago

Genuine question, please call me out if I'm being stupid:

Why not just set PRAGMA cache_size manually via the sqlite3 CLI? It doesn't look like it's being overwritten.

anon3e5f commented 3 weeks ago

Good question @feederbox826, from the docs:

When you change the cache size using the cache_size pragma, the change only endures for the current session. The cache size reverts to the default value when the database is closed and reopened.

It's really something that the application needs to do, unfortunately.
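To see the session scoping in action, here's a quick standalone sketch (assuming the mattn/go-sqlite3 driver and a throwaway demo.db; not Stash code):

package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	ctx := context.Background()
	db, err := sql.Open("sqlite3", "demo.db") // throwaway file
	if err != nil {
		log.Fatal(err)
	}

	// Pin one pooled connection so both statements share a session.
	conn, err := db.Conn(ctx)
	if err != nil {
		log.Fatal(err)
	}
	if _, err := conn.ExecContext(ctx, "PRAGMA cache_size = -2000000"); err != nil {
		log.Fatal(err)
	}
	var size int
	if err := conn.QueryRowContext(ctx, "PRAGMA cache_size").Scan(&size); err != nil {
		log.Fatal(err)
	}
	fmt.Println("same session:", size) // -2000000
	conn.Close()
	db.Close()

	// A fresh session starts back at the compiled-in default.
	db2, err := sql.Open("sqlite3", "demo.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db2.Close()
	if err := db2.QueryRow("PRAGMA cache_size").Scan(&size); err != nil {
		log.Fatal(err)
	}
	fmt.Println("new session:", size) // -2000 (the default)
}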

WithoutPants commented 3 weeks ago

This is pretty simple to add, though I'd like to see some evidence of improvement.

You can try this yourself using the graphql playground (<host>:<port>/playground) with the following mutation:

mutation {
  execSQL(sql:"PRAGMA cache_size = -4000") {
    rows_affected
  }
}

Change the value to one you see fit. It should be maintained until restart. Please let me know what sort of performance improvement you get.

anon3e5f commented 3 weeks ago

Neat! I didn't know SQL was exposed. And fair enough!

I've been messing with it, but I'm running into an issue:

mutation {
  execSQL(sql:"PRAGMA cache_size = -2000000") {
    rows_affected
  }
}

This works of course; querying to confirm:

mutation {
  querySQL(sql:"PRAGMA cache_size;") {
    rows
  }
}

Returns the expected -2000000.

But then, after a little bit, making the same query returns the SQLite default again:

{
  "data": {
    "querySQL": {
      "rows": [
        [
          -2000
        ]
      ]
    }
  }
}

I'm guessing Stash is closing the DB connection and opening it again, losing any connection-scoped settings? If that's the case, caching wouldn't do much good - and it would explain why Stash randomly takes a while to load without a cold restart (I figured it was being migrated between nodes in my cluster for workload-balancing purposes, but no, there are no logs of a migration).
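This would be consistent with how Go's database/sql pool behaves: a PRAGMA issued through execSQL only applies to whichever pooled connection happens to run it, and the pool can retire or bypass that connection at any time. A generic illustration (not Stash's actual configuration; the path and pool settings are made up):

package main

import (
	"database/sql"
	"fmt"
	"log"
	"time"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	db, err := sql.Open("sqlite3", "stash-go.sqlite") // path illustrative
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	db.SetMaxIdleConns(4)
	db.SetConnMaxLifetime(30 * time.Minute) // pooled conns get recycled

	// This PRAGMA lands on ONE pooled connection...
	if _, err := db.Exec("PRAGMA cache_size = -2000000"); err != nil {
		log.Fatal(err)
	}

	// ...so a later statement may be served by a different or brand-new
	// connection that is back at SQLite's default of -2000.
	var size int
	if err := db.QueryRow("PRAGMA cache_size").Scan(&size); err != nil {
		log.Fatal(err)
	}
	fmt.Println(size) // -2000000 or -2000, depending on which conn answered
}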


I looked a bit deeper to confirm a proxy wasn't doing something wonky, but no, Stash really is taking a bit to load.

Cold:
2024-08-21T19:58:55.209745152-05:00 INF Response: 200 OK httpRequest={"proto":"HTTP/1.1","remoteIP":"10.42.0.37:55840","requestID":"cb5d90936f05d6fed90291197b184c45","requestMethod":"POST","requestPath":"/graphql","requestURL":"http://stash.blah/graphql"} httpResponse={"bytes":5031,"elapsed":23907.107798,"status":200} service=stash

Warm:
2024-08-21T20:02:29.817715177-05:00 INF Response: 200 OK httpRequest={"proto":"HTTP/1.1","remoteIP":"10.42.0.37:38518","requestID":"b545660bf2fa7eae152b9f9cf815d7a6","requestMethod":"POST","requestPath":"/graphql","requestURL":"http://stash.blah/graphql"} httpResponse={"bytes":5031,"elapsed":20425.847162,"status":200} service=stash

I did notice that in my historical logs, Stash latency would randomly spike, like above. Analyzing the past 1782 requests got me this (cutting off anything taking more than 10 seconds):

[chart: latency of the past 1782 requests, with anything over 10 seconds cut off]

I'm thinking that, in lieu of application caching (e.g. SQLite cache_size), Stash is depending on the OS page cache, which is why Stash can just be hit by a latency spike (as the page cache is LRU). Ideally, Stash would maintain a SQLite connection, keeping the DB in memory, given the typically small size of the database.

anon3e5f commented 3 weeks ago

And that was it: after preloading/locking the SQLite db into the page cache using mlock, cold-start hits were under 200ms (down from 23 seconds). It appears fetching the whole 100MiB file is way faster than letting SQLite lazy-load individual pages - kind of crazy (although my Stash instance is on HDDs).

So given that, I wonder if it would be good for Stash to attempt to keep the db in memory, e.g. by keeping a SQLite connection alive, as most users can't aggressively pin it there themselves.
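For anyone wanting to reproduce the mlock experiment above, here's a rough standalone sketch (assuming Linux and golang.org/x/sys/unix; the db path is illustrative): map the database file read-only and pin its pages so the kernel's LRU can't evict them.

package main

import (
	"log"
	"os"
	"os/signal"

	"golang.org/x/sys/unix"
)

func main() {
	f, err := os.Open("stash-go.sqlite") // db path is illustrative
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	fi, err := f.Stat()
	if err != nil {
		log.Fatal(err)
	}

	// Map the whole file read-only, then lock the mapping so the kernel's
	// LRU can't evict those pages from the page cache.
	data, err := unix.Mmap(int(f.Fd()), 0, int(fi.Size()), unix.PROT_READ, unix.MAP_SHARED)
	if err != nil {
		log.Fatal(err)
	}
	if err := unix.Mlock(data); err != nil {
		// Typically needs CAP_IPC_LOCK or a raised RLIMIT_MEMLOCK.
		log.Fatal(err)
	}
	log.Printf("locked %d bytes into RAM", len(data))

	// Hold the lock until interrupted; it lasts for the process lifetime.
	sig := make(chan os.Signal, 1)
	signal.Notify(sig, os.Interrupt)
	<-sig
}

One caveat: the mapping is a fixed-size snapshot of the file, so it would need remapping as the database grows.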