ipfs-inactive / bifrost-gateway

Experimental gateway with delegated backend. No longer maintained, consider migrating to https://github.com/ipfs/rainbow/

Adjust size of in-memory block cache #47

Open lidel opened 1 year ago

lidel commented 1 year ago

bifrost-gateway runs with an in-memory 2Q cache sized at 1024 blocks.

2Q is an enhancement over the standard LRU cache in that it tracks both frequently and recently used entries separately. This avoids a burst in access to new entries from evicting frequently used entries.

Current cache performance: ~50% cache HIT

Cache metrics from bifrost-stage1-ny after one day (~48%):

ipfs_http_blockstore_cache_hit 7.273594e+06
ipfs_http_blockstore_cache_requests 1.515003e+07

And second sample from other day (~50%):

ipfs_http_blockstore_cache_hit 2.7508843e+07
ipfs_http_blockstore_cache_requests 5.4966088e+07

If I understand correctly, the above means the in-memory "frecency" cache of 1024 blocks produces a cache HIT ~50% of the time.
This is not that surprising: every website causes the same parent blocks to be read multiple times, once for every subresource on a page.
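For reference, the hit ratio is just `cache_hit / cache_requests`; plugging in the two counter samples above:

```go
package main

import "fmt"

func main() {
	// Hit ratio = ipfs_http_blockstore_cache_hit / ipfs_http_blockstore_cache_requests,
	// using the two counter samples quoted above.
	samples := []struct {
		name           string
		hits, requests float64
	}{
		{"sample 1", 7.273594e+06, 1.515003e+07},
		{"sample 2", 2.7508843e+07, 5.4966088e+07},
	}
	for _, s := range samples {
		fmt.Printf("%s: %.1f%% cache HIT\n", s.name, 100*s.hits/s.requests)
	}
}
```

This prints ~48.0% and ~50.0%, matching the figures above. Note these are cumulative counters since process start, so the ratio is a lifetime average, not a recent-window rate.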

We run on machines that have 64GiB of memory and bifrost-gateway only utilizes ~5GiB.

Proposal: increase cache size

Improving the cache hit ratio here won't improve things like video seeking or fetching big files, but it will impact how fast popular websites and directory enumerations load, avoiding thrashing of the most popular content.
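A rough worst-case memory envelope for larger cache sizes, assuming the default UnixFS chunk size of 256 KiB for a typical block and 1 MiB as a pessimistic upper bound (both are assumptions, not measured block sizes on these instances):

```go
package main

import "fmt"

func main() {
	const (
		typicalBlock = 256 * 1024  // assumed: default UnixFS chunk size
		maxBlock     = 1024 * 1024 // assumed: pessimistic block-size bound
	)
	// Upper-bound memory footprint for candidate BLOCK_CACHE_SIZE values.
	for _, size := range []int{1024, 2048, 4096, 8192, 16384} {
		fmt.Printf("BLOCK_CACHE_SIZE=%5d: ~%4d MiB typical, ~%5d MiB worst case\n",
			size, size*typicalBlock/(1<<20), size*maxBlock/(1<<20))
	}
}
```

Even 16384 blocks is ~4 GiB of typical payload (16 GiB worst case), comfortably within the 64 GiB machines.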


lidel commented 1 year ago

I've restarted bifrost-stage1-ny with BLOCK_CACHE_SIZE increased from 1024 to 2048.

2023/02/24 00:57:10 Starting bifrost-gateway 2023-02-22-c0c8fa3
2023/02/24 00:57:10 Block cache size: 2048

I will check Friday EOD if cache hit ratio improved, or latency/error rate decreased in any significant way.

lidel commented 1 year ago

after ~12h:

ipfs_http_blockstore_cache_hit 1.0137958e+07
ipfs_http_blockstore_cache_requests 1.9706903e+07

Hits are still at ~51%, which confirms we could increase the cache further to save on more roundtrips.

I am setting it to 4096 now:

2023/02/24 13:03:22 Starting bifrost-gateway 2023-02-24-dca4ba9
2023/02/24 13:03:22 Block cache size: 4096
lidel commented 1 year ago

In 6h, 4096 produced "only" a ~40% hit ratio:

ipfs_http_blockstore_cache_hit 1.297952e+06
ipfs_http_blockstore_cache_requests 3.199352e+06
lidel commented 1 year ago

After weekend, the BLOCK_CACHE_SIZE=4096 result on bifrost-stage1-ny was still around ~41%:

ipfs_http_blockstore_cache_hit 5.595748e+06
ipfs_http_blockstore_cache_requests 1.3580514e+07
lidel commented 1 year ago

Cache hit % is holding up well across instances, including staging, where we deployed #61 with the Graph API enabled:


I am going to double the cache size on staging to 8192 and check if it makes any difference for graph backend.

Done:

2023/04/03 21:47:39 Starting bifrost-gateway 2023-04-03-e68f6ca
lidel commented 1 year ago

8192 produces similar hit ratio:

(Grafana screenshot: bifrost-gw staging metrics, Project Rhea dashboard)

Memory usage is minimal:

(Grafana screenshot: bifrost-gw staging metrics, Project Rhea dashboard)

I've bumped it to 16384:

bifrost-gateway version 2023-04-04-575d307
2023/04/04 19:18:51 Starting bifrost-gateway 2023-04-04-575d307
[..]
2023/04/04 19:18:51 BLOCK_CACHE_SIZE: 16384