minotar / imgd

Minotar is a global avatar service that pulls your head off of your Minecraft.net skin and makes it available for use on several thousand sites.
https://minotar.net/
The Unlicense

Cache Reorganization for UUID #158

Closed LukeHandle closed 6 years ago

LukeHandle commented 9 years ago

My current plan is based on an idea from @lukegb.

Texture.Hash : Image (or gob of Image + Source or Alpha?)
UUID         : Skin.Hash:Cape.Hash
Username     : Skin.Hash:Cape.Hash or UUID (probably preferring the hashes)

Other advantages include the ability to query the cache for the presence of a key/Texture.Hash when we receive an If-None-Match header and return a Not Modified without having to fetch the skin. Probably a minor performance improvement, but it should lower latency :smile:

We'll have to work out how to handle Steve hashes so we don't end up always responding to those requests without ever re-checking the user.
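
For illustration, the mappings above might be written against a simple key-value store roughly like this (the `Cache` interface, key prefixes, and the `texture` wrapper are assumptions for this sketch, not imgd's actual API):

```go
// A minimal sketch of the proposed key layout; the Cache interface, key
// prefixes, and the texture wrapper struct are illustrative, not imgd's API.
package main

import (
	"bytes"
	"encoding/gob"
	"image"
	"time"
)

type Cache interface {
	Set(key string, value []byte, ttl time.Duration) error
	Get(key string) ([]byte, error)
}

// texture is the "gob of Image + Source" stored under Texture.Hash.
type texture struct {
	Image  *image.NRGBA // decoded skin pixels
	Source string       // e.g. "mojang" or the Steve fallback
}

// storeUser writes the three proposed mappings:
//   texture:<Texture.Hash> -> gob(texture)
//   uuid:<UUID>            -> "<Skin.Hash>:<Cape.Hash>"
//   user:<Username>        -> "<Skin.Hash>:<Cape.Hash>"
func storeUser(c Cache, username, uuid, skinHash, capeHash string, tex texture) error {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(tex); err != nil {
		return err
	}
	if err := c.Set("texture:"+skinHash, buf.Bytes(), 24*time.Hour); err != nil {
		return err
	}
	hashes := []byte(skinHash + ":" + capeHash)
	if err := c.Set("uuid:"+uuid, hashes, time.Hour); err != nil {
		return err
	}
	return c.Set("user:"+username, hashes, time.Hour)
}
```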

LukeHandle commented 9 years ago
I wrote this in a gist a few months back, but it would be best here.

Is there another alternative to Redis that may offer superior cache retrieval?

Where possible, splitting off into different Redis instances also offers advantages: we can attempt to multi-thread the process.

Key System:

Different caching methods which we could use for getting a skin based on UUID or USERNAME.

Idea 1:

SKIN HASH => GOB of minecraft.Skin object
(this is not unique to a user)

UUID      => SKIN HASH
USERNAME  => SKIN HASH

Idea 2:

UUID => GOB of minecraft.Skin object

USERNAME => UUID

Chances are, most users will have a unique skin. Will the gain from caching the non-unique ones outweigh the cost of then caching the SKIN HASH two extra times within Redis?

Number of Cache Lookups to get Skin:

Idea 1 always needs two lookups (UUID/USERNAME -> SKIN HASH, then SKIN HASH -> skin), whereas Idea 2 needs only one for a UUID (two for a username), so Idea 2 should result in quicker skin retrieval from cache.
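
To make the lookup counts concrete, here is a rough sketch of the two retrieval paths, reusing the hypothetical `Cache` interface from the sketch above (key names are again assumptions, not real imgd keys):

```go
// Idea 1: UUID/USERNAME -> SKIN HASH, then SKIN HASH -> skin gob (always 2 lookups).
func fetchIdea1(c Cache, usernameOrUUID string) ([]byte, error) {
	hash, err := c.Get("hash:" + usernameOrUUID) // lookup 1
	if err != nil {
		return nil, err
	}
	return c.Get("skin:" + string(hash)) // lookup 2
}

// Idea 2: UUID -> skin gob directly (1 lookup).
func fetchIdea2ByUUID(c Cache, uuid string) ([]byte, error) {
	return c.Get("skin:" + uuid) // lookup 1
}

// Idea 2: USERNAME -> UUID, then UUID -> skin gob (2 lookups).
func fetchIdea2ByUsername(c Cache, username string) ([]byte, error) {
	uuid, err := c.Get("uuid:" + username) // lookup 1
	if err != nil {
		return nil, err
	}
	return c.Get("skin:" + string(uuid)) // lookup 2
}
```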

Maybe an Idea 3:

OBJECT HASH => GOB of minecraft.Skin object
(this *is* unique to a user)

UUID      => OBJECT HASH
USERNAME  => OBJECT HASH

Worst usage of space, as we cause every skin to be stored AND we then cache the HASH twice more. Each retrieval still requires 2 lookups.

The minor advantage gained is that we can perform "isset" on Redis when receiving "If-None-Match" headers. We know that if it's in the cache, it's still valid. A 304 can then be generated in record time and we don't need to do a full retrieval, just a check that the hash is there (skipping the UUID or USERNAME lookups).
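
A minimal sketch of that shortcut, assuming the texture hash doubles as the ETag and that the cache exposes an existence check (e.g. a Redis `EXISTS`); none of this is existing imgd code:

```go
package main

import (
	"net/http"
	"strings"
)

// maybeNotModified answers 304 when the client's ETag (the texture hash) is
// still present in the cache, skipping the UUID/username lookups entirely.
// The exists func (e.g. backed by a Redis EXISTS) is an assumption here.
func maybeNotModified(w http.ResponseWriter, r *http.Request, exists func(hash string) bool) bool {
	etag := strings.Trim(r.Header.Get("If-None-Match"), `"`)
	if etag == "" || !exists(etag) {
		return false
	}
	w.WriteHeader(http.StatusNotModified)
	return true
}
```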

If we ever implemented a stale (grace) caching feature, this functionality would be lost. Also, probably not worth the minor gain to butcher the cache system.

_Probably prefer Idea 2_

connor4312 commented 9 years ago

There's also nothing terribly wrong with caching skins in memory on-instance. With some finesse, Go would certainly be able to handle this kind of workload, and we'd be able to create a lookup method for specifically this purpose without having to shape it to fit Redis or another service.

And as a passive effect, we'd (ideally) shave a few orders of magnitude off retrieval time, since we don't have to go through the whole process of talking to another service, transmitting the skins over the network, etc.

jomo commented 9 years ago

FYI: we had a very similar discussion here but ended up keeping something equivalent to your Idea 1, because we were unable to solve the rate-limiting problem of resolving usernames to UUIDs. (The username -> skin API is not rate limited.)

Do you have a solution for that problem?

connor4312 commented 9 years ago

If we were to roll our own caching, we'd basically have two 'hashmaps' (which may actually be prefix trees or something like that; we'll look at performance later), one for the username and one for the UUID, both pointing to the same instance of the skin hash. Then we get to index by both username and UUID with almost no extra memory consumption.
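
A minimal sketch of that layout, assuming a stand-in `Skin` struct rather than the real `minecraft.Skin`: both maps hold pointers to the same value, so the skin data itself is stored only once.

```go
package skincache

import "sync"

// Skin stands in for minecraft.Skin here; the real type lives in the
// github.com/minotar/minecraft package.
type Skin struct {
	Hash string
	PNG  []byte
}

// Cache indexes the same *Skin by both username and UUID, so only the two
// map entries cost extra memory.
type Cache struct {
	mu         sync.RWMutex
	byUsername map[string]*Skin
	byUUID     map[string]*Skin
}

func New() *Cache {
	return &Cache{
		byUsername: make(map[string]*Skin),
		byUUID:     make(map[string]*Skin),
	}
}

func (c *Cache) Put(username, uuid string, s *Skin) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.byUsername[username] = s
	c.byUUID[uuid] = s
}

func (c *Cache) ByUsername(name string) (*Skin, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	s, ok := c.byUsername[name]
	return s, ok
}

func (c *Cache) ByUUID(id string) (*Skin, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	s, ok := c.byUUID[id]
	return s, ok
}
```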

connor4312 commented 8 years ago

I did some experimentation with prefix trees versus maps as far as performance goes. It seems like plain old maps would be the way to go here; they hovered around 2000 ns/op, and I was unable to get prefix trees under 4000 ns/op.

Unsurprisingly, both were much, much faster than Redis.

connor4312 commented 8 years ago

I did some work in https://github.com/minotar/imgd/commit/94448a23039d71d16f8c4c1e2018e16561f60b91. Unsurprisingly, a memory-based cache outperformed Redis on writes by 22x, and on reads by about 300x. This was storing a short string, so the difference would probably be larger in prod when storing larger images (there's no increase on the Go side, since we never need to copy/write the memory anywhere).

BenchmarkMemoryInsert    200000      7686 ns/op
BenchmarkRedisInsert      10000    169751 ns/op
BenchmarkMemoryFind     3000000       587 ns/op
BenchmarkRedisFind        10000    184367 ns/op

Memory consumption seemed comparable; however, it's difficult to pin exact numbers on a system as large as Redis.
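
For reference, those results follow the standard Go `testing.B` shape; a minimal version run against the two-map cache sketched above might look like this (the cache API here is the earlier sketch, not the code from that commit):

```go
package skincache

import (
	"strconv"
	"testing"
)

// BenchmarkMemoryInsert measures repeated Put calls on the in-memory cache.
func BenchmarkMemoryInsert(b *testing.B) {
	c := New()
	for i := 0; i < b.N; i++ {
		n := strconv.Itoa(i)
		c.Put("user"+n, "uuid"+n, &Skin{Hash: n})
	}
}

// BenchmarkMemoryFind measures lookups of a key that is already present.
func BenchmarkMemoryFind(b *testing.B) {
	c := New()
	c.Put("LukeHandle", "some-uuid", &Skin{Hash: "abc"})
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		c.ByUsername("LukeHandle")
	}
}
```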

LukeHandle commented 8 years ago

The question is the advantages/disadvantages of moving from a central cache to a localised one.

Centralised means fewer requests to Mojang and allows us to distribute them among nodes, versus the speed improvements of going local.

I think, ideally, to allow us to make this decision with confidence (at least for minotar.net), we would need to know the time we are spending on each operation per request.
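
One way to gather those timings would be to wrap each stage of a request and log its duration; a small sketch (stage names and logging are illustrative, not existing imgd code):

```go
package main

import (
	"log"
	"time"
)

// timed runs one stage of a request and logs how long it took, so we can see
// where time goes: cache lookups vs. Mojang API calls vs. image processing.
func timed(stage string, fn func() error) error {
	start := time.Now()
	err := fn()
	log.Printf("%s took %s", stage, time.Since(start))
	return err
}

// Example usage inside a request handler (hypothetical stages):
//   _ = timed("cache lookup", func() error { /* ... */ return nil })
//   _ = timed("mojang fetch", func() error { /* ... */ return nil })
//   _ = timed("render head", func() error { /* ... */ return nil })
```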

ryush00 commented 6 years ago

Bump.

http://skins.minecraft.net/minecraftskins/username.png is dead.

https://bugs.mojang.com/browse/WEB-985

LukeHandle commented 6 years ago

To make it clear @ryush00, we have generally* now mitigated this issue and our service can be used as a drop-in replacement (see #181).

Eg. https://minotar.net/skin/LukeHandle

The download endpoint is also available if required. Further improvements will be made here over the coming weeks - but this is mostly to reduce our costs.

We put more money into this to ensure the stability of the API since the change. I'll be optimizing this to ensure it is sustainable cost-wise, but until then, I'll take the monetary hit.

*I can get the API success rate if anyone is interested, but it's good 👏

LukeHandle commented 6 years ago

So, in the end we went with:

  1. Username -> UUID
  2. UUID -> UserData (gob including texture path)
  3. TexturePath -> PNG

The raw skin data (3) is currently in-memory - using https://github.com/minotar/imgd/commit/94448a23039d71d16f8c4c1e2018e16561f60b91 heavily for inspiration (thanks @connor4312 - also we miss you/hope you are doing good :+1: )

Whereas for caches 1 & 2 we are heavily rate-limited, 3 is less painful and centralisation is less important. Due to the incredibly low memory footprint of imgd, we also have spare RAM to play with on those hosts. Consequently, we can give Redis more memory to run with and have bigger caches there.
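
Putting that together, the three tiers amount to something like the following sketch (the `Store` interface, key handling, and helper are assumptions, not the actual imgd code; fetching from Mojang on a cache miss is elided):

```go
package main

import "time"

// Store is a minimal key/value abstraction; tiers 1 & 2 would sit in Redis,
// tier 3 in the in-memory cache.
type Store interface {
	Get(key string) ([]byte, bool)
	Set(key string, value []byte, ttl time.Duration)
}

type Caches struct {
	UsernameToUUID Store // 1. Username -> UUID          (Redis, rate-limit sensitive)
	UUIDToUserData Store // 2. UUID -> UserData gob      (Redis, includes texture path)
	TexturePNG     Store // 3. TexturePath -> raw PNG    (in-memory)
}

// skinPNG walks the three tiers: username -> UUID -> user data -> texture PNG.
func (c *Caches) skinPNG(username string) ([]byte, bool) {
	uuid, ok := c.UsernameToUUID.Get(username)
	if !ok {
		return nil, false
	}
	data, ok := c.UUIDToUserData.Get(string(uuid))
	if !ok {
		return nil, false
	}
	texturePath := texturePathFromGob(data) // decode the gob'd user data
	return c.TexturePNG.Get(texturePath)
}

// texturePathFromGob is a hypothetical stand-in for decoding the gob'd UserData.
func texturePathFromGob(data []byte) string { return string(data) }
```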

Still improvements we can make here, but good stuff :+1: