ping optimization: break out tasks

there was an old bug about ping optimization and i want to bring it back, on the topic of scalability. the problem is that pings are sent to all clients in a room, which is no biggie by itself, but pings don't scale. they contain the user list for the whole "world", so every ping grows with the total user count, and because every client receives one, the total cost grows roughly quadratically with the number of users.
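
a rough cost model, to make the scaling concrete. the symbols are assumptions for illustration, not measurements: $N$ is the number of connected users in the world and $b$ is the average size in bytes of one user entry in the payload. each of the $N$ clients receives a ping carrying about $N$ entries, so one round of pings costs about:

$$
\text{bytes per ping round} \approx N \cdot (N \cdot b) = b N^2
$$

so doubling the user count roughly quadruples ping traffic, even if the ping interval stays the same.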

pings currently do many things at once, which is also a problem: i have seen in profiling that pings cause a big spike in CPU and allocations because of all this. it's where we manually clean up abandoned users and graffiti, calculate ping, and gather up the world population.

doing it all at the same time was considered an optimization. it's better to calculate world pop once and send it to all users than to calculate it each time a client asks for it on demand. but there are other ways.

improvement proposal: separate these tasks into individual timers and just cache the results. let clients fetch on demand, probably on another timer. this will allow us to do things MUCH less frequently without loss of UX. and it also decouples tasks so there's no spike in CPU.
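
a minimal sketch of the "separate timer + cached result" idea, in TypeScript. every name here (`CachedTask`, `demoRooms`, `worldPopulation`) is made up for illustration and is not taken from the digifujam codebase. the point is only that the expensive calculation runs on its own timer, and client requests read the cached value instead of triggering a recalculation.

```typescript
// Hypothetical helper: recomputes a value on its own timer and serves the cached copy.
class CachedTask<T> {
  private value: T;
  private timer: ReturnType<typeof setInterval> | null = null;
  private readonly compute: () => T;
  private readonly intervalMs: number;

  constructor(compute: () => T, intervalMs: number) {
    this.compute = compute;
    this.intervalMs = intervalMs;
    this.value = compute(); // prime the cache so readers never see an empty value
  }

  // Begin refreshing on a fixed interval; calling start() twice is a no-op.
  start(): void {
    if (this.timer !== null) return;
    this.timer = setInterval(() => this.refresh(), this.intervalMs);
  }

  stop(): void {
    if (this.timer !== null) {
      clearInterval(this.timer);
      this.timer = null;
    }
  }

  // Recompute immediately (also what the timer calls).
  refresh(): void {
    this.value = this.compute();
  }

  // Cheap read: never recomputes, so it is safe to call per client request.
  get(): T {
    return this.value;
  }
}

// Example data (assumed shape): room name -> list of user ids.
const demoRooms = new Map<string, string[]>([
  ["lobby", ["a", "b"]],
  ["studio", ["c"]],
]);

let demoComputeCount = 0;
const worldPopulation = new CachedTask<number>(() => {
  demoComputeCount++;
  let total = 0;
  for (const users of demoRooms.values()) total += users.length;
  return total;
}, 10000);
```

on a real server you would call `worldPopulation.start()` once at startup and answer client fetches with `worldPopulation.get()`. cleanup of abandoned users or graffiti could be its own `CachedTask`-style timer with a much longer interval, so the work is spread out instead of landing in one spike.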

and for calculating ping MS, it can now be done on a per-user basis.
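
a sketch of what per-user ping measurement could look like. this is an assumption about the approach, not the actual implementation: the server remembers when it sent each ping, the client echoes a token back, and the round trip is the difference. `UserPingTracker` and `staggerOffsetMs` are hypothetical names. the second function shows one simple way to give each user a different starting offset so the pings don't all fire on the same tick.

```typescript
// Hypothetical per-user tracker: one instance per connected user.
class UserPingTracker {
  private sentAt = new Map<number, number>(); // token -> send time in ms
  private nextToken = 0;
  pingMs: number | null = null; // last measured round trip, null until first reply

  // Record that a ping is being sent; returns the token the client must echo back.
  beginPing(nowMs: number): number {
    const token = this.nextToken++;
    this.sentAt.set(token, nowMs);
    return token;
  }

  // Handle the echoed token. Returns the round trip in ms, or null for unknown tokens.
  completePing(token: number, nowMs: number): number | null {
    const sent = this.sentAt.get(token);
    if (sent === undefined) return null;
    this.sentAt.delete(token);
    this.pingMs = nowMs - sent;
    return this.pingMs;
  }
}

// Spread first-ping delays evenly across one interval so users don't ping in lockstep.
function staggerOffsetMs(userIndex: number, userCount: number, intervalMs: number): number {
  if (userCount <= 0) return 0;
  return Math.floor(((userIndex % userCount) * intervalMs) / userCount);
}

// Example: a ping sent at t=1000 ms and echoed at t=1042 ms.
const demoTracker = new UserPingTracker();
const demoToken = demoTracker.beginPing(1000);
const demoRoundTrip = demoTracker.completePing(demoToken, 1042);
```

in real code the timestamps would come from `Date.now()` (or a monotonic clock), and each user's timer would be started after `staggerOffsetMs(...)` has elapsed.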

related : #14

Notes after implementation: Massive optimizations, even though it won't affect much without many users.

Pings are now very light, ONLY for calculating pingMS. They are done on a per-user interval to help stagger traffic.
Room object cleanup is now done infrequently, and separately from pings.
"World state", containing user data for each room, is now cached, available for clients to fetch on demand. It's rebuilt on a timer on the server, globally. Previously this was done for each room.
World state also only contains the 7jam users, no external users; they were not visible anyway
User ping & stats info is sent in a separate request, "RoomUserPings" or so. It contains only relevant data for the current room.
Both payloads serialize user data as arrays instead of objects, saving about 50% of space.
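
To illustrate the array serialization in the last point: a minimal sketch, assuming a user record with three fields. The field names (`userID`, `name`, `pingMS`) and the helpers `packUser` / `unpackUser` are hypothetical and not the real payload schema. The saving comes from not repeating the property names for every user; the field order becomes the implicit schema, so both sides must agree on it.

```typescript
// Assumed record shape, for illustration only.
interface UserInfo {
  userID: string;
  name: string;
  pingMS: number;
}

type PackedUser = [string, string, number];

// Object -> positional array. The order here IS the wire format.
function packUser(u: UserInfo): PackedUser {
  return [u.userID, u.name, u.pingMS];
}

// Positional array -> object, using the same order as packUser.
function unpackUser(row: PackedUser): UserInfo {
  const [userID, name, pingMS] = row;
  return { userID, name, pingMS };
}

const demoUsers: UserInfo[] = [
  { userID: "u1", name: "alice", pingMS: 42 },
  { userID: "u2", name: "bob", pingMS: 87 },
];

const asObjects = JSON.stringify(demoUsers);
const asArrays = JSON.stringify(demoUsers.map(packUser));

// The client reverses the packing after parsing.
const roundTripped = (JSON.parse(asArrays) as PackedUser[]).map(unpackUser);
```

The actual ratio depends on how long the property names are compared to the values, so the figure of about 50% above is specific to the real payload and not something this toy example reproduces exactly.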

Overall the "ping" payload is now staggered, calculated at the global level, and went from about 14 KB to 700 bytes.

I did no performance measurements other than some size profiling. Doing tests for scale is a bore.

thenfour / digifujam

ping optimization: break out tasks #257