arenanet / api-cdi

Collaborative Development Initiative for Public APIs

WvW Kills & Deaths Only Update at Tick End #417

Open mcd1992 opened 7 years ago

mcd1992 commented 7 years ago

Separate issue as asked in https://github.com/arenanet/api-cdi/issues/412#issuecomment-268037670

Currently the /v2/wvw/matches and stats endpoints only update their kills/deaths when 'reading the logs emitted by the map instance servers'. I'm seeing a correlation suggesting this log line is emitted when a map's tick ends. Sometimes it isn't, though; I presume that's due to the 'best-effort' system (UDP?) that gets data to the API servers.
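
One way to check this correlation is to diff successive polls of a match object and note which fields moved. A minimal sketch, assuming the match response carries top-level `scores`, `kills` and `deaths` objects keyed by team colour; the field names are my assumption, `changed_fields` is a hypothetical helper, and the snapshots are hand-made rather than real match data:

```python
def changed_fields(prev, curr, fields=("scores", "kills", "deaths")):
    """Return the names of the top-level fields that differ between two snapshots."""
    return [name for name in fields if prev.get(name) != curr.get(name)]


# Hand-made snapshots for illustration only (not real match data).
tick_start = {
    "scores": {"red": 100, "blue": 90, "green": 80},
    "kills": {"red": 10, "blue": 8, "green": 5},
    "deaths": {"red": 7, "blue": 9, "green": 7},
}
# Mid-tick poll: only the scores have moved.
mid_tick = {**tick_start, "scores": {"red": 103, "blue": 91, "green": 80}}
# Poll just after the tick ends: kills and deaths catch up as well.
tick_end = {
    "scores": {"red": 110, "blue": 95, "green": 84},
    "kills": {"red": 14, "blue": 9, "green": 6},
    "deaths": {"red": 8, "blue": 12, "green": 9},
}

print(changed_fields(tick_start, mid_tick))  # ['scores']
print(changed_fields(mid_tick, tick_end))    # ['scores', 'kills', 'deaths']
```

Logging the result of each poll alongside a timestamp would show whether kills/deaths changes cluster around tick boundaries.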

If possible, it would be nice to have kills and deaths updated as quickly as score/points currently are. Not sure if this would require some infrastructure changes that are out of the project scope or what-have-you. Just putting in the issue/request so it's noted.

On a side note (don't answer if it's top secret arenanet stuffs), what are some of the current infra problems? You say best-effort, which makes me think logs are sent from the instance servers to your API systems via UDP (GELF/Syslog?). At my workplace we have a very noisy network, which made me switch from GELF to the Beats protocol to ensure I don't miss any login/auth attempts on our servers. Logstash+Filebeat works on Windows (I noticed IIS headers from the API servers) and is free/open-source. With Logstash you could take the syslog(?) data from the machines and 'convert' it to the Beats protocol across the noisy network, then have another Logstash instance convert Beats back to syslog. Kind of a drop-in workaround for reliable transport? Might be of interest?

Regardless, thanks for your API work @lye !

lye commented 7 years ago

On a side note (don't answer if it's top secret arenanet stuffs), what are some of the infra. problems currently.

With the exception of the map instance servers, most of the backend infrastructure uses a request-response messaging format to send and retrieve data. The messages are routed over a star-topology mesh to one of potentially several servers that can handle that type of message (or that are responsible for the object the message is querying -- kind of like Riak's coord networking model). The servers responsible for routing the messages will, during adverse conditions, apply backpressure to prevent the network from falling over: they'll basically start returning timeouts and such, which the senders then have to handle. I believe that the logs coming off the map instance servers are only buffered up to a limit, so if they can't be successfully emitted they're eventually dropped. I haven't read through that particular code, but that's my limited understanding, at least.
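
To illustrate the "buffered up to a limit, then dropped" behaviour in general terms, here is a rough sketch of a bounded log buffer. This is not the actual server code: `BoundedLogBuffer` and the `send` callback are hypothetical, and evicting the oldest message first is an assumption (the real system might drop the newest instead).

```python
from collections import deque


class BoundedLogBuffer:
    """Holds at most `capacity` unsent messages; the oldest is evicted when full."""

    def __init__(self, capacity):
        self._queue = deque(maxlen=capacity)
        self.dropped = 0  # count of messages lost to overflow

    def push(self, message):
        if len(self._queue) == self._queue.maxlen:
            # The buffer is full, so this append evicts the oldest message.
            self.dropped += 1
        self._queue.append(message)

    def flush(self, send):
        """Emit buffered messages in order; stop at the first failed send.

        `send` is a hypothetical callback returning True on success.
        Returns the number of messages emitted.
        """
        sent = 0
        while self._queue:
            if not send(self._queue[0]):
                break  # downstream is pushing back; keep the rest buffered
            self._queue.popleft()
            sent += 1
        return sent
```

With a capacity of 2, pushing three messages while the downstream is unavailable loses the first one; once sends succeed again, only the last two are delivered. That is the kind of gap that would show up as a missed kills/deaths update.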

There was a GDC talk (I think) a few years ago about our distributed infrastructure. It's all entirely bespoke, for better or worse.

Anyway, looking at the actual log message that contains the KDR -- I believe the message is only emitted by the map instance server when one of the following occurs:

The score updates are totally orthogonal to the KDR message -- the KDR message is kind of a legacy diagnostic that I might have abused a bit. The scores use a separate system that has better properties for latency/delivery guarantees. Ultimately I need to gut the current implementation and replace it with something that's a little more flexible (there's a log message emitted whenever a player dies, I think).

(details mostly for my future benefit since winter vacation starts next week)

mcd1992 commented 7 years ago

Wow, that's way more intricate than I was imagining.

I believe this is the GDC talk you're referring to: http://www.gdcvault.com/play/1016640/Guild-Wars-2-Programming-the -- it looks interesting, and I'll be sure to give it a watch sometime.

Thanks for the details and have a great vacation 🎉