odota / web

React web interface for the OpenDota platform
https://www.opendota.com
MIT License

[Priority Support] /recentMatches API has been unstable for several weeks #3210

Closed ramezsw closed 1 month ago

ramezsw commented 1 month ago

The dota /recentMatches API has been having issues and delays in reporting match data for players.

https://api.opendota.com/api/players/1163110468/recentMatches

For example, today (29 September) the API is returning an uninformative "Internal Server Error" when fetching data for some users. We know these users have been playing recently, their privacy settings are set to the correct values, and their matches are visible in other online dota trackers.

image

For the same player, we can see the recent match on https://www.dotabuff.com/matches/7965626122
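To make reports like this easier to reproduce, here is a minimal sketch of how a client might tell a server-side failure apart from a player who simply has no recent matches. It uses the endpoint linked above; the helper names `classify` and `fetch_recent_matches` are hypothetical, and it assumes a successful response is a JSON array of match objects.

```python
import json
import urllib.error
import urllib.request

API_BASE = "https://api.opendota.com/api"  # base of the URL linked above


def classify(status, body):
    """Label a response as 'ok', 'empty', 'server_error', or 'other'.

    Assumption: a successful response is a JSON array of match objects.
    """
    if status >= 500:
        return "server_error"  # e.g. the "Internal Server Error" case
    if status != 200:
        return "other"
    try:
        matches = json.loads(body)
    except ValueError:
        return "other"  # a 200 whose body is not valid JSON
    if not isinstance(matches, list):
        return "other"
    return "ok" if matches else "empty"


def fetch_recent_matches(account_id, timeout=10.0):
    """Return (status, body) for /players/{account_id}/recentMatches."""
    url = f"{API_BASE}/players/{account_id}/recentMatches"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; keep the status and body for classification
        return err.code, err.read().decode("utf-8", errors="replace")
```

For example, `classify(*fetch_recent_matches(1163110468))` would report `server_error` while the endpoint is failing, which is different from `empty` for a player with no visible matches.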

Additionally, for the players that do return results, the data is not up to date and is at least several hours behind, as seen in the screenshot below.

image

Same player, with the actual recent match: https://www.dotabuff.com/matches/7965535188
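The lag described above can be put into numbers with a small sketch like the one below. It assumes each entry in the /recentMatches response carries a `start_time` field (Unix seconds) and a `duration` field (seconds); those field names are assumptions here and should be checked against the actual response. The reference end time is whatever another tracker shows for the player's true latest match.

```python
from datetime import datetime, timezone


def newest_match_end(matches):
    """Return the end time (UTC) of the most recent match, or None.

    Assumption: each match has 'start_time' (Unix seconds) and
    optionally 'duration' (seconds).
    """
    ends = [
        m["start_time"] + m.get("duration", 0)
        for m in matches
        if "start_time" in m
    ]
    if not ends:
        return None
    return datetime.fromtimestamp(max(ends), tz=timezone.utc)


def hours_behind(matches, reference_end):
    """Hours between the API's newest match and a known newer match end."""
    newest = newest_match_end(matches)
    if newest is None:
        return None
    return (reference_end - newest).total_seconds() / 3600
```

Logging this value on a schedule would show whether the delay is constant or comes in bursts, which is useful detail to attach to a report.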

Expected outcome:

This delay has been occurring intermittently since September 15th. We are looking to understand the cause of this problem and to get confirmation of the reliability of the dota API, especially when it is used in "time-critical applications" that cannot afford a multiple-hour delay in response data. Understanding this problem, and whether to expect such delays as often as they happen now, will help us decide whether to continue using and paying for this API or to migrate to a more stable API provider.

howardchung commented 1 month ago

Looks like we had a Cassandra node issue that was blocking new match ingestion for a few hours (and failing to return responses for some players). We're rebooting it and should be caught up soon.

In the logs there appears to be some kind of memory leak issue. In general Cassandra has caused us some stability issues, so we may look to migrate to Scylla when possible.

We try to respond to issues as quickly as possible, but cannot always do so immediately. While we try to maintain a reasonable level of uptime, if you need critical levels of uptime you may have to consider running the code yourself.