Closed · kihashi closed this issue 6 years ago
Awesome, thanks for thinking about this. So far, the patches are mainly used for the match/matchlist endpoints, so it would be ideal if they synced up well with matches. I've been leaning towards your (2), where we use the datetimes when matches were played to figure out the correct time. Like you said, that is quite a bit more difficult. Even your (1) option is better than what we have now.
Ideally there would be a single key dedicated to making these calls, so even if it takes quite a few calls it won't be a big deal. Even the rate limits for a dev key should be good enough.
I know that the static data endpoints pull their data directly from datadragon, so they will be updated at the same time as datadragon.
I thought gameIds and matchIds were the same. If you save a current game's gameId, wait an hour, then query the match endpoint with that saved gameId, don't you get the same match back? If so, that's a pretty good way of always keeping track of very recently played games.
Another issue involved here is getting the updated patches to users. We can update it on github, but they won't have it locally. My guess is that we will need to configure Cass to treat patches like any other data, and pull it from some json file online.
Last, in-progress games are stopped if Riot rolls over a patch, so getting the patch datetime to within an hour should be considered "exact" imo.
Also, patches should be set per-region. Right now they are global, but that's not actually correct.
I thought gameIds and matchIds were the same.
You may be right. I know back in the old API there was a /games endpoint and a /matches endpoint which had different ids, and since the gameId I was getting wasn't working, I assumed that was still true-ish. But I did not wait until the game was over, so that's probably what it was.
If that's the case, then we can do something like this:
The only thing I haven't thought of here is how to deal with downtime (for example, if the program stops running and a patch is deployed while it is down).
Also, patches should be set per-region. Right now they are global, but that's not actually correct.
Yeah. I meant to mention that in the OP. This logic will need to run for each region. We'll probably have to change the structure of the patches.json file (or else have multiple) to account for multiple regions.
Something like:
...
{
    "season": "Season 7",
    "name": "7.16",
    "start": {
        "NA": 1502251200.0,
        "EUW": 1502251200.0,
        ...
    },
    "end": {
        "NA": 1503460800.0,
        "EUW": 1503460800.0,
        ...
    },
},
...
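To show how this per-region structure could be consumed, here is a hypothetical lookup helper. `patch_at` is not a real Cassiopeia function; it assumes the `patches` list follows the layout above and is sorted by start time:

```python
import bisect

def patch_at(patches, region, timestamp):
    """Return the patch dict active in `region` at the given Unix timestamp.

    Assumes `patches` is a list of dicts shaped like the JSON above,
    sorted ascending by per-region "start" timestamp.
    """
    starts = [p["start"][region] for p in patches]
    # bisect_right finds the first patch starting *after* the timestamp;
    # the entry just before it is the active patch.
    i = bisect.bisect_right(starts, timestamp)
    if i == 0:
        return None  # timestamp predates the earliest known patch
    return patches[i - 1]
```

With the example data above, `patch_at(patches, "NA", 1502300000.0)` would return the 7.16 entry.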
Another issue involved here is getting the updated patches to users.
Yeah. You can just have it pulled from GitHub or some other CDN. The updated version should always be available at this address: https://raw.githubusercontent.com/meraki-analytics/cassiopeia/master/cassiopeia/patches.json.
You can have Cass download it whenever the master list is grabbed, and fall back to the local file if for some reason the master list is not available.
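A minimal sketch of that download-with-fallback logic. The URL is the one above; `LOCAL_PATCHES` is a hypothetical path to the bundled copy:

```python
import json
import urllib.request

PATCHES_URL = ("https://raw.githubusercontent.com/meraki-analytics/"
               "cassiopeia/master/cassiopeia/patches.json")
LOCAL_PATCHES = "patches.json"  # hypothetical path to the shipped copy

def load_patches(url=PATCHES_URL, local_path=LOCAL_PATCHES):
    """Prefer the master list online; fall back to the local file."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return json.load(response)
    except (OSError, ValueError):
        # network failure or bad payload -- use the local copy instead
        with open(local_path) as f:
            return json.load(f)
```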
This all sounds good. I like the new structure of the json data too.
The only thing I haven't thought of here is how to deal with downtime (for example, if the program stops running and a patch is deployed while it is down).
One possibility is to write a script that searches for a specific patch start datetime. Given a patch number, it will go through matches (probably like what you suggested by using masters + challengers players' match histories) and try to find when the patch started.
That's necessary for getting old patch data correct too.
If that's the case, then we can do something like this:
Sounds good. I imagine two functions and a main, roughly like this:
def get_featured_game():
    """Returns a Summoner's Rift featured (current) game if one exists; else None."""

def get_match_data(match_id):
    """Returns the match's version and start datetime."""

def main():
    """Every 5 minutes get a new featured game. Put it in a queue.
    After a featured game is requested, check the queue. If a game has been in the
    queue for an hour, get the match data.
    If the match's version/patch is new, update the patch info (make sure to flush to disk).
    """
Feel free to modify of course.
Update: We moved this file to https://github.com/CommunityDragon/Data/blob/master/patches.json
Below is the code that can be used to update it. This code could be run on a server using crontab, and the push to update the data could be done automatically (maybe using a PR with additional information about the error checking output). There should be some error checking to make sure the automatically identified timestamp is correct. Sometimes one region releases much later than the others, for example, and that will throw off the mean calculation.
import datetime
import json
import os
from pathlib import Path

import arrow
import datapipelines
import numpy as np
from natsort import natsorted

import cassiopeia as cass
from cassiopeia import Region, Queue

FILEPATH = Path(os.path.expandvars("$HOME/.../CommunityDragon/Data/patches.json"))
"""
Current thoughts on auto-updating patch info:
Check every 6 hours for updates to a Realms endpoint.
If the version is updated (and we don't have patch info for it), run the below script.
Then put the output in the patches.json file and push it.
"""
def find_patch_start_date(start: arrow.Arrow, patch_name: str, region: Region,
                          allowable_interval=datetime.timedelta(hours=6)):
    start = start - datetime.timedelta(days=1)
    end = arrow.now()
    challengers = cass.get_challenger_league(Queue.ranked_solo_fives, region=region)
    for entry in challengers.entries:
        summoner = entry.summoner
        mh = cass.get_match_history(summoner=summoner,
                                    region=region,
                                    begin_time=start,
                                    end_time=end,
                                    queues={Queue.ranked_solo_fives})
        for match in mh:
            try:
                match_patch_name = '.'.join(match.version.split('.')[:2])
            except datapipelines.NotFoundError:
                continue
            if match_patch_name == patch_name and match.creation < end:
                end = match.creation
                print(f"New patch end time: {end}")
            patch_major_minor = patch_name.split(".")[:2]
            patch_major_minor = (int(patch_major_minor[0]), int(patch_major_minor[1]))
            match_major_minor = match_patch_name.split(".")[:2]
            match_major_minor = (int(match_major_minor[0]), int(match_major_minor[1]))
            if match_major_minor < patch_major_minor and match.creation > start:
                start = match.creation
                print(f"New patch start time: {start}")
            if end - start < allowable_interval:
                return start, end
            if match.creation < start or match.creation > end:
                break
    print("WARNING! Did not converge.")
    return start, end
def get_unknown_patches(use_versions_endpoint=False):
    with open(FILEPATH) as f:
        patches = json.load(f)["patches"]
    missing = set()
    for region in Region:
        realms = cass.get_realms(region=region)
        latest_versions = natsorted(realms.latest_versions.values())
        latest_version = latest_versions[-1]
        latest_version = ".".join(latest_version.split(".")[:2])
        if use_versions_endpoint:
            versions_latest_version = cass.get_versions(region="NA")[0]
            versions_latest_version = ".".join(versions_latest_version.split(".")[:2])
            latest_version = natsorted([latest_version, versions_latest_version])[1]
        latest_patch = patches[-1]["name"]
        if latest_patch != latest_version:
            missing.add(latest_version)
    return sorted(missing)
def update_patch_data(region_results, patch_name):
    with open(FILEPATH) as f:
        patch_data = json.load(f)
    shifts = patch_data["shifts"]
    for region, ts in region_results.items():
        region_results[region] = ts.shift(seconds=-shifts[region])
    for region, ts in region_results.items():
        print(region, ts)
    mean = arrow_mean(region_results.values())
    # Assume the patch was released at 8 AM UTC
    previous_day = arrow.get(mean.shift(days=-1).date()) + datetime.timedelta(hours=8)
    today = arrow.get(mean.shift(days=0).date()) + datetime.timedelta(hours=8)
    next_day = arrow.get(mean.shift(days=1).date()) + datetime.timedelta(hours=8)
    days = [previous_day, today, next_day]
    diffs = [abs(mean - day) for day in days]
    correct_day = days[np.argmin(diffs)]
    season = cass.Season.season_8.id
    patch = {"name": patch_name, "start": correct_day.timestamp, "season": season}
    patch_data["patches"].append(patch)
    with open(FILEPATH, "w") as f:
        json.dump(patch_data, f, indent=2)

def arrow_mean(arrows):
    mean = np.mean([dt.timestamp for dt in arrows])
    return arrow.get(mean)
def main():
    missing = get_unknown_patches(use_versions_endpoint=True)
    if not missing:
        return
    print("Missing:")
    print(missing)
    print()
    # Handle one patch at a time so each call to update_patch_data only
    # sees results for that patch.
    for patch_name in missing:
        results = {}
        for region in Region:
            print("{}: Finding start time for patch: {}...".format(region, patch_name))
            start, end = find_patch_start_date(start=arrow.now() - datetime.timedelta(days=100),
                                               patch_name=patch_name,
                                               region=region)
            middle = start + (end - start) / 2
            print(start, end, end - start)
            print(region, patch_name, middle.timestamp)
            results[region.platform.value] = middle
        update_patch_data(results, patch_name)
    cass.configuration.settings.clear_sinks(cass.Patch)


if __name__ == "__main__":
    main()
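For reference, the "check every 6 hours" idea from the comment above could be scheduled with a crontab entry like this (the script path and log location are hypothetical):

```
# every 6 hours, re-check for new patches and log the output for review
0 */6 * * * python3 /path/to/update_patches.py >> /var/log/patch-updater.log 2>&1
```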
From http://cassiopeia.readthedocs.io/en/latest/contributing.html#contributions:
I have a couple ideas of how to go about this and I was hoping to get some feedback before I started working on a PR.
(1) Write a script that hits the versions endpoint (/lol/static-data/v3/versions) every hour (or some other appropriate interval) and checks to see if there is a change. If there is, add the new patch, mark that time as the patch start time and as the previous patch's end time. The caveat is that static data is pulled from Data Dragon, and [it's not guaranteed](https://discussion.developer.riotgames.com/questions/30/how-long-does-it-take-static-data-data-dragon-to-u.html) how quickly that updates after a patch.

(2) Use the datetimes when matches were played to figure out when the patch started. This is more difficult, but more accurate.

Both of these approaches get you a fairly accurate start time, but not an exact one. It should be good enough for most purposes though. Does anyone have any opinions or suggestions about the methods I proposed above?