ronoaldo / ap-5r

Discord chatbot for Star Wars Galaxy of Heroes
Apache License 2.0

Lookup is not showing all results properly #22

Open toby200 opened 6 years ago

toby200 commented 6 years ago

Lookup command doesn't always return everybody that it should.

For example:

[6:47 PM] Tobiwan: /lookup Colonel Starck +7star
[6:47 PM] BOTAP-5R Protocol Droid: 2 players have Colonel Starck [+7star].
[6:47 PM] BOTAP-5R Protocol Droid: shadouw2
tyrovest

[6:42 PM] Tobiwan: /lookup Colonel Starck +6star
[6:42 PM] BOTAP-5R Protocol Droid: 5 players have Colonel Starck [+6star].
[6:42 PM] BOTAP-5R Protocol Droid: darsh
mayavi
murilo84sjc
shadouw2
tyrovest

However, /server-info shows that there should be 3 players at 7 stars, not 2, and 9 at 6 stars or higher (3 + 6), not 5.

[10:35 AM] Tobiwan: /server-info Colonel Starck
[10:36 AM] BOTAP-5R Protocol Droid: From 176 AP-5R Bot Factory players, 32 have Colonel Starck

Stars:
3 at 7 stars
6 at 6 stars
14 at 5 stars
9 at 4 stars

ronoaldo commented 6 years ago

This looks a little odd, but it happens because of cached data. Long story short: I started out fetching data directly from the website for every command. Once the bot started to get rate-limited by the website, I implemented a cache, but not all commands use it. Hence server-info and lookup can return different results, for the reasons outlined below.

Background on the caching

After several attempts to write a useful caching system that did not use a lot of memory, the best option was to parse the data once (profiles change only about once every 24 hours) and save it to a shared DB from which AP and other copies of it could load information. I then launched https://swgoh-api.appspot.com/, a hosted API tool that crawls the site data and returns JSON (for example, https://swgoh-api.appspot.com/v1/profile/ronoaldo). This tool returns saved data from Cloud Datastore, and this data is updated every 4 hours. A related issue is when this cache does not refresh. This can happen due to a bug in the parsing library: when a new character is introduced, the cache is not rebuilt and the API returns stale data. The API uses an IP pool and very gentle crawling, so rate limits there are uncommon or harmless.
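To make the caching idea concrete, here is a minimal sketch of a time-based cache in Python. It is not the code behind the bot or the API: `ProfileCache`, `fetch`, and `ttl_seconds` are hypothetical names, and it stores entries in memory where the real setup uses a shared DB. It also assumes that a failed refresh falls back to the old entry, which is one way stale data can end up being served.

```python
import time
from typing import Callable, Dict, Tuple


class ProfileCache:
    """Keeps parsed profiles and re-fetches them once they are older than ttl_seconds."""

    def __init__(self, fetch: Callable[[str], dict], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # hypothetical: crawls and parses one profile
        self._ttl = ttl_seconds  # e.g. 4 * 3600 for a 4-hour refresh
        self._clock = clock      # injectable so the cache can be tested
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, player: str) -> dict:
        now = self._clock()
        entry = self._entries.get(player)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # still fresh: no request to the website
        try:
            profile = self._fetch(player)
        except Exception:
            if entry is not None:
                return entry[1]  # refresh failed: fall back to stale data
            raise                # nothing cached yet: surface the error
        self._entries[player] = (now, profile)
        return profile
```

With this shape, a parsing failure during refresh does not break the command, but the caller keeps seeing the old profile until a later refresh succeeds.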

What are the root causes of this issue?

Things can go wrong with server-info if the site rate-limits us: you may get inconsistent data between calls because the bot handles a rate-limit error as if the player does not have the character.
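The failure mode described here can be avoided by keeping "no data" separate from "does not have the character". The sketch below is only an illustration in Python: `RateLimited`, `count_owners`, and `fetch_stars` are hypothetical names, not part of the bot.

```python
from typing import Callable, Iterable, List, Optional, Tuple


class RateLimited(Exception):
    """Hypothetical error raised when the website refuses a request."""


def count_owners(
    players: Iterable[str],
    character: str,
    fetch_stars: Callable[[str, str], Optional[int]],
) -> Tuple[int, List[str]]:
    """Return (number of owners, players whose data could not be fetched)."""
    owners = 0
    unknown: List[str] = []
    for player in players:
        try:
            # fetch_stars returns None when the player does not own the character
            stars = fetch_stars(player, character)
        except RateLimited:
            # No data for this player: record it instead of counting "not owned"
            unknown.append(player)
            continue
        if stars is not None:
            owners += 1
    return owners, unknown
```

A command built this way could report how many players are unknown, so a rate-limited run is visibly incomplete instead of silently undercounting.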

Things can go wrong with lookup if the API has stale data compared with the website, either because the API has an error and fails to update, or because the profile was updated but the cache was not yet invalidated (the cache is valid for at least 24 hours).

Suggestions to fix

Ideally, both server-info and lookup should fetch data from the API only, rather than parsing the character data directly from the website for each player on each command. This fix will make the two commands consistent.
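A rough Python sketch of why a single data source makes the two commands agree: both functions below read the same snapshot, so their counts cannot drift apart. The `Snapshot` shape, the function names, and the player names in the usage note are assumptions for illustration, not the API's actual JSON format or real guild data.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical snapshot shape: player name -> {character name -> stars}
Snapshot = Dict[str, Dict[str, int]]


def lookup(snapshot: Snapshot, character: str, min_stars: int) -> List[str]:
    """Players who own `character` at `min_stars` or more, sorted by name."""
    return sorted(
        player
        for player, roster in snapshot.items()
        if roster.get(character, 0) >= min_stars
    )


def server_info(snapshot: Snapshot, character: str) -> Dict[int, int]:
    """Number of owners of `character`, grouped by star level."""
    return dict(
        Counter(
            roster[character]
            for roster in snapshot.values()
            if character in roster
        )
    )
```

For example, with a made-up snapshot `{"p1": {"Colonel Starck": 7}, "p2": {"Colonel Starck": 6}, "p3": {}}`, `lookup(..., 6)` lists two players and the 6-star and 7-star counts from `server_info` also add up to two.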

A second patch on the API to better handle caching (perhaps being more aggressive with caching?) should then make both commands as correct as possible given the constraints.