ShoobyDoo / OPGG.py

An unofficial Python library for accessing OPGG data.
BSD 3-Clause "New" or "Revised" License

Update cacher to account for the new object fields #31

Closed ShoobyDoo closed 2 months ago

ShoobyDoo commented 2 months ago

During the ongoing development of v1.4.0, many of the objects that had missing attributes were overhauled completely. The cacher, however, is still caching data based on the old object parameters. This needs to be corrected so that the cacher can properly rebuild objects cached after v1.4.0.

Additionally, some logic needs to be put in place to detect when rebuilding objects from the cache fails due to key errors caused by older cache files. Handling this would just involve removing the old cache database and rebuilding it from scratch.
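
As a rough illustration of the stale-cache handling described above, here is a minimal sketch. It assumes the cache is a SQLite file holding JSON payloads; the `CachedSummoner` fields, the `summoners` table, and the helper names are all hypothetical and are not taken from the library's actual cacher.

```python
import json
import os
import sqlite3
from contextlib import closing
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedSummoner:
    # Hypothetical post-v1.4.0 shape; the field names are illustrative only.
    summoner_id: str
    name: str
    level: int


def init_cache(db_path: str) -> None:
    """Create the (hypothetical) cache table if it does not exist yet."""
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS summoners "
            "(name TEXT PRIMARY KEY, payload TEXT NOT NULL)"
        )
        conn.commit()


def store_payload(db_path: str, name: str, payload: dict) -> None:
    """Store a JSON payload under the given summoner name."""
    init_cache(db_path)
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute(
            "INSERT OR REPLACE INTO summoners (name, payload) VALUES (?, ?)",
            (name, json.dumps(payload)),
        )
        conn.commit()


def load_summoner(db_path: str, name: str) -> Optional[CachedSummoner]:
    """Rebuild an object from cache; wipe the cache if its layout is outdated."""
    init_cache(db_path)
    with closing(sqlite3.connect(db_path)) as conn:
        row = conn.execute(
            "SELECT payload FROM summoners WHERE name = ?", (name,)
        ).fetchone()
    if row is None:
        return None  # plain cache miss
    data = json.loads(row[0])
    try:
        return CachedSummoner(
            summoner_id=data["summoner_id"],
            name=data["name"],
            level=data["level"],
        )
    except KeyError:
        # The payload was written with the old object fields:
        # remove the old database and start again with an empty one.
        os.remove(db_path)
        init_cache(db_path)
        return None
```

A `None` result covers both a plain miss and a discarded stale cache, so the caller can fall back to a fresh web request in either case.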

ShoobyDoo commented 2 months ago

The cacher has some serious performance issues after the modifications. I will need to investigate further. A typical search used to take no more than a second or two. It now takes about 4-5 seconds to complete.

Logs reveal several cacher calls, indicating that every single search makes the program call get_all_champs() and then insert the results as well. A very large amount of data is being processed unnecessarily.

Perhaps it's not worth caching the champions? They have dedicated endpoints meant for high-volume traffic, so I could just stop caching anything but the summoner search. The summoner id is the most important because it requires a web request and then a data scrape.

Edit: The performance issue was in the game object's __repr__(). Because the game object no longer includes the champion object as before and instead just has the champion id, I included the following line to build the champion as a convenience:

champion={Utils.get_champion_by(By.ID, self.myData.champion_id)}

Going into get_champion_by(), we see that it queries the ENTIRE champion list EACH time, for every champion in the games list. This is an enormous amount of needless data processing. I believe the solution is to cache the champions list weekly and pull from the cache each time. (This was how it worked before, but I introduced some bugs during the massive changes to the game object and cacher.)
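
To make the weekly-cache idea concrete, here is a minimal in-memory sketch. It assumes the champion list can be fetched by some callable that returns dicts with an `id` key; `ChampionCache`, `fetch_all`, and the injected `clock` are hypothetical names for illustration, not part of the library, and persistence to the cache database is left out.

```python
import time
from typing import Callable, Dict, List, Optional

WEEK_SECONDS = 7 * 24 * 60 * 60


class ChampionCache:
    """Fetch the full champion list at most once per TTL, then look up by id."""

    def __init__(
        self,
        fetch_all: Callable[[], List[dict]],  # hypothetical stand-in for the real fetch
        ttl: float = WEEK_SECONDS,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch_all = fetch_all
        self._ttl = ttl
        self._clock = clock
        self._by_id: Dict[int, dict] = {}
        self._fetched_at: Optional[float] = None

    def _refresh_if_stale(self) -> None:
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            # One full fetch, indexed by id so later lookups are dict reads.
            self._by_id = {champ["id"]: champ for champ in self._fetch_all()}
            self._fetched_at = now

    def get_by_id(self, champion_id: int) -> Optional[dict]:
        """Return the cached champion, or None if the id is unknown."""
        self._refresh_if_stale()
        return self._by_id.get(champion_id)
```

With something like this behind the lookup, building the champion for each game in a __repr__() would cost one dictionary read per game rather than one full champion-list query per game.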