PokeAPI / pokebase

Python 3 wrapper for Pokéapi v2
BSD 3-Clause "New" or "Revised" License

Recent refactor results in significant performance hit #14

Open jrubinator opened 5 years ago

jrubinator commented 5 years ago

I was looking at the current state of the refactor and noticed a significant performance drop-off from the changes.

Namely, once the cache is set, retrieving a Pokémon takes about an order of magnitude longer (jumping from ~0.25 s to ~3 s).

 ~/programming/pokebase (pre-refactor)$ time python3 speed-test-old.py 

real    0m41.041s
user    0m0.796s
sys     0m0.065s
 ~/programming/pokebase (pre-refactor)$ time python3 speed-test-old.py 

real    0m0.241s
user    0m0.217s
sys     0m0.021s
 ~/programming/pokebase (pre-refactor)$ git checkout master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
 ~/programming/pokebase (master)$ time python3 speed-test.py 

real    0m31.404s
user    0m1.572s
sys     0m0.546s
 ~/programming/pokebase (master)$ time python3 speed-test.py 

real    0m3.095s
user    0m0.940s
sys     0m0.482s
 ~/programming/pokebase (master)$ cat speed-test.py 
import pokebase
from pokebase.cache import set_cache

set_cache('speed-test-cache')

pokebase.pokemon('jigglypuff')
 ~/programming/pokebase (master)$ cat speed-test-old.py 
import pokebase
from pokebase.api import set_cache

set_cache('speed-test-cache-old')

pokebase.pokemon('jigglypuff')
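
The `time` runs above measure the whole process, including interpreter startup and imports. To isolate the lookup itself, something like the following could be used. This is only a sketch: `time_calls` is a hypothetical helper, not part of pokebase, and the workload shown is a stand-in that would be replaced with a call such as `lambda: pokebase.pokemon('jigglypuff')`.

```python
import time
from typing import Callable, List


def time_calls(fn: Callable[[], object], repeats: int = 5) -> List[float]:
    """Return the wall-clock duration, in seconds, of each call to fn.

    Hypothetical helper for illustration; not part of pokebase.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in workload (assumption); swap in the real lookup to compare branches.
timings = time_calls(lambda: sum(range(100_000)), repeats=3)
print("per-call seconds:", ["%.4f" % t for t in timings])
```

With a warm cache, the first and later durations should be similar; a large gap between them would point at one-time setup cost rather than per-lookup cost.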

Originally posted by @jrubinator in https://github.com/GregHilmes/pokebase/issues/10#issuecomment-414134798

GregHilmes commented 5 years ago

How very thorough of you, I appreciate this a bunch.

Unfortunately, I didn't finish my rewrite of pokebase before heading off to school, and I currently don't have the time to support it.

I believe I intended to try out a few caching methods. The old JSON tree was too complex, in my opinion. pokebase currently uses the shelve module, which I chose for its simplicity, but as you have pointed out, it is slower. I was already planning to move away from shelve because it behaves differently on different platforms.
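
For readers unfamiliar with shelve, a cache built on it could look roughly like this. It is a minimal sketch and not pokebase's actual cache code: `cached_fetch`, the key format, and the `fetch` callback are all assumptions made for illustration.

```python
import shelve
from typing import Any, Callable


def cached_fetch(cache_path: str, key: str, fetch: Callable[[], Any]) -> Any:
    """Return the value stored under key, calling fetch() only on a miss.

    Hypothetical helper; the shelf is opened and closed on every call
    to keep the sketch short.
    """
    with shelve.open(cache_path) as db:
        if key in db:
            return db[key]   # cache hit: unpickle the stored value
        value = fetch()      # cache miss: do the expensive work once
        db[key] = value      # pickled and written to disk
        return value
```

shelve stores pickled values in whichever dbm backend the platform provides, so the files it creates on disk can differ from one system to another.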

My next idea was to try some form of SQL (MySQL?), but I didn't get the chance. If you're up to it, I'd be very interested in a pull request with a better caching algorithm or on-disk representation.
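
As a starting point for such a pull request, here is a rough sketch of an SQL-backed cache. It uses SQLite through Python's built-in `sqlite3` module rather than MySQL, since that needs no server. The `SqliteCache` class, the table layout, and the choice to store values as JSON text are all assumptions, not an existing pokebase design.

```python
import json
import sqlite3
from typing import Any, Optional


class SqliteCache:
    """Hypothetical key/value cache that stores JSON text in one SQLite table."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        self._conn.commit()

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value for key, or None if it is not cached."""
        row = self._conn.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else json.loads(row[0])

    def set(self, key: str, value: Any) -> None:
        """Insert or overwrite the value for key."""
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
                (key, json.dumps(value)),
            )

    def close(self) -> None:
        self._conn.close()
```

Usage would be along the lines of `cache = SqliteCache('speed-test-cache.db')`, then `cache.get('pokemon/jigglypuff')` before hitting the API and `cache.set(...)` after. Whether this is actually faster than shelve would need to be measured.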