OliverSherouse / wbdata

A python library for accessing world bank data
GNU General Public License v2.0
182 stars 55 forks

feature request: option to turn cache on/off #23

Closed skoeb closed 4 years ago

skoeb commented 5 years ago

Thanks a ton for this package, it helps quite a bit with my work. One request though would be an option to turn the cache feature on/off as a keyword. For example: wb.get_data(cache='off').

I find that when the cache grows quite large (over 2000 recent searches), responses slow down drastically because of the pickle.dump() in fetcher.py. For example, on my Wi-Fi a typical World Bank API call for a single indicator, 20 years, and all countries takes around 2-3 seconds if the cache is clear. If the cache is full, it can take 12-14 seconds.

Additionally, once the cache reaches a certain size (over 3000 or so recent API calls), every API call results in an OSError until the cache is deleted and the kernel is restarted. This is similar to this thread, although for me it's an OSError instead of an EOFError. Running rm 'file/path/to/cache/' fixes it.

My solution to this has been to delete the following line within the fetch function of fetcher.py: CACHE[query_url] = (daycount(), raw_response)

I realize most users aren't making as many API calls as I am (I'm trying to download every World Bank indicator for this project), but it would be nice to be able to turn the cache on and off in API calls.

skoeb commented 5 years ago

I don't have any experience with pull requests, or working on GitHub projects, but if someone's willing to hold my hand I can find some time to work on this. Another option is to just make caching better. It could actually be the pickle.load() that's slowing this down when the cache grows large?
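One way to check whether dumping or loading dominates is to time both on a synthetic cache. The sketch below is only a rough probe under stated assumptions: the helper name `time_pickle_roundtrip`, the fake URLs, and the payload size are invented for illustration, and it pickles in memory rather than to the real cache file, so absolute numbers will differ from what the library sees.

```python
import pickle
import time


def time_pickle_roundtrip(n_entries, payload_size=1000):
    """Return (dump_seconds, load_seconds) for a dict of n_entries fake responses."""
    # Each value mimics a (day count, raw response) pair keyed by query URL.
    cache = {
        f"http://example.invalid/query/{i}": (i, "x" * payload_size)
        for i in range(n_entries)
    }

    start = time.perf_counter()
    blob = pickle.dumps(cache)  # serialize the whole cache at once
    dump_seconds = time.perf_counter() - start

    start = time.perf_counter()
    restored = pickle.loads(blob)  # deserialize the whole cache at once
    load_seconds = time.perf_counter() - start

    assert restored == cache  # sanity check: the round trip is lossless
    return dump_seconds, load_seconds


if __name__ == "__main__":
    for n in (100, 1000, 3000):
        dump_s, load_s = time_pickle_roundtrip(n)
        print(f"{n} entries: dump {dump_s:.4f}s, load {load_s:.4f}s")
```

If both timings grow with the number of entries, any design that rewrites or rereads the entire cache on every API call will slow down as the cache fills, whichever of the two calls is responsible.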

OliverSherouse commented 5 years ago

Yeah, my original cache concept was pretty poorly implemented, and I'm planning to fix it when I finally update. As a workaround, try sticking this at the top of your script:

import wbdata.fetcher

wbdata.fetcher.CACHE.sync = lambda x: pass

That will essentially stop the cache from saving.

BradleyBurrellSD commented 5 years ago

Hello there,

I've hit the same problem, but I can't seem to get your fix to work. I'm working with Python 3 in IntelliJ Community Edition and it's giving me an error after the : saying "expression expected". Any help would be greatly appreciated!
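The "expression expected" message comes from Python's grammar: the body of a lambda must be an expression, and pass is a statement, so lambda x: pass is a syntax error in any Python version. A no-op lambda can return None instead. The sketch below shows the pattern on a hypothetical stand-in class (FakeCache is invented here and is not wbdata's cache); accepting *args and **kwargs is a defensive assumption so the replacement works whether or not the caller passes arguments.

```python
class FakeCache(dict):
    """Hypothetical stand-in for the object at wbdata.fetcher.CACHE."""

    def __init__(self):
        super().__init__()
        self.saves = 0

    def sync(self):
        # The real method would write the cache to disk; here we just count calls.
        self.saves += 1


CACHE = FakeCache()
CACHE.sync()  # original method runs: saves is now 1

# Replace sync on the instance with a function that does nothing.
# `None` is an expression, so this is valid where `pass` is not.
CACHE.sync = lambda *args, **kwargs: None

CACHE.sync()  # no-op: saves stays at 1
```

Applied to the workaround above, the corresponding line would presumably be wbdata.fetcher.CACHE.sync = lambda *args, **kwargs: None, assuming the installed version exposes CACHE.sync as that comment describes.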