john-kurkowski opened 4 years ago
We should make sure that the cache isn't shared between different versions of tldextract, nor between different Python virtualenvs even if they have the same version.
I'm working on a solution but can't promise a timeline.
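One way to scope the cache per version and per environment is to fold both into the cache namespace. This is only a sketch with hypothetical names (tldextract has no `cache_key` helper); `sys.prefix` differs per virtualenv, so hashing it alongside the library version keeps the caches apart:

```python
import hashlib
import sys

def cache_key(tldextract_version: str) -> str:
    # Hypothetical helper: derive a cache namespace from the library
    # version, the interpreter version, and sys.prefix (which is unique
    # per virtualenv), so caches never collide across versions or envs.
    raw = f"{tldextract_version}:{sys.version_info[:2]}:{sys.prefix}"
    return hashlib.md5(raw.encode("utf-8")).hexdigest()
```

Any change to the tldextract version or the active virtualenv then yields a different cache directory name, without the user doing anything.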
Maybe also consider caching the content in a caching engine, such as memcached, Redis, or Elasticsearch?
Better yet: what about making the entire caching engine user-extendable, shipping a filesystem-based engine by default while opening it up to advanced users?
@bastbnl I'd rather see if there is a way to better hide this complexity from the user entirely. On the other hand, if you saw a way to enable that without adding a lot of complexity that could be interesting.
Honestly, I want to be able to provide a directory with a cache file that is an exact copy of the downloaded URLs, and just have it work. I tried to pry apart what is happening with caching, but every time I restart my kernel, the first cache attempt tries to fetch the URLs in the list, and I get an ugly error. I want this bootstrappable: the URL list should be readable from an environment variable, completely hidden from my users. I don't want them to have to figure out how to get the cache file (I will manage that behind our proxy), and I don't want them to have to run tldextract with a different instantiation argument to avoid the error. This is way too complex for managed installations where I am handling things for my users.
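The environment-variable approach could be a one-line lookup at import or construction time. The variable name `TLDEXTRACT_CACHE` and the default path below are illustrative assumptions, not a confirmed part of tldextract's API:

```python
import os

# Assumed default; a real implementation would likely use a
# platform-appropriate user cache directory instead.
DEFAULT_CACHE_DIR = os.path.expanduser("~/.cache/tldextract")

def resolve_cache_dir(env=os.environ) -> str:
    # Hypothetical resolver: an operator sets TLDEXTRACT_CACHE once
    # (e.g. in the managed environment), and end users never pass any
    # instantiation argument.
    return env.get("TLDEXTRACT_CACHE", DEFAULT_CACHE_DIR)
```

An operator behind a proxy would pre-populate that directory with the snapshot and export the variable, and every user process would pick it up transparently.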
Reconsider caching in the library's install folder. The GitHub issue tracker is rife with confusion about the permission warning (#9), or outright uncaught exceptions (#209). Finally do something about it. 😄
Some example approaches:
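One such approach: cache under the user's home directory rather than the library's install folder, so read-only site-packages never triggers the permission warning. The path layout here is illustrative only, with the virtualenv fingerprint folded in per the scoping discussed above:

```python
import hashlib
import sys
from pathlib import Path

def user_cache_dir(version: str) -> Path:
    # Sketch: a per-user, per-version, per-virtualenv cache location.
    # Hashing sys.prefix distinguishes virtualenvs that share a version.
    env_tag = hashlib.sha1(sys.prefix.encode("utf-8")).hexdigest()[:8]
    return Path.home() / ".cache" / "tldextract" / f"{version}-{env_tag}"
```

A library like platformdirs could supply the platform-correct base directory instead of hardcoding `~/.cache`.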