Open tgrandje opened 3 months ago
@hadrilec I think your in detail knowledge of INSEE-specific data is required to understand how long this tmpfiles actually need to last, please see my comment on the associated PR #206
Great, thanks!
functions using the save_df folder should delete the files
so for these, TemporaryFile
, NamedTemporaryFile
, or TemporaryDirectory
should be used in a context manager
data from _get_full_list_wfs should be kept for one python session
so we need an atexit
function (EDIT: no we don't, see below)
well, actually for _get_full_list_wfs it is the same as for the others, as it is only called by get_geodata_list which is using save_df
great, no atexit
, then!
I'll settle for removing the unnecessary tempfiles first :-) I've started it : it might (almost) be easier and (surely) cleaner than tracking the tempfiles which are opened in nested functions (and could therefore not really be handled through context managers).
I'm not sure about _create_insee_folder. As far as I can see, this is to prevent an exception when permissions are currently too low to create an appdata folder. In that case, I'd say the tempdir should stay, and the lru_cache on _get_temp_dir() prevents creating multiple tempdir during a session.
What about cleaning that?
Doesn't platformdirs basically guarantee that we'll have the rights to create the folder?
I wonder whether everything could not be replaced by platformdirs.user_cache_dir("pynsee", ensure_exists=True)
BTW: is there a reason why it's using a "pynsee" forlder inside the initial "pynsee" folder or should we clean that up?
Doesn't platformdirs basically guarantee that we'll have the rights to create the folder?
I sure hope so, but I'm no expert on that. If there wasn't that comment, I would already have gone for the platformdirs.user_cache_dir("pynsee", ensure_exists=True)
...
@hadrilec does that reminds you of anything?
EDIT: If we move to rely strictly on platformdirs, we can also get rid of the _get_temp_dir.py file: that call is the last one (I got rid of all the other to favor in-memory files instead of tempfiles).
if platformdirs works as expected _get_temp_dir will not be used. As it is safe this way, I would prefer not to remove it because I dont want to debbug in case platformsdirs fails.
BTW: is there a reason why it's using a "pynsee" forlder inside the initial "pynsee" folder or should we clean that up?
I suspect that's a typo. But if we don't want to break (too much) cache when upgrading pynsee, we should keep it that way.
this is not a typo, back then this folder was in different path location it was more tidy this way. I would prefer not to change this feature.
When running the tests for my actual PR, i've remarked pynsee is not (always ?) cleaning temporary files after it's finished. As a matter of fact, this has just filled the available space I still got on my windows/WSL partition.
For instance, that tmpfile is never cleaned afterwards.