Closed michaelkain closed 1 month ago
Why not have the 'import' service delete the files immediately after having dealt with them?
@a-ba That is how it is supposed to work in theory, but it does not in this case. We have this logic in the datasets MS, where most of the import problems occur, but we need it in the import MS too. I'll do this. And I'll add the 3-hour deletion as a safeguard too. Jean-Côme
ah ok 'import' somehow has to hand over the responsibility of the file to the 'datasets' service
regarding the cleanup of old import files, I have just realised that we already have a tmpfiles.d config in production (though the timeout is set to 8 hours)
d {{ shanoir_tmp_dir }} 1777 root root 8h
I also noticed that this purges all other tmp files as well (like the tomcat tmp dirs), which is probably not a good idea. So I have just replaced it with:
d {{ shanoir_tmp_dir }} 1777 root root 1d
e {{ shanoir_tmp_dir }}/[0-9]* - - - 3h
to aggressively delete only the tmp imports (which always start with a digit).
Ok, thanks Anthony! I'll make a PR to clean /tmp/import_folder after an error occurred during import.
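For illustration only, a minimal sketch of the cleanup-on-error idea. It is written in Python for brevity and is not the actual import MS code; `run_import` and `process` are hypothetical names. On success the folder is left in place, on the assumption that the 'datasets' service takes over responsibility for it.

```python
import shutil
from pathlib import Path
from typing import Callable


def run_import(import_dir: Path, process: Callable[[Path], None]) -> bool:
    """Run `process` on import_dir and delete the folder if processing fails.

    Returns True on success (folder kept), False on error (folder removed).
    `process` is a hypothetical stand-in for the real import logic.
    """
    try:
        process(import_dir)
        return True
    except Exception:
        # After a failure nobody takes over the folder, so remove it here.
        shutil.rmtree(import_dir, ignore_errors=True)
        return False
```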
Dear Jean-Côme, this is a follow-up to Anthony's mail from today. I think it would be great to add a job to the import MS that runs every hour, checks for each user whether there is an import folder older than 3 hours, and deletes it right away. To avoid deleting ongoing imports, we could add a "semaphore" file when an import starts, or something similar. With kind regards, Michael
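A rough sketch of what such a job could look like, in Python for brevity rather than as the actual import MS implementation. The layout `<root>/<user>/<import folder>`, the semaphore file name `.import_in_progress`, and the function name `purge_stale_imports` are all assumptions made for the example; scheduling (once per hour) is left to whatever scheduler the service already uses.

```python
import shutil
import time
from pathlib import Path
from typing import List, Optional

MAX_AGE_SECONDS = 3 * 60 * 60      # 3 hours, as proposed above
LOCK_NAME = ".import_in_progress"  # hypothetical semaphore file name


def purge_stale_imports(root: Path, now: Optional[float] = None) -> List[Path]:
    """Delete stale import folders under root/<user>/ and return their paths.

    A folder is skipped if it contains the semaphore file (import still
    running) or if its modification time is less than MAX_AGE_SECONDS old.
    """
    now = time.time() if now is None else now
    removed: List[Path] = []
    for user_dir in root.iterdir():
        if not user_dir.is_dir():
            continue
        for import_dir in user_dir.iterdir():
            if not import_dir.is_dir():
                continue
            if (import_dir / LOCK_NAME).exists():
                continue  # an import is in progress, leave it alone
            if now - import_dir.stat().st_mtime > MAX_AGE_SECONDS:
                shutil.rmtree(import_dir, ignore_errors=True)
                removed.append(import_dir)
    return removed
```

In this sketch, an import would create the semaphore file when it starts and remove it when it finishes or fails. A semaphore left behind by a crashed import would block deletion indefinitely, so a separate upper bound on age would still be needed.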