Problem
The optional remove_extra feature has been broken for a while because of several changes to file naming conventions over the last year.
Solution
Adjusted the behaviour of the optional remove_extra feature of pull() so that it removes any files and directories that would not be overwritten by the release currently being downloaded. In effect this acts as a wipe of the entire images directory. However, it cannot be implemented as a literal wipe, because any difficulty downloading a file in the current release would then leave that file missing. Therefore, we:
1: Build a list of dataset file paths that would result from downloading the release
2: Remove any existing file that is not present in the above list
3: Since the above can result in empty local dataset directories, scan and remove empty directories from the images dataset directory
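The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name remove_extra_files and its parameters are hypothetical, and step 1 is assumed to have already produced expected_paths, the list of file paths the release would write.

```python
from pathlib import Path
from typing import Iterable, Set


def remove_extra_files(images_dir: Path, expected_paths: Iterable[Path]) -> None:
    """Hypothetical helper: prune everything the release would not overwrite."""
    # Step 1 (assumed done by the caller): the paths the release would produce.
    expected: Set[Path] = {Path(p).resolve() for p in expected_paths}

    # Step 2: remove any existing file that is not in the expected set.
    # The listing is materialised first so deletion does not disturb iteration.
    for path in list(images_dir.rglob("*")):
        if path.is_file() and path.resolve() not in expected:
            path.unlink()

    # Step 3: remove directories left empty, deepest first, so that a parent
    # emptied by the removal of its children is also removed.
    directories = [p for p in images_dir.rglob("*") if p.is_dir()]
    for directory in sorted(directories, key=lambda p: len(p.parts), reverse=True):
        if not any(directory.iterdir()):
            directory.rmdir()
```

Because files that the release will overwrite are never deleted up front, a failed download leaves the previous copy in place rather than a gap.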
Changelog
Fixed logic of the remove_extra feature when pulling dataset releases