Where many folders (1000s) sit under a common parent folder, they are processed into zips individually.
This is OK if the contents are 100s of MiB, but very slow if contents are 10s of KiB, which is common.
These should be collated into a single zip one folder up the tree, meaning 1 upload instead of 1000s.
Look at lines 872 to 1040 in lsst-backup.py - extend to_collate dict to include total_size for each parent_folder and parent_parent_folder - this will allow parent_folders with common parent_parent_folders, that are small, to be aggregated.
Where many folders (1000s) sit under a common parent folder, they are processed into zips individually. This is OK if the contents are 100s of MiB, but very slow if contents are 10s of KiB, which is common. These should be collated into a single zip one folder up the tree, meaning 1 upload instead of 1000s.
Look at lines 872 to 1040 in lsst-backup.py - extend
to_collate
dict to includetotal_size
for eachparent_folder
andparent_parent_folder
- this will allowparent_folder
s with commonparent_parent_folder
s, that are small, to be aggregated.