EnterpriseDB / barman

Barman - Backup and Recovery Manager for PostgreSQL
https://www.pgbarman.org/
GNU General Public License v3.0

Remove full-depth traversal - only list items in the requested 'directory' #871

Closed by sjuls 10 months ago

sjuls commented 10 months ago

Currently, the Azure container storage implementation of list_bucket does a depth-first traversal of the blob directory, while the AWS S3 implementation only lists the items in the target directory. The full traversal can be VERY time-consuming when a lot of WAL segments are archived in the bucket, which degrades recovery operations.

We are currently experiencing 10-minute download times for a 300-byte .history file because the tree walk lists all WAL segments on all timelines.

This PR removes the recursive calls and only returns first-level content, similar to the AWS S3 implementation.
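To illustrate the difference, here is a minimal sketch and not the actual Barman code. It assumes the listing goes through the Azure SDK's `ContainerClient.walk_blobs(name_starts_with=..., delimiter=...)`, which yields items that expose a `name`, with virtual directories ending in the delimiter. `FakeContainerClient`, `FakeItem`, `list_first_level`, and `list_full_depth` are hypothetical names used only so the example runs without an Azure account.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FakeItem:
    """Hypothetical stand-in for the items the SDK yields (blob or prefix)."""
    name: str


class FakeContainerClient:
    """Hypothetical in-memory stand-in for azure.storage.blob.ContainerClient."""

    def __init__(self, blob_names: Iterable[str]):
        self.blob_names = sorted(blob_names)
        self.calls = 0  # number of listing requests issued

    def walk_blobs(self, name_starts_with: str = "", delimiter: str = "/"):
        """Return only the entries directly below the given prefix."""
        self.calls += 1
        entries = []
        for name in self.blob_names:
            if not name.startswith(name_starts_with):
                continue
            rest = name[len(name_starts_with):]
            head, sep, _ = rest.partition(delimiter)
            entry = name_starts_with + head + sep  # sep is "" for a plain blob
            if entry not in entries:
                entries.append(entry)
        return iter(FakeItem(e) for e in entries)


def list_first_level(client, prefix: str = "", delimiter: str = "/") -> Iterator[str]:
    """One listing request: names of blobs and sub-prefixes under `prefix`."""
    for item in client.walk_blobs(name_starts_with=prefix, delimiter=delimiter):
        yield item.name


def list_full_depth(client, prefix: str = "", delimiter: str = "/") -> Iterator[str]:
    """Recursive walk: one extra listing request per sub-prefix encountered."""
    for item in client.walk_blobs(name_starts_with=prefix, delimiter=delimiter):
        if item.name.endswith(delimiter):
            yield from list_full_depth(client, item.name, delimiter)
        else:
            yield item.name


if __name__ == "__main__":
    blobs = [
        "server/wals/00000002.history",
        "server/wals/0000000100000000/000000010000000000000001",
        "server/wals/0000000100000000/000000010000000000000002",
        "server/wals/0000000200000000/000000020000000000000003",
    ]
    shallow = FakeContainerClient(blobs)
    print(list(list_first_level(shallow, "server/wals/")), shallow.calls)
    deep = FakeContainerClient(blobs)
    print(list(list_full_depth(deep, "server/wals/")), deep.calls)
```

In this sketch the first-level listing issues a single request regardless of how many WAL segments sit in the sub-directories, while the recursive walk issues one more request per sub-directory and enumerates every segment before the .history file can be located.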

sjuls commented 10 months ago

Hi @mikewallace1979,

Any chance I can get a review of this PR? This should bring the list_bucket implementation for Azure Blob Storage in line with the AWS S3 implementation, so it should be safe.

Happy to make any change you deem necessary.

mikewallace1979 commented 10 months ago

Hi @sjuls - thanks for doing the analysis and proposing a fix, we will take a closer look this week.

sjuls commented 10 months ago

@mikewallace1979 Ah right, good catch. I've added a commit to remove the import.

Thanks for the review 🙏