Closed chadq closed 7 years ago
I found this because the directory I was checking had only a handful of files, but the root bucket had thousands, which caused the plugin to wait far too long.
I verified that the check works as expected if no bucketfolder is provided.
Oh, yep. I see what you are talking about. It unnecessarily returns all the files and then only checks the ones that match the regex. Filtering so we return just the files we care about seems like a good idea. Testing it now.
I had a bucket with 19406 files in it, spread across several subdirectories. I used the bucketfolder option to limit the check to a directory with only 8 files, and added some temporary code to count the number of keys examined. The C value is the count before the re.match and the D value is the count after.
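The temporary counters described above might look something like the following sketch (names are illustrative, not the plugin's actual code; the real plugin iterates keys returned by boto's bucket.list()):

```python
import re

def count_keys(all_keys, folder_regex):
    """Count keys examined (C) vs. keys surviving the regex filter (D)."""
    c = 0  # C value: every key the loop touches
    d = 0  # D value: keys matching the bucketfolder regex
    for key in all_keys:
        c += 1
        if re.match(folder_regex, key):
            d += 1
    return c, d

# Simulated bucket contents: many root-level keys, a few in the subdirectory.
keys = ["root-%d.log" % i for i in range(100)] + \
       ["reports/%d.log" % i for i in range(8)]
print(count_keys(keys, r"^reports/"))  # → (108, 8): all keys examined, 8 matched
```

This illustrates the problem: C counts every key in the bucket even though D, the number of keys actually checked, is tiny.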
BEFORE
C VAL: 19406
D VAL: 8
CRITICAL: S3 files exceed time boundaries. - MIN:1hrs MAX:2550hrs - Files exceeding MAX time: 0 - Files meeting MIN time: 0 - Total file count: 8
real 0m4.077s
user 0m1.476s
sys 0m0.084s
AFTER
C VAL: 8
D VAL: 8
CRITICAL: S3 files exceed time boundaries. - MIN:1hrs MAX:2550hrs - Files exceeding MAX time: 0 - Files meeting MIN time: 0 - Total file count: 8
real 0m0.402s
user 0m0.040s
sys 0m0.048s
Your change prevents the unnecessary looping and gives quite a speed improvement. I also verified that the check still works as expected when no bucketfolder option is used.
Previously we were looping through bucket.list() regardless of whether you provided a subdirectory to check. bucket.list() returns all the files in the root bucket; if a subdirectory is provided, we only want to check those files.
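A minimal sketch of the fix, assuming boto's Bucket.list() is used: passing the subdirectory as the prefix argument makes S3 filter keys server-side, so the plugin never iterates the rest of the bucket. The helper name and the FakeBucket stand-in below are illustrative, not the plugin's exact code:

```python
def list_keys(bucket, bucketfolder=None):
    """Return only the keys we care about; S3 filters by prefix server-side."""
    if bucketfolder:
        return bucket.list(prefix=bucketfolder)
    return bucket.list()

# Stand-in for a boto Bucket, just enough to demonstrate the behavior;
# real S3 honors `prefix` on the server, which we mimic locally here.
class FakeBucket:
    def __init__(self, keys):
        self.keys = keys

    def list(self, prefix=""):
        return [k for k in self.keys if k.startswith(prefix)]

bucket = FakeBucket(["root.log", "reports/a.log", "reports/b.log"])
print(list(list_keys(bucket, "reports/")))  # → ['reports/a.log', 'reports/b.log']
print(len(list(list_keys(bucket))))         # → 3
```

With the prefix in place, the loop only ever sees the subdirectory's keys, which matches the C VAL dropping from 19406 to 8 in the numbers above.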