confetti-clj / s3-deploy

Simple utility functions to diff and sync local files with S3 buckets
Mozilla Public License 2.0

Consider retrieving object summaries individually #10

Open martinklepsch opened 8 years ago

martinklepsch commented 8 years ago

Currently we download object summaries for the whole bucket instead of getting each object's data with an individual request.

  1. This is faster when we need sync information for a lot of objects.
  2. If the number of objects in the bucket grows (into the thousands), this gets slower.
  3. If the number of objects to sync is small, retrieving their data individually might be faster.
  4. Retrieving objects individually will not be enough when pruning the bucket.
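
The trade-off in points 1 to 4 can be made concrete by counting requests. The following is a rough sketch in Python, not s3-deploy code. It assumes a bucket listing returns at most 1000 object summaries per request (S3's page limit) and that an individual lookup costs one request per local file. All function names are hypothetical.

```python
import math

LIST_PAGE_SIZE = 1000  # S3 returns at most 1000 keys per list request


def list_requests(bucket_objects: int) -> int:
    """Requests needed to download summaries for the whole bucket."""
    # Even an empty bucket costs one list request.
    return max(1, math.ceil(bucket_objects / LIST_PAGE_SIZE))


def individual_requests(files_to_sync: int) -> int:
    """Requests needed when each local file is looked up on its own."""
    return files_to_sync


def cheaper_strategy(bucket_objects: int, files_to_sync: int, prune: bool = False) -> str:
    """Pick the strategy with fewer requests.

    Pruning needs every remote key, so it always requires the full listing.
    """
    if prune:
        return "list"
    if individual_requests(files_to_sync) < list_requests(bucket_objects):
        return "individual"
    return "list"
```

This counts requests only. It ignores per-request latency and any parallelism, so it is a starting point for measurement rather than a decision rule.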

Getting each object's data individually, as mentioned in 3, would also allow diffing and syncing of metadata.
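
Object summaries from a listing carry only fields such as key, size, ETag and last-modified, so a metadata diff needs the per-object data. Below is a minimal sketch of such a diff in Python. It assumes metadata is available as plain dictionaries, and the field names in the usage example are illustrative only. `diff_metadata` is a hypothetical helper, not part of s3-deploy.

```python
def diff_metadata(local: dict, remote: dict) -> dict:
    """Return {field: (local_value, remote_value)} for every differing field."""
    changed = {}
    # Walk the union of fields so additions and removals are both reported.
    for field in sorted(set(local) | set(remote)):
        if local.get(field) != remote.get(field):
            changed[field] = (local.get(field), remote.get(field))
    return changed


# Illustrative values: the remote object is missing a cache header.
wanted = {"content-type": "text/html", "cache-control": "max-age=60"}
actual = {"content-type": "text/html"}
changes = diff_metadata(wanted, actual)
```

An empty result would mean the object's metadata is already in sync and only the content needs comparing.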

There is no clear right way in this case. I see the following options:

I think adding logic is opaque and might confuse users, so I think the latter option is best.

@podviaznikov any opinion to offer?

/via #9

podviaznikov commented 8 years ago

I wonder what the time difference would be for, say, 100 objects? You can send individual requests in parallel, right?
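
Individual requests can indeed be issued concurrently, for example with a thread pool. This is a minimal sketch in Python, assuming each lookup is an independent blocking call. `fetch_all` and `fake_head` are hypothetical names, and `fake_head` is only a stand-in for a real per-object metadata request.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(keys, fetch_one, max_workers=16):
    """Call fetch_one(key) for every key concurrently; return {key: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each key with its result.
        return dict(zip(keys, pool.map(fetch_one, keys)))


def fake_head(key):
    # Hypothetical stand-in for one metadata request to S3.
    return {"key": key, "size": len(key)}


results = fetch_all(["index.html", "app.js"], fake_head)
```

Whether this beats a single listing for around 100 objects depends on request latency and the pool size, so it would need to be measured.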

martinklepsch commented 8 years ago

Will need to check that.

martinklepsch commented 8 years ago

400 objects w/o any parallel processing:

martinklepsch commented 8 years ago

I did some very basic work towards this in the individual-diff branch. I'll release a 0.1.0 without it now and then we can cut a release with this later when it's done.