izaakschroeder / vinyl-s3

Use S3 as a source or destination of vinyl files.

s3.src('mybucket', {read:false}); could be faster #23

Open jamestalmage opened 9 years ago

jamestalmage commented 9 years ago

I tried using vinyl-s3 to fetch metadata on existing objects in my bucket. I see the {read: false} option discussed, but things still move pretty slow.

I ended up using the aws-sdk directly and doing

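var AWS = require('aws-sdk');
var s3 = new AWS.S3();
// Without a callback the request is only built, not sent.
s3.listObjects({ Bucket: 'myBucket' }, function (err, data) {
  if (err) throw err;
  console.log(data.Contents); // one entry per object in the bucket
});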

This returned the same metadata more than 100x faster. Any reason for this?

izaakschroeder commented 9 years ago

headObject gets called first to collect information about each file (e.g. contentType, length, stat.lastModified): https://github.com/izaakschroeder/vinyl-s3/blob/master/lib/read-stream.js#L76 listObjects doesn't return all of that information, or at least it didn't when I wrote this library. If that's changed then it should be possible to remove that call.
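
To make the difference in request counts concrete, here is a minimal sketch, not vinyl-s3's actual code. It assumes an AWS SDK v2 style client with callback-based listObjects and headObject; the helper names listOnly and listThenHead are hypothetical.

```javascript
// Sketch only: `s3` is any object exposing AWS SDK v2 style
// listObjects(params, cb) and headObject(params, cb).
// Pagination is ignored: one listObjects call returns at most 1000 keys.

// One request: each entry carries Key, LastModified, ETag and Size.
function listOnly(s3, bucket, done) {
  s3.listObjects({ Bucket: bucket }, function (err, data) {
    if (err) return done(err);
    done(null, data.Contents);
  });
}

// 1 + N requests: one listing, then one headObject per key
// (needed for fields such as ContentType).
function listThenHead(s3, bucket, done) {
  listOnly(s3, bucket, function (err, entries) {
    if (err) return done(err);
    if (entries.length === 0) return done(null, []);
    var results = new Array(entries.length);
    var pending = entries.length;
    var failed = false;
    entries.forEach(function (entry, i) {
      s3.headObject({ Bucket: bucket, Key: entry.Key }, function (err, head) {
        if (failed) return;
        if (err) { failed = true; return done(err); }
        results[i] = { Key: entry.Key, ContentType: head.ContentType };
        if (--pending === 0) done(null, results);
      });
    });
  });
}
```

With N objects in the bucket, the second helper issues N extra round trips, which would be consistent with the slowdown reported above.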

jamestalmage commented 9 years ago

From this example it looks like you should be able to get length, stat.lastModified, and ETag info.

Not sure if it returns contentType; I will check.

jamestalmage commented 9 years ago

Unfortunately contentType is not available.

Here is the typical response I see (and yes, content type is set):

{ 
  Key: 'lib/js/ui-bootstrap.js',
  LastModified: Wed Sep 02 2015 18:34:58 GMT-0400 (EDT),
  ETag: '"2e06f6e991ca39d9e13cce26c8d55fd6"',
  Size: 14452,
  StorageClass: 'STANDARD' 
}

izaakschroeder commented 9 years ago

You can try this: https://github.com/izaakschroeder/vinyl-s3/pull/25 I still have to test it a little more thoroughly, but feel free to play with it.

jamestalmage commented 9 years ago

LGTM.

My only thought would be some way to still populate the metadata that's quickly available via listObjects (lastModified, size, and ETag). One option would be to automatically populate those regardless of options, but I'm guessing the intent was to avoid polluting the file object when meta is false.
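
As a rough sketch of that idea, a listObjects entry could be mapped onto a small metadata object like this. The function name metaFromListing and the output property names are hypothetical, not anything vinyl-s3 defines; the input shape follows the listObjects response shown earlier in this thread.

```javascript
// Sketch only: picks the cheaply available fields out of one
// entry of a listObjects response (data.Contents[i]).
function metaFromListing(entry) {
  return {
    size: entry.Size,
    lastModified: entry.LastModified,
    // S3 returns the ETag wrapped in double quotes; strip them.
    etag: entry.ETag.replace(/^"|"$/g, '')
  };
}

var meta = metaFromListing({
  Key: 'lib/js/ui-bootstrap.js',
  LastModified: new Date('2015-09-02T22:34:58Z'),
  ETag: '"2e06f6e991ca39d9e13cce26c8d55fd6"',
  Size: 14452,
  StorageClass: 'STANDARD'
});
```

Whether these fields belong directly on the file object or under a single namespaced property is the open design question here.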

izaakschroeder commented 9 years ago

Hm yes. I may revisit that, since there aren't many reasons for not having that data there (other than it being non-standard).

jamestalmage commented 9 years ago

Only potential problem I see is namespace collision with some unknown other plugin.

I think that's really unlikely, though.