andrewrk / node-s3-client

high level amazon s3 client for node.js
MIT License
1k stars 303 forks source link

High Level Amazon S3 Client

Installation

npm install s3 --save

Features

See also the companion CLI tool which is meant to be a drop-in replacement for s3cmd: s3-cli.

Synopsis

Create a client

var s3 = require('s3');

var client = s3.createClient({
  maxAsyncS3: 20,     // this is the default
  s3RetryCount: 3,    // this is the default
  s3RetryDelay: 1000, // this is the default
  multipartUploadThreshold: 20971520, // this is the default (20 MB)
  multipartUploadSize: 15728640, // this is the default (15 MB)
  s3Options: {
    accessKeyId: "your s3 key",
    secretAccessKey: "your s3 secret",
    region: "your region",
    // endpoint: 's3.yourdomain.com',
    // sslEnabled: false
    // any other options are passed to new AWS.S3()
    // See: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Config.html#constructor-property
  },
});

Create a client from existing AWS.S3 object

var s3 = require('s3');
var awsS3Client = new AWS.S3(s3Options);
var options = {
  s3Client: awsS3Client,
  // more options available. See API docs below.
};
var client = s3.createClient(options);

Upload a file to S3

var params = {
  localFile: "some/local/file",

  s3Params: {
    Bucket: "s3 bucket name",
    Key: "some/remote/file",
    // other options supported by putObject, except Body and ContentLength.
    // See: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property
  },
};
var uploader = client.uploadFile(params);
uploader.on('error', function(err) {
  console.error("unable to upload:", err.stack);
});
uploader.on('progress', function() {
  console.log("progress", uploader.progressMd5Amount,
            uploader.progressAmount, uploader.progressTotal);
});
uploader.on('end', function() {
  console.log("done uploading");
});

Download a file from S3

var params = {
  localFile: "some/local/file",

  s3Params: {
    Bucket: "s3 bucket name",
    Key: "some/remote/file",
    // other options supported by getObject
    // See: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getObject-property
  },
};
var downloader = client.downloadFile(params);
downloader.on('error', function(err) {
  console.error("unable to download:", err.stack);
});
downloader.on('progress', function() {
  console.log("progress", downloader.progressAmount, downloader.progressTotal);
});
downloader.on('end', function() {
  console.log("done downloading");
});

Sync a directory to S3

var params = {
  localDir: "some/local/dir",
  deleteRemoved: true, // default false, whether to remove s3 objects
                       // that have no corresponding local file.

  s3Params: {
    Bucket: "s3 bucket name",
    Prefix: "some/remote/dir/",
    // other options supported by putObject, except Body and ContentLength.
    // See: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property
  },
};
var uploader = client.uploadDir(params);
uploader.on('error', function(err) {
  console.error("unable to sync:", err.stack);
});
uploader.on('progress', function() {
  console.log("progress", uploader.progressAmount, uploader.progressTotal);
});
uploader.on('end', function() {
  console.log("done uploading");
});

Tips

API Documentation

s3.AWS

This contains a reference to the aws-sdk module. It is a valid use case to use both this module and the lower level aws-sdk module in tandem.

s3.createClient(options)

Creates an S3 client.

options:

s3.getPublicUrl(bucket, key, [bucketLocation])

You can find out your bucket location programatically by using this API: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getBucketLocation-property

returns a string which looks like this:

https://s3.amazonaws.com/bucket/key

or maybe this if you are not in US Standard:

https://s3-eu-west-1.amazonaws.com/bucket/key

s3.getPublicUrlHttp(bucket, key)

Works for any region, and returns a string which looks like this:

http://bucket.s3.amazonaws.com/key

client.uploadFile(params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property

params:

The difference between using AWS SDK putObject and this one:

Returns an EventEmitter with these properties:

And these events:

And these methods:

client.downloadFile(params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getObject-property

params:

The difference between using AWS SDK getObject and this one:

Returns an EventEmitter with these properties:

And these events:

client.downloadBuffer(s3Params)

http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getObject-property

The difference between using AWS SDK getObject and this one:

Returns an EventEmitter with these properties:

And these events:

client.downloadStream(s3Params)

http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getObject-property

The difference between using AWS SDK getObject and this one:

If you want retries, progress, or MD5 checking, you must code it yourself.

Returns a ReadableStream with these additional events:

client.listObjects(params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#listObjects-property

params:

Note that if you set Delimiter in s3Params then you will get a list of objects and folders in the directory you specify. You probably do not want to set recursive to true at the same time as specifying a Delimiter because this will cause a request per directory. If you want all objects that share a prefix, leave the Delimiter option null or undefined.

Be sure that s3Params.Prefix ends with a trailing slash (/) unless you are requesting the top-level listing, in which case s3Params.Prefix should be empty string.

The difference between using AWS SDK listObjects and this one:

Returns an EventEmitter with these properties:

And these events:

And these methods:

client.deleteObjects(s3Params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#deleteObjects-property

s3Params are the same.

The difference between using AWS SDK deleteObjects and this one:

Returns an EventEmitter with these properties:

And these events:

client.uploadDir(params)

Syncs an entire directory to S3.

params:

function getS3Params(localFile, stat, callback) {
  // call callback like this:
  var err = new Error(...); // only if there is an error
  var s3Params = { // if there is no error
    ContentType: getMimeType(localFile), // just an example
  };
  // pass `null` for `s3Params` if you want to skip uploading this file.
  callback(err, s3Params);
}

Returns an EventEmitter with these properties:

And these events:

uploadDir works like this:

  1. Start listing all S3 objects for the target Prefix. S3 guarantees returned objects to be in sorted order.
  2. Meanwhile, recursively find all files in localDir.
  3. Once all local files are found, we sort them (the same way that S3 sorts).
  4. Next we iterate over the sorted local file list one at a time, computing MD5 sums.
  5. Now S3 object listing and MD5 sum computing are happening in parallel. As each operation progresses we compare both sorted lists side-by-side, iterating over them one at a time, uploading files whose MD5 sums don't match the remote object (or the remote object is missing), and, if deleteRemoved is set, deleting remote objects whose corresponding local files are missing.

client.downloadDir(params)

Syncs an entire directory from S3.

params:

function getS3Params(localFile, s3Object, callback) {
  // localFile is the destination path where the object will be written to
  // s3Object is same as one element in the `Contents` array from here:
  // http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#listObjects-property

  // call callback like this:
  var err = new Error(...); // only if there is an error
  var s3Params = { // if there is no error
    VersionId: "abcd", // just an example
  };
  // pass `null` for `s3Params` if you want to skip downloading this object.
  callback(err, s3Params);
}

Returns an EventEmitter with these properties:

And these events:

downloadDir works like this:

  1. Start listing all S3 objects for the target Prefix. S3 guarantees returned objects to be in sorted order.
  2. Meanwhile, recursively find all files in localDir.
  3. Once all local files are found, we sort them (the same way that S3 sorts).
  4. Next we iterate over the sorted local file list one at a time, computing MD5 sums.
  5. Now S3 object listing and MD5 sum computing are happening in parallel. As each operation progresses we compare both sorted lists side-by-side, iterating over them one at a time, downloading objects whose MD5 sums don't match the local file (or the local file is missing), and, if deleteRemoved is set, deleting local files whose corresponding objects are missing.

client.deleteDir(s3Params)

Deletes an entire directory on S3.

s3Params:

Returns an EventEmitter with these properties:

And these events:

deleteDir works like this:

  1. Start listing all objects in a bucket recursively. S3 returns 1000 objects per response.
  2. For each response that comes back with a list of objects in the bucket, immediately send a delete request for all of them.

client.copyObject(s3Params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#copyObject-property

s3Params are the same. Don't forget that CopySource must contain the source bucket name as well as the source key name.

The difference between using AWS SDK copyObject and this one:

Returns an EventEmitter with these events:

client.moveObject(s3Params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#copyObject-property

s3Params are the same. Don't forget that CopySource must contain the source bucket name as well as the source key name.

Under the hood, this uses copyObject and then deleteObjects only if the copy succeeded.

Returns an EventEmitter with these events:

Examples

Check if a file exists in S3

Using the AWS SDK, you can send a HEAD request, which will tell you if a file exists at Key.

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#headObject-property

var client = require('s3').createClient({ /* options */ });
client.s3.headObject({
  Bucket: 's3 bucket name',
  Key: 'some/remote/file'
}, function(err, data) {
  if (err) {
    // file does not exist (err.statusCode == 404)
    return;
  }
  // file exists
});

Testing

S3_KEY=<valid_s3_key> S3_SECRET=<valid_s3_secret> S3_BUCKET=<valid_s3_bucket> npm test

Tests upload and download large amounts of data to and from S3. The test timeout is set to 40 seconds because Internet connectivity waries wildly.