mapbox / dyno

simple dynamodb client
MIT License
78 stars 31 forks source link

Support custom AWS.Config instance via awsConfig option #135

Open knksmith57 opened 7 years ago

knksmith57 commented 7 years ago

First off, awesome work on this module. Y'all have written some high quality, well documented code and I'm excited to use it. Thank you very much for this excellent OSS contribution.

This PR adds support for a new option.awsConfig property and fulfills the AC from #133. It's a non-breaking change and was coded that way on purpose (more on this below).

In the future, I think it'd be best (and cleaner) to update the Dyno factory signature to make supplying AWS.Config a first-class citizen. The current implementation is a mask over all the available properties supported by AWS.DynamoDB with an additional, top-level table property and a required region property. This is problematic for 2 reasons:

  1. AWS might push out a new SDK at any time with new properties that developers might want access to tune and the current implementation wouldn't immediately support that.
  2. region shouldn't be required as an option to Dyno. The aws-js-sdk supports setting the region in a number of different ways (eg: via AWS_REGION environment variable), so there's no need (that I'm aware of?) to enforce it in Dyno.

I think a better signature would be Dyno(String table, AWS.Config config). It is possible to roll this as a non-breaking change by doing a # args / type check, but it feels dirty. Open to suggestions here. Depending on if y'all are willing to make a breaking API change, this may be something that would be useful to implement now with the type checks, deprecate the existing API, then hard break at the major bump. Just my $0.02.

Please let me know what you think! We'd really love to start using this but need fine-grain control over the SDK.

Thanks!

knksmith57 commented 7 years ago

@mcwhittemore / @rclark cool if I assign one of you (or both) as a reviewer + assignee for next steps? Thanks!

rclark commented 7 years ago

Thanks for the PR and for the very thoughtful comments!

I think a better signature would be Dyno(String table, AWS.Config config).

I tend to agree with you here. And if that's a direction that we're comfortable with, I would rather just jump on the breaking API change than role a minor bump slapping additional options in.

Could the second arg just be the options object that gets handed to AWS.Config constructor, instead of requiring the caller to instantiate?

knksmith57 commented 7 years ago

Stoked that you are on board with rolling an alternative signature (and thanks for the quick reply!).

It came to my attention while reviewing the codebase that there are 2 additional Dyno-specific parameters that can be specified via options: read, and write.

My gut says that, with this complication, it might be more appropriate to preserve the option hash but make it much more restrictive and require all AWS properties to be provided in an AWS.Config options hash.

Something like:

Dyno(table, {read, write, awsConfig = new AWS.Config()} = {})  

There's probably no harm in simply allowing devs to attach the read and write options to an AWS.Config instance (those properties aren't currently reserved in the AWS.DynamoDB constructor options and it'd be very uncharacteristic) but could be unnecessary risk.

Allowing an optional 3rd arguments is also a possibility:

Dyno(String table, AWS.Config config, [Object options])

Bottom line, anything that allows devs to provide the full set of AWS.Config options while preserving existing read / write-only behavior is my goal, and I'm happy to make the PR changes necessary to match your preference.

Thanks again!

mcwhittemore commented 7 years ago

If we're going to break this signature, I think we should talk about making table not required. There are a fair number of times where I want to use Dyno on many or zero tables. To get around this currently, I just toss, 'test' in for the table argument, but this always feels messy.

knksmith57 commented 7 years ago

I didn't even realize that was a use case. IOW, you throw table in as a param on each request?

Yeah, if table isn't strictly required to call the Dyno factory, I agree that it should definitely be optional.

Dyno(Object options, [AWS.Config config]) is looking like a winner now.

rclark commented 7 years ago

OTOH my experience is that 90% of the time I use dyno, I'm operating against just one table. This was the motivation for adding the simple table parameter, and requiring it simplifies thinking in some applications.

@knksmith57 I am still curious why we would ever ask the user to provide a configured AWS.Config instance instead of just a vanilla options object?

knksmith57 commented 7 years ago

@rclark there's really no point in enforcing an actual instance of AWS.Config be provided. My concern is that devs should understand that the config (whether pojo or AWS.Config instance) will be proxied to the DynamoDB constructor and that the constructor accepts an options hash that is shaped by the AWS.Config class.

The only differences that I see by introspecting the instance are some internal class definitions:

> const AWS = require('aws-sdk')
> let config = new AWS.Config()
Config {
  credentials:
   SharedIniFileCredentials {
     expired: false,
     expireTime: null,
     accessKeyId: 'AKIAFOOBARSOMEACCESSKEY',
     sessionToken: undefined,
     filename: '/Users/someuser/.aws/credentials',
     profile: 'my-profile',
     disableAssumeRole: true },
  credentialProvider:
   CredentialProviderChain {
     providers: [ [Function], [Function], [Function], [Function] ] },
  region: undefined,
  logger: null,
  apiVersions: {},
  apiVersion: null,
  endpoint: undefined,
  httpOptions: { timeout: 120000 },
  maxRetries: undefined,
  maxRedirects: 10,
  paramValidation: true,
  sslEnabled: true,
  s3ForcePathStyle: false,
  s3BucketEndpoint: false,
  s3DisableBodySigning: true,
  computeChecksums: true,
  convertResponseTypes: true,
  correctClockSkew: false,
  customUserAgent: null,
  dynamoDbCrc32: true,
  systemClockOffset: 0,
  signatureVersion: null,
  signatureCache: true,
  retryDelayOptions: { base: 100 },
  useAccelerateEndpoint: false }

So, to more directly answer your question:

I am still curious why we would ever ask the user to provide a configured AWS.Config instance instead of just a vanilla options object?

I don't think there's any sense in requiring it. It may, however, be much more desirable by devs familiar with the SDK to pass one in.

Does my position make sense / does the explanation seem sound?

rclark commented 7 years ago

It may, however, be much more desirable by devs familiar with the SDK to pass one in.

🤷‍♂️ I've never actually instantiated a AWS.Config object explicitly myself -- I always pass a JS object to a service client constructor.

Gonna talk this over a bit more with @mcwhittemore and see what comes out of that convo. Will get back here in the next day or two. @knksmith57 are you blocked or anything here? We have to be cautious about breaking the API and it probably won't happen quick.

knksmith57 commented 7 years ago

Awesome, thanks for your consideration and transparency through all of this.

I'm blocked from switching to dyno on this feature, yes. We tune a few of the unreachable parameters for our application and the common pattern in the codebase is to reuse a configured AWS.Config instance. Note that, as discussed, that doesn't matter so long as the configuration is proxied to the SDK service client constructors.

Notably, we use bluebird for the promise dependency and, in some environments, set region using an environment variable instead of programmatically in app code.

I'd rather we do this right then rush it for me. But I'm also of the mindset that there isn't a single right solution to this. Just preferences for the interface.

Thanks and please let me know how I can help!

mcwhittemore commented 7 years ago

Could the signature just be Dyno(writeOpts, readOpts) where readOpts are optional and if they are not provided writeOpts is used for read and write and where writeOpts / readOpts are the AWS.Config object.

mcwhittemore commented 7 years ago

Meh. Thinking this over more, I kind of disagree with just accepting the AWS.config. It will lead to a bunch of checking to make sure the configs are like in the ways we need them to be (ie, params.TableName) and will require users to write more {} for little gain.

knksmith57 commented 7 years ago

Agreed. I think making writeOpts/readOpts instances of AWS.Config is a can of worms, since then supplying custom options for Dyno for each would (likely) involve overloading the AWS.Config object.

The idea of having the signature be Dyno(Object writeOpts, [Object readOpts]) is interesting though. Making awsConfig (or whatever it'd be called) a sub-key on each is particularly useful for the read-from-one-region / table, write-to-another scenario because it'd allow configuring totally separate settings for each (a big part of the discussion starting in #136).

mcwhittemore commented 7 years ago

@knksmith57 - In talking with @rclark about this, we're going to keep the read/write double config in Dyno.multi. For your original proposal, we should keep using one argument but support this argument as an AWS.ConfigService instance.

If it is not an AWS config service:

Does this seem right to you?

Can you take a stab at updating this PR? Once it's ready, I can make sure it gets published quickly.

knksmith57 commented 7 years ago

@mcwhittemore to ensure I understand your request correctly, you're saying the new signature should be:

Dyno(Object|AWS.ConfigService options)

Is that right?

Also, I don't follow exactly what you mean by:

we should check that the needed attributes are on the object and set them to defaults if they are not set.

Are any attributes actually needed?

If the idea here is to overload the Dyno factory to accept either a Dyno options hash or an instance of AWS.ConfigService, then that doesn't feel right. The two aren't mutually exclusive. One might want a read-only client that uses a custom AWS configuration (for example).

As a side note, it occurs to me that simply accepting an AWS.ConfigService instance and passing it into the DynamoDB constructor isn't sufficient for configuring global options (like the promise dependency). The actual AWS module would have to be exposed to allow that (ie: how dynamoose handles this)

mcwhittemore commented 7 years ago

One might want a read-only client that uses a custom AWS configuration (for example).

We don't support a read-only SDK, with Dyno.multi you can have one config read and another write, but getting into this is past the scope of what Dyno is trying to achieve.

The two aren't mutually exclusive.

Is this true? What options does Dyno currently support that you can't get by configuring AWS.Config?

Are any attributes actually needed?

One thing Dyno does is change the default timeout, so while no attributes are needed by AWS, Dyno should always provide this timeout. So Dyno() should equal Dyno({httpOptions:{ timeout: 5000}).