aws / aws-sdk-ruby-record

Official repository for the aws-record gem, an abstraction for Amazon DynamoDB.
Apache License 2.0
318 stars 41 forks source link

If you forget to create a schema, weird error messages ensue #54

Open uberbrady opened 7 years ago

uberbrady commented 7 years ago

I am a stupid idiot, and I forgot to add hash_key: true to the actual, uh, hash_key. This is a dumb thing that dumb people do and I lost hours from it due to my own stupidity.

However, the library never told me "Hey, idiot, I don't know how to handle your aws-record object because you never told me what your actual keys were." That would've been nice :(

I suspect, also, that you can probably inadvertently specify two hash_key's, and the library won't care.

Ideally, when the Class is first created might be the best time to check for this, but I think you might not ever get a signal that 'oh, yeah, the user is finished defining their class'.

So instead, maybe we could put something in the save() method that, when it's walking through the keys to see if any of them have changed (to see if this is a 'new record' or an 'update'), we can use the following algorithm:

1) Check to see if a temporary variable (that was initialized to false) is true, if so skip this check. Something like schema_checked? 2) Otherwise, we look for exactly-only-one hash key, and optionally, perhaps only-one range key. More than one range key or not exactly one hash key should throw an exception, something like Aws::Record::SchemaNotSpecified or Aws::Record::SchemaInvalid, ideally with some helpful further information like "No Hash Key defined" or "More than one Hash Key defined" or "More than one Range Key Defined". 3) Set schema_checked? to true, so this code doesn't get run again.

Perhaps that optimization isn't necessary, but lots of people who use DynamoDB use it for performance reasons, and so if they're really beating up on the DB, then repeatedly walking through the defined schema seems wasteful.

awood45 commented 7 years ago

I agree that there's room to make this clear - you're right that the reason we don't have this yet is the lack of a clear way to say "this schema is complete" (so we wait for you to use it), but some issues like multiple :hash_key definitions should be able to raise right away (and don't).

I'm open to a PR for that schema check, and I think the ability to define multiple hash keys is a bug.

awood45 commented 7 years ago

I should note we already have the model_valid? method which you can call on a model class, and will raise if you have any issues. So, it's a discoverability question rather than a feature request perhaps.

The duplicate definition of hash keys without a clear resolution still concerns me at bug-level, though.

awood45 commented 7 years ago

model_valid? in the documentation. Notably, I should add a docstring for that.

uberbrady commented 7 years ago

Well, if you like my idea about a cached check at save()-time, I can call model_valid? then (only the first time, all subsequent times it should just breeze through). And if someone has called two hash_key's, I can call that out too, as well, during that same check (perhaps also via the call to model_valid?)

Let me know if you'd be interested in a PR; I'd be happy to put one together for you.

uberbrady commented 7 years ago

or another option; we could do it on the first .new()?

awood45 commented 7 years ago

I'm open to a more user friendly error in those cases, yes. I'm inclined to think that duplicate invocation of the :hash_key parameter could be called out as an error immediately, but I'm not positive about the best way to push back on that mistake. In either case those would have to be separate PRs.

awood45 commented 7 years ago

Rolling this as something to look at when we rev the major version of the library for the modularized SDK.

mullermp commented 2 years ago

Adding a feature request to update model_valid? documentation and raise when a second hash key is declared.