brendanhay / amazonka

A comprehensive Amazon Web Services SDK for Haskell.
https://amazonka.brendanhay.nz

Provide nicer API for building DynamoDB AttributeValue/items #263

Closed axman6 closed 3 years ago

axman6 commented 8 years ago

Currently it's a bit of a pain to create an item to send to DynamoDB. It should be easy to build a very Aeson-like interface for building objects which enforces the semantics AWS expects. At the moment it's easy (though unwise) to create an object where both the S and, say, the SS keys are set through the lens interface.

I would also love to see Aeson-like ToAttributeValue and FromAttributeValue classes for creating and parsing values using this interface.

I'm unsure whether this sort of thing would be in scope for amazonka(-dynamodb) or whether it should be a third-party library, so I'm happy to discuss it here. I may be able to implement such a library, either within amazonka-dynamodb or as a new package, if you think it's worthwhile.

brendanhay commented 8 years ago

Indeed, I have lost count of the number of times I've written an interface similar to this.

I have toyed with the idea of writing a separate library to provide a sound de/serialisation interface, but haven't yet got around to tidying it up and releasing it.

axman6 commented 8 years ago

Hmm, looks interesting, though I'm unsure how well that would work for more nested structures. I was thinking something along the lines of:


type AttrMap = HashMap Text AttrValue

data AttrValue
    = B     !ByteString
    | N     !Scientific
    | S     !Text
    | L     !(Vector AttrValue)
    | M     !AttrMap
    | NS    !(Set Scientific)
    | BS    !(Set ByteString)
    | SS    !(Set Text)
    | BOOL  !Bool

class ToAttributeValue a where
    toAttrValue  :: a -> AttrValue
    toAttrObject :: a -> AttrMap
    primKey :: Proxy a -> (Text, Maybe Text)

(I haven't thought too deeply about the API, but I wanted something simple and easy for others to implement without needing to know much about lenses.) I'd propose something like a Network.AWS.DynamoDB.Simple module for this.
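For illustration, a hypothetical instance of such a class might look like the following self-contained sketch. Note the simplifications: Data.Map and String stand in for the HashMap Text / Scientific / Set types above, and User is an invented example type, not anything from amazonka.

```haskell
-- Sketch only: simplified stand-ins for the proposed types, so the
-- example runs with just the containers package.
import qualified Data.Map.Strict as Map

type AttrMap = Map.Map String AttrValue

data AttrValue
    = S String        -- string attribute
    | N Rational      -- number attribute (Scientific in the proposal)
    | BOOL Bool       -- boolean attribute
    | M AttrMap       -- nested map attribute
    deriving (Show, Eq)

class ToAttributeValue a where
    toAttrValue  :: a -> AttrValue
    toAttrObject :: a -> AttrMap

-- A hypothetical record to serialise:
data User = User { userName :: String, userAge :: Int }

instance ToAttributeValue User where
    -- A record serialises as a nested map...
    toAttrValue = M . toAttrObject
    -- ...or as a top-level item, one attribute per field:
    toAttrObject u = Map.fromList
        [ ("name", S (userName u))
        , ("age",  N (fromIntegral (userAge u)))
        ]
```

The appeal of this shape is that an implementor only writes plain functions over an ordinary ADT, with no lens knowledge required.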

By the way, I'm incredibly impressed with your work on Amazonka; it's amazing how comprehensive it is. I'm really looking forward to putting it into production if we can.

pkamenarsky commented 8 years ago

What about just defining a sane FromJSON/ToJSON instance, similar to this:

valueToAttributeValue :: JSON.Value -> AttributeValue
valueToAttributeValue (JSON.String v)  = attributeValue & avS .~ Just v
valueToAttributeValue (JSON.Number v)  = attributeValue & avN .~ Just (T.pack $ show v)
valueToAttributeValue (JSON.Bool v)    = attributeValue & avBOOL .~ Just v
valueToAttributeValue (JSON.Array vs)  = attributeValue & avL .~ fmap valueToAttributeValue (V.toList vs)
valueToAttributeValue (JSON.Object v)  = attributeValue & avM .~ fmap valueToAttributeValue v
valueToAttributeValue JSON.Null        = attributeValue & avNULL .~ Just True

attributeValueToValue :: AttributeValue -> JSON.Value
attributeValueToValue av
  | Just v <- av ^. avS = JSON.String v
  | Just v <- av ^. avN = JSON.Number $ read $ T.unpack v
  | Just v <- av ^. avBOOL = JSON.Bool v
  | v <- av ^. avM, not (H.null v) = JSON.Object $ fmap attributeValueToValue v
  | vs <- av ^. avL, not (null vs) = JSON.Array $ fmap attributeValueToValue (V.fromList vs)
  | Just _ <- av ^. avNULL = JSON.Null
  | otherwise = JSON.Null
axman6 commented 8 years ago

On first examination that looks pretty nice to me (would be particularly awesome if lensified to convert between the two formats).

brendanhay commented 8 years ago

FYI I've been spending some time on this. There is currently a WIP branch with some of the work, which I will document and offer up for RFC soon.

axman6 commented 8 years ago

Wow, that's looking amazing, I'll need to find another project to play with this on now!

It looks like a lot of the type functions you're using could be taken from the type-list package (https://hackage.haskell.org/package/type-list-0.5.0.0/docs/Data-Type-List.html): for example, your type family (∈) x xs corresponds to its Find x xs (though your syntax is nicer; perhaps that could be added to type-list), while others, such as NotIntersecting, are useful type functions that arguably belong in that package.

Could you add a full example of a not-too-trivial type instantiating the relevant type classes, showing how to actually work with serialisation/deserialisation?

Yet again, really awesome stuff, I wish I wrote Haskell this well.

axman6 commented 8 years ago

Might also be nice to have an infix version of attr too... though Haskell's infix namespace is already pretty cluttered (trying to work with Aeson, Lens, and Cassava at the same time is a nightmare with their conflicting infix operators).

brendanhay commented 8 years ago

It looks like a lot of the type functions you're using could be taken from the type-list package

type-list >= ~0.2 has a lot of dependencies that, somewhat ironically, amazonka-* doesn't already pull in.

So for naive set/list membership or list append, I've preferred rolling my own.

There is a dependency on type-level-sets to check for duplicate attributes in the schema type, but again custom type families have been used when the set's Nub (Sort ...) invariants don't need to be respected, to try to avoid Sort's quadratic compilation (type-checking) time.

I'm also using GHC.TypeLits' TypeError a lot to propagate nice compilation errors to the user rather than a simple failing constraint, and this introduces a lot of boilerplate around the list/set-like families if I can't specialise them somewhat (Constraint kinds vs. Bool ~ 'True, etc.).
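As a rough illustration of the TypeError approach (the family name, list contents, and error message here are hypothetical, not the families actually used in the branch):

```haskell
{-# LANGUAGE ConstraintKinds, DataKinds, PolyKinds, TypeFamilies,
             TypeOperators, UndecidableInstances #-}
-- Hypothetical sketch: a type-level membership check that reports a
-- readable compile-time error via GHC.TypeLits.TypeError, instead of
-- an unhelpful unsolved-constraint message.
import Data.Kind (Constraint)
import GHC.TypeLits

type family Member (x :: k) (xs :: [k]) :: Constraint where
    Member x '[] =
        TypeError ('Text "Attribute not present in the schema: "
                   ':<>: 'ShowType x)
    Member x (x ': xs) = ()
    Member x (y ': xs) = Member x xs

-- Compiles, because "name" is in the list:
ok :: Member "name" '["name", "version"] => Int
ok = 1

-- Uncommenting this fails to compile with a TypeError mentioning "revision":
-- bad :: Member "revision" '["name", "version"] => Int
-- bad = 1
```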

Might also be nice to have an infix version of attr too

Your point about the infix namespace being cluttered is why I haven't (yet) written any operators... I sort of vainly hope that attr is short enough to avoid the need. If there are some sane unused operators that won't contest the same namespaces as other serialisation libraries, I'll certainly look into adding them.

Could you add a full example of a not-too-trivial type instantiating the relevant type classes, showing how to actually work with serialisation/deserialisation?

A simple data-type that we want to de/serialize is declared:

data Credentials = Credentials
    { _name     :: Text
    , _version  :: Integer
    , _revision :: ByteString
    , _contents :: ByteString
    }

Firstly, we can use the aeson-style DynamoItem instance to create a HashMap Text AttributeValue that could be used with PutItem or GetItem:

instance DynamoItem Credentials where
   toItem Credentials{..} =
       item [ attr "name"     _name
            , attr "version"  _version
            , attr "revision" _revision
            , attr "contents" _contents
            ]

   fromItem m =
        Credentials <$> parse "name"
                    <*> parse "version"
                    <*> parse "revision"
                    <*> parse "contents"

encode (cred :: Credentials)                 :: HashMap Text AttributeValue
decode (hmap :: HashMap Text AttributeValue) :: Either ItemError Credentials

The library internally uses an opaque Value wrapping AttributeValue to maintain the invariant that only a single AttributeValue field is ever set. encode and decode then convert between DynamoItem values and the types amazonka-dynamodb can use directly.
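The shape of that invariant can be sketched as follows. RawAttributeValue here is a stand-in for the generated record of Maybes; Value and the smart constructors are illustrative names, not the library's actual API:

```haskell
-- The generated AttributeValue is morally a record of Maybes, so nothing
-- stops a caller from setting both the S and SS fields. Hiding it behind
-- a newtype whose constructor is not exported, with smart constructors
-- that each set exactly one field, restores the invariant.
data RawAttributeValue = RawAttributeValue
    { rawS  :: Maybe String      -- string field
    , rawN  :: Maybe Rational    -- number field
    , rawSS :: Maybe [String]    -- string-set field
    } deriving (Show, Eq)

emptyRaw :: RawAttributeValue
emptyRaw = RawAttributeValue Nothing Nothing Nothing

-- Opaque in the real library: the Value constructor would not be exported.
newtype Value = Value RawAttributeValue deriving (Show, Eq)

stringValue :: String -> Value
stringValue s = Value emptyRaw { rawS = Just s }

numberValue :: Rational -> Value
numberValue n = Value emptyRaw { rawN = Just n }

-- At the wire boundary, unwrap back to the raw representation:
encodeValue :: Value -> RawAttributeValue
encodeValue (Value raw) = raw
```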

So this is a completely fine way to use the library, provided you don't mind repeating the string key names (both this style and the following one are amenable to automation via TH/Generics) and the Credentials type above corresponds 1:1 with a DynamoDB table, or you assemble the components of an item using encode a <> encode b (via HashMap's Monoid instance).
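The encode a <> encode b assembly relies on the map's left-biased union. A minimal stand-in sketch, with Data.Map in place of HashMap and plain Strings in place of AttributeValue (the attribute names are invented):

```haskell
import qualified Data.Map.Strict as Map

-- Stand-in item type; the real one is HashMap Text AttributeValue.
type Item = Map.Map String String

keyAttrs :: Item
keyAttrs = Map.fromList [("name", "alice"), ("version", "2")]

payloadAttrs :: Item
payloadAttrs = Map.fromList [("contents", "secret")]

-- (<>) on maps is a left-biased union, so independently encoded
-- fragments combine into one full item:
fullItem :: Item
fullItem = keyAttrs <> payloadAttrs
```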

You will still have to repeat the string keys in Scan or Query operations, and likewise with key projections and indexes.

In an attempt to improve on this, I've introduced the idea of a table schema that is detached from the actual data types being de/serialized. The current version of the schema DSL is shown below (syntax still in flux):


type Example =
    Table "credentials"
        ( PartitionKey "name"     Text
       :# SortKey      "version"  Integer
       :# Attribute    "revision" Text
       :# Attribute    "contents" Text
        )

        ( Throughput (ReadCapacity 1) (WriteCapacity 1)

       :# Stream 'SVTKeysOnly

       :# GlobalSecondaryIndex "revision"
             ( IndexPartitionKey "name"
            :# IndexSortKey      "revision"
            :# Throughput (ReadCapacity 1) (WriteCapacity 1)
            :# Project 'All
             )

       :# LocalSecondaryIndex "version"
             ( IndexSortKey "contents"
            :# Project 'KeysOnly
             )
        )

example :: Proxy Example
example = Proxy

Table is defined as Table (name :: Symbol) keys options, where keys is a non-empty set of attributes (hence :#). The options parameter is additional table configuration, such as indexes, whose keys etc. are type-checked against the original table keys.

Some examples of what this gives you currently are:

instance DynamoItem Credentials where
    toItem Credentials{..} =
        serialize example _name _version _revision _contents

    fromItem = fmap unpack . deserialize example
       where
         -- This pattern match on ':*:' only exists because of the
         -- current lack of a more familiar 'Applicative' interface:
         unpack ( _name
              :*: _version
              :*: _revision
              :*: _contents
                ) = Credentials{..}

-- Obtain a 'CreateTable' request corresponding to the 'Example' schema:
getTable example :: CreateTable

-- 'getTable' is implemented in terms of:
getLocalIndexes  example :: [LocalSecondaryIndex]
getGlobalIndexes example :: [GlobalSecondaryIndex]
getAttributes    example :: NonEmpty AttributeDefinition
getKeys          example :: NonEmpty KeySchemaElement
getThroughput    example :: Maybe ProvisionedThroughput

As you can see from the DynamoItem instance, it hasn't saved much in terms of syntax, but the string keys are no longer mentioned: they are handled by the schema and are subject to type-checking.

What I haven't yet got around to is providing the same serialize and deserialize functionality for a schema's indexes and their projected keys.

This might provide a particularly nice way to formulate Query requests.

So, caveat emptor: as it currently stands, I'd like to ship DynamoItem and DynamoValue at some point, but the schema design is still under exploration, and constructive criticism is appreciated.

ekalosak commented 5 years ago

Are there any updates on this issue? Has anyone found a solution for mapping an Aeson Object Value to an amazonka-dynamodb AttributeValue?

freefrancisco commented 4 years ago

What are people using amazonka-dynamodb today doing to address this issue? Is the recommended approach, for now, to roll your own mapper between AttributeValue and Aeson's Value? I'm working with the library right now, trying to figure out the best approach.

axman6 commented 4 years ago

We recently open-sourced https://github.com/tmortiboy/amazonka-extras/blob/master/aws-dynamodb/src/Network/AWS/DynamoDB/AttributeValue.hs which probably isn't completely ready for release, but might be useful nonetheless. We're using it a lot where I work and it's great. It does, however, rely on our text1 package (which does mean you're saved from the dreaded empty-Text problem with DynamoDB). Ping @tmortiboy.

endgame commented 3 years ago

In my experience with DynamoDB, I almost never want to map directly between an entire item and some Haskell type. Instead, I find myself wanting to do things like:

We tried to build a "table mapper"-style API at my work too, but in my experience the abstraction almost always gets in the way. A type-level description of the table, similar to @brendanhay's comment in https://github.com/brendanhay/amazonka/issues/263#issuecomment-242677714, could be very cool, and hints at something like what Opaleye is for Postgres. But that should live in an experimental branch or PR, or even a separate package. This issue has been open since 2016, and I think amazonka has shown itself to be enough of a maintenance handful just being the base layer of autogenerated interfaces, so I'm going to close it.