paws-r / paws

Paws, a package for Amazon Web Services in R
https://www.paws-r-sdk.com

How to specify the data types for items passed to the paws::dynamodb()$batch_write_item function? #576

Closed tomsing1 closed 1 year ago

tomsing1 commented 1 year ago

I am new to interacting with dynamodb through paws, so apologies if this should have been obvious. To learn how to work with dynamodb, I am working my way through AWS's getting started script for the Python SDK.

The python example uses the batch_writer boto3 method. I believe that (eventually) hits the same endpoint as paws' dynamodb()$batch_write_item function.

The boto3 documentation shows that items can be specified as key:value pairs, without the data type explicitly stated. For example:

with table.batch_writer() as batch:
    batch.put_item(
        Item={
            'account_type': 'standard_user',
            'username': 'johndoe',
            'first_name': 'John',
            'last_name': 'Doe',
            'age': 25,
            'address': {
                'road': '1 Jefferson Street',
                'city': 'Los Angeles',
                'state': 'CA',
                'zipcode': 90001
            }
        }
    )

The paws R example includes the data type for each attribute, e.g.

svc$batch_write_item(
  RequestItems = list(
    Music = list(
      list(
        PutRequest = list(
          Item = list(
            AlbumTitle = list(
              S = "Somewhat Famous"
            ),
            Artist = list(
              S = "No One You Know"
            ),
            SongTitle = list(
              S = "Call Me Today"
            )
          )
        )
      )
    )
  )
)

The boto3 method seems easier to use, as it accepts a (potentially nested) item without requiring the user to determine the data type for each attribute. Perhaps the boto3 implementation guesses the data type? 🤔

Is there a way to mimic boto3's behavior in paws? Or is the user required to deduce the data type for each element of a (potentially highly nested) item themselves, e.g. by writing a helper function that traverses the item?

Many thanks for any pointers!

DyfanJones commented 1 year ago

Hi @tomsing1, paws usually mimics the lower level of boto3 (the client methods). For a like-for-like comparison, paws' svc$batch_write_item is equivalent to boto3.client('dynamodb').batch_write_item. As you pointed out, the resource method is a wrapper around the client method boto3.client('dynamodb').batch_write_item (https://github.com/boto/boto3/blob/master/boto3/dynamodb/table.py).
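
To illustrate the like-for-like comparison, the boto3 item from the question could be written by hand in the client-style format that paws expects. This is only a sketch: the table name `users` is an assumption, and the call itself is commented out because it needs AWS credentials and an existing table.

```r
# Each attribute is wrapped in a list naming its DynamoDB type:
# S = string, N = number (sent as a string), M = map.
item <- list(
  account_type = list(S = "standard_user"),
  username     = list(S = "johndoe"),
  first_name   = list(S = "John"),
  last_name    = list(S = "Doe"),
  age          = list(N = "25"),
  address      = list(
    M = list(
      road    = list(S = "1 Jefferson Street"),
      city    = list(S = "Los Angeles"),
      state   = list(S = "CA"),
      zipcode = list(N = "90001")
    )
  )
)

# "users" is a hypothetical table name.
request_items <- list(
  users = list(
    list(PutRequest = list(Item = item))
  )
)

# Requires AWS credentials and an existing table:
# svc <- paws::dynamodb()
# svc$batch_write_item(RequestItems = request_items)
```

Note that numbers are passed as strings under `N`, and the nested address becomes a map under `M`.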

tomsing1 commented 1 year ago

Thanks a lot for the pointers, @DyfanJones , and for pointing out that paws is aimed at the lower level functionality. Much appreciated. It seems that in boto3, (higher level) transformers are used to translate between python and dynamodb types.

Sounds like that's what I need for my use case as well. Out of curiosity - and because I suspect other users might have solved this problem already - are you aware of examples in R that perform the same functionality?

DyfanJones commented 1 year ago

It looks like a previous user has written these 2 blog posts about uploading data to dynamodb

Sadly, they don't take into account list, map or array types. However, they are a good guide to getting started with dynamodb.
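
For nested items, a recursive helper of the kind described in the question might look roughly like the sketch below. `to_dynamodb_attr` is a hypothetical name and not part of paws; the sketch assumes named lists should become maps (M), unnamed lists and longer vectors should become lists (L), and it does not handle DynamoDB sets or binary values.

```r
# Hypothetical helper (not part of paws): recursively wrap plain R values
# in DynamoDB attribute-value descriptors (S, N, BOOL, NULL, L, M).
to_dynamodb_attr <- function(x) {
  if (is.null(x)) {
    return(list("NULL" = TRUE))
  }
  if (is.list(x)) {
    if (!is.null(names(x)) && all(nzchar(names(x)))) {
      # Fully named list: treat as a map (M)
      return(list(M = lapply(x, to_dynamodb_attr)))
    }
    # Unnamed or partially named list: treat as a list (L)
    return(list(L = lapply(unname(x), to_dynamodb_attr)))
  }
  if (length(x) != 1L) {
    # Atomic vector with zero or several elements: treat as a list (L)
    return(list(L = lapply(as.list(unname(x)), to_dynamodb_attr)))
  }
  if (is.na(x)) {
    return(list("NULL" = TRUE))
  }
  if (is.logical(x)) {
    return(list(BOOL = x))
  }
  if (is.numeric(x)) {
    # DynamoDB numbers are transmitted as strings
    return(list(N = format(x, digits = 15, scientific = FALSE, trim = TRUE)))
  }
  if (is.character(x)) {
    return(list(S = x))
  }
  stop("Unsupported type: ", class(x)[1])
}

item <- list(
  username = "johndoe",
  age      = 25,
  address  = list(city = "Los Angeles", zipcode = 90001)
)

# The top-level Item is the contents of the map, so drop the outer M wrapper.
dynamo_item <- to_dynamodb_attr(item)$M
```

The resulting `dynamo_item` could then be passed as `Item` inside a `PutRequest` for `svc$batch_write_item`. Guessing types this way is a design choice with trade-offs: for example, a length-one character vector is always stored as a string, never as a one-element list.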

DyfanJones commented 1 year ago

Side note: it would be cool to have a nodbi interface to dynamodb, maybe as an extension of https://github.com/ropensci/nodbi or as a package in its own right, similar to https://github.com/DyfanJones/noctua.

tomsing1 commented 1 year ago

Thanks a lot for taking the time to point out these resources, I really appreciate it! I will explore nodbi, and I agree: a higher level package like noctua would be awesome. Unfortunately I am very much a newbie when it comes to dynamodb, so writing it is beyond my current abilities. If anybody else needs help, e.g. doorknobs to be polished, vignettes to test, etc., I am more than happy to pitch in!