DyfanJones / noctua

Connect R to Athena using paws SDK (DBI Interface)
https://dyfanjones.github.io/noctua/
Other
45 stars 5 forks source link

[Question]: Requesting guidance and best practices - Athena shinyApp with noctua #212

Open tamuanand opened 9 months ago

tamuanand commented 9 months ago

Hi @DyfanJones

Thanks for this great package.

I am planning on deploying a shinyApp querying Athena using noctua and in this regard, I wanted to seek your guidance and best practices especially around how to use/pass the AWS credentials via the shinyApp.

Given the above, what would be the recommended way to send the queries to Athena using noctua - do you have any toy examples that I can use to play around?

con <- DBI::dbConnect(
  odbc::odbc(),
  Driver             = "[your driver's name]",
  S3OutputLocation   = "[your S3 bucket]",
  AwsRegion          = "[your AWS region bucket]",
  AuthenticationType = "IAM Credentials",
  Schema             = "[your schema's name]",
  UID                = rstudioapi::askForPassword("AWS Access Key"),
  PWD                = rstudioapi::askForPassword("AWS Secret Key")
  )

I am guessing that the odbc::odbc() section could be replaced by noctua::athena()

Any guidance/tips will be very helpful and appreciated.

Thanks in advance.

DyfanJones commented 8 months ago

Sorry for the delay on this. I have was fixing a cran issue for paws.common. Here are the accepted parameters for dbConnect when working with noctua: https://dyfanjones.github.io/noctua/reference/dbConnect-AthenaDriver-method.html

noctua should be able to do something fairly similar to odbc:

con <- DBI::dbConnect(
  noctua::athena(),
  s3_staging_dir  = "[your S3 bucket]",
  region_name = "[your AWS region bucket]",
  schema_name = "[your schema's name]",
  aws_access_key_id  = rstudioapi::askForPassword("AWS Access Key"),
  aws_secret_access_key = rstudioapi::askForPassword("AWS Secret Key")
)

You can also turn off rstudio's connection tab, as this slows down the connection.

con <- DBI::dbConnect(
  noctua::athena(),
  s3_staging_dir  = "[your S3 bucket]",
  region_name = "[your AWS region bucket]",
  schema_name = "[your schema's name]",
  aws_access_key_id  = rstudioapi::askForPassword("AWS Access Key"),
  aws_secret_access_key = rstudioapi::askForPassword("AWS Secret Key"),
  rstudio_conn_tab = FALSE
)

If you have any common characteristic between your user base you can set them as environmental variables or in the .aws/config file for example. I.e. they are all using the same AWS Account and thus using the same AWS Region.

region_name would usually stay static (environmental variable: AWS_REGION) s3_staging_dir would usually stay static (environmental variable: AWS_ATHENA_S3_STAGING_DIR)

con <- DBI::dbConnect(
  noctua::athena(),
  schema_name = "[your schema's name]",
  aws_access_key_id  = rstudioapi::askForPassword("AWS Access Key"),
  aws_secret_access_key = rstudioapi::askForPassword("AWS Secret Key"),
  rstudio_conn_tab = FALSE
)

Another alternative is if you know the Athena work group you are using you could set your s3_staging_dir there and noctua would use that instead.

con <- DBI::dbConnect(
  noctua::athena(),
  schema_name = "[your schema's name]",
  work_group = "[your work group]",
  aws_access_key_id  = rstudioapi::askForPassword("AWS Access Key"),
  aws_secret_access_key = rstudioapi::askForPassword("AWS Secret Key"),
  rstudio_conn_tab = FALSE
)

I hope this helps.

tamuanand commented 8 months ago

Thanks @DyfanJones for all the great tips.

Some follow-up questions:

  1. Do I need to install noctua and its dependencies on the shiny server?
  2. Do I need any Posit Professional Drivers etc - as mentioned here- https://solutions.posit.co/connections/db/best-practices/drivers/
  3. Do you have some kind of vignette or a tutorial

Thanks