moj-analytical-services / Rdbtools

Accessing Athena on the Analytical Platform
Other
4 stars 0 forks source link

Write tables #7

Closed pjrh-moj closed 2 years ago

pjrh-moj commented 2 years ago

Hi @mratford and @ymao2 ,

I'm just trying to tidy up Rdbtools a bit as more people are using it. The last bit of functionality I haven't got working is to write data to Athena - I don't know if this is a permissions issue?

The code currently lets you create a temporary database, and create new tables in it from existing tables on Athena (e.g. CREATE TABLE AS ) but not using the dbWriteTable function from noctua. I'd like to get dbWriteTable - see my attempts in the branch table_write.

I should be able to do:

library(tidyverse)
test_table <- tibble(col1 = c(1,2,3,4), col2 = c("a","b","c","d"))
con <- connect_athena()
dbWriteTable(con, "__temp__.tab_test", test_table)

However I try this, I get: Error: AccessDenied (HTTP 403). Access Denied.

The reason I want this is twofold

Just wondering if either of you could shed some light on this!

pjrh-moj commented 2 years ago

I've has this in the back of my head all day, and I think I've understood where the issue is. It seems like as long as I specify the s3.location manually then I can upload a table - suggesting that the permissions for the s3_staging_dir which is generated for us doesn't give write permissions.

E.g., this works: dbWriteTable(con, "__temp__.dbwrite2", df, s3.location = "s3://bucket_where_I_have_access")

So we have two options:

ymao2 commented 2 years ago

@pjrh-moj , Thank you for adding me into this thread and the update for this issue, thinking about to reply you yesterday but was busy. tbh, based on my knowledge to Athena and R, I am not going to be useful for this issue at this stage, maybe later once I have time to dive in 😳, just based on your finding and comment, my comment would be

If it is the main reason, I suggest to speak to AP, e.g. setup separate folder with proper permission of parent folder for temporary table only, then we can not only to justify our finding, also you may be able to do something for switching the staging dir in the code?

pjrh-moj commented 2 years ago

No problem @ymao2 - I think there's a combination of technical Athena questions, but also how users will use this is practise so keen to get your view on this as well.

I agree that changing permissions is an AP team issue, not something we can do ourselves. So the second option becomes the obvious route to pursue, and I've done this in PR #8. Actually, in the end it's mainly a case of updating the documentation as it was something that was already possible in the code so a fairly straightforward fix. It would be useful if you wanted to have a look at the code in that PR and see if the instructions are clear enough and then we can merge it in?

I'll ask the AP team why we don't have write permission to the staging dir which gets created for us - I suspect it's an issue that they have a good reason for and we can't easily change. Given we've got a working method now and have documented why it is, then I guess we don't have such a strong need for any changes at their end - it might be easiest to see how it goes as is first anyway.