cnemarich closed 2 years ago
@cnemarich Looking at this initially, it seems like it would work best to pass this through the dots from pin_write() to pin_store() to s3_upload_file() to the paws put_object method. That would look like this:
board_sales <- board_s3("company-pins", prefix = "sales/")
board_sales %>% pin_write(mtcars, Tagging = "key1=value1&key2=value2")
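For reference, the Tagging value in the example above is a URL-encoded query string, which is the form S3's PutObject takes for tags. A minimal Python sketch of building and parsing such a string with the standard library follows. The helper names encode_tagging and decode_tagging are made up for this illustration and are not part of pins or paws.

```python
from urllib.parse import parse_qsl, quote, urlencode


def encode_tagging(tags):
    """Turn a dict of tags into the 'k1=v1&k2=v2' query-string form."""
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    return urlencode(tags, quote_via=quote)


def decode_tagging(tagging):
    """Inverse of encode_tagging: parse the query string back into a dict."""
    return dict(parse_qsl(tagging, keep_blank_values=True))


tagging = encode_tagging({"key1": "value1", "key2": "value2"})
print(tagging)  # key1=value1&key2=value2
```

Building the string from a mapping avoids hand-escaping characters such as & or = inside tag values.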
@machow It looks like the Python board_s3() doesn't support tags right now either. What do you think about putting it as an argument to pin_write() like this? Is there a different way that would work better for Python?
Hey, thanks for looking through this. Passing through Tagging makes sense to me.
On the Python side, we'll need to figure out whether we want to subclass BaseBoard, which is not S3-specific, or make load_data a generic function that dispatches on the filesystem (but that should be quick to sort out).
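To make the second option concrete, here is a rough sketch of a generic function that dispatches on the filesystem type, using functools.singledispatch. Everything in it is hypothetical: the two filesystem classes are in-memory stand-ins rather than the real fsspec/s3fs classes, and apply_tags is an invented name, not an existing pins function.

```python
from functools import singledispatch


class LocalFileSystem:
    """Hypothetical stand-in for a local filesystem (no tag support)."""


class S3FileSystem:
    """Hypothetical stand-in for an S3 filesystem; keeps tags in memory."""

    def __init__(self):
        self.tags = {}

    def put_tags(self, path, tags):
        self.tags[path] = dict(tags)


@singledispatch
def apply_tags(fs, path, tags):
    """Default case: filesystems without tag support reject non-empty tags."""
    if tags:
        raise NotImplementedError(f"{type(fs).__name__} does not support tags")


@apply_tags.register
def _(fs: S3FileSystem, path, tags):
    """S3-specific case: delegate to the filesystem's own put_tags."""
    if tags:
        fs.put_tags(path, tags)


s3 = S3FileSystem()
apply_tags(s3, "sales/mtcars", {"key1": "value1"})
```

With this shape the board class stays filesystem-agnostic, and only the registered implementation knows about S3. Subclassing BaseBoard would instead put the S3-specific logic in an overridden method.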
Here's what getting and setting tags looks like in s3fs (the fsspec implementation for S3):
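# From the s3fs tests. `s3` is a pytest fixture that provides the filesystem
# object, and `files` is defined at module level in that test file.
def test_tags(s3):
    tagset = {'tag1': 'value1', 'tag2': 'value2'}
    fname = list(files)[0]
    s3.touch(fname)  # create an empty object to tag
    s3.put_tags(fname, tagset)  # set the tags on the object
    assert s3.get_tags(fname) == tagset  # read them back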
@machow Let's plan for adding this feature after conf 👍
Question: Is there any way to upload objects to an S3 board with tags? If this is not possible, can it be added as a feature?
Use Case: My organization currently uses the {paws} package to upload shared datasets to an S3 bucket. This data is then used by various team members. However, it is also then read and ingested into other systems automatically based on what tags the object has. We would like to move to using the {pins} package for managing these shared datasets, but would need to be able to retain this ability to write tags to objects as they're being uploaded.
Additional Notes: The put_object method from the {paws} package that {pins} relies on allows tags to be passed along with uploaded objects. I took a look at the pin_write function and, while it does take additional parameters, none of these arguments seem to be passed on to the s3_upload_file helper function:
https://github.com/rstudio/pins/blob/baaa304f876ec9cc98cc90d3333db157e34028f7/R/board_s3.R#L210-L220
https://github.com/rstudio/pins/blob/baaa304f876ec9cc98cc90d3333db157e34028f7/R/board_s3.R#L270-L277
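To illustrate the gap described here, the sketch below shows the general pattern of forwarding extra keyword arguments through each layer until they reach the upload call. It is written in Python purely as an illustration and is not the pins implementation. The function names mirror the R helpers mentioned in this issue, but their signatures, the "/data" key suffix, and the RecordingClient class are assumptions made up for the example.

```python
class RecordingClient:
    """Hypothetical stand-in for an S3 client; records put_object arguments."""

    def __init__(self):
        self.calls = []

    def put_object(self, **kwargs):
        self.calls.append(kwargs)


def s3_upload_file(client, bucket, key, body, **extra):
    # Extra keyword arguments (e.g. Tagging) go straight to put_object.
    client.put_object(Bucket=bucket, Key=key, Body=body, **extra)


def pin_store(client, bucket, name, body, **extra):
    # Middle layer: forwards everything it does not use itself.
    s3_upload_file(client, bucket, f"{name}/data", body, **extra)


def pin_write(client, bucket, name, body, **extra):
    # Entry point: the caller's extra arguments are not dropped here.
    pin_store(client, bucket, name, body, **extra)


client = RecordingClient()
pin_write(client, "company-pins", "sales/mtcars", b"example bytes",
          Tagging="key1=value1&key2=value2")
```

If any layer in the chain omits the forwarding step, the tags are silently lost before the upload, which matches the behaviour reported above.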