Open cholu6768 opened 1 year ago
Before @isabelizimm gets back from vacation next week, I just want to assure you that absolutely you should be able to roundtrip a pin through R and Python on S3, Posit Connect, etc.
If you navigate to the S3 bucket and check out the folder where these pins are stored, do you see a metadata file? It should be in the version folder, called data.txt, and look somewhat like this. If there is one there, can you share what it looks like?
Hi Julia,
I have two pins saved:
This one is a CSV
List of 11
$ file : chr "test_data_starwars.csv"
$ file_size : 'fs_bytes' int 7.33K
$ pin_hash : chr "735f8120e142c45f"
$ type : chr "csv"
$ title : chr "test_data_starwars: a pinned 87 x 11 data frame"
$ description: NULL
$ created : POSIXct[1:1], format: "2023-09-29 13:18:00"
$ api_version: num 1
$ user : list()
$ name : chr "test_data_starwars"
$ local :List of 3
..$ dir : 'fs_path' chr "~/.cache/pins/s3-mydatabase/test_data_starwars/20230929T131826Z-735f8"
..$ url : NULL
..$ version: chr "20230929T131826Z-735f8
This one is an RDS, which can't be read in Python since it's a binary R data file. I also wanted to ask if it's possible to save it as parquet with R and then read it with Python.
$ file : chr "encrypted_data.rds"
$ file_size : 'fs_bytes' int 1.54G
$ pin_hash : chr "bd32445d57729d9a"
$ type : chr "rds"
$ title : chr "encrypted_data: a pinned 5667283 x 4 data frame"
$ description: NULL
$ created : POSIXct[1:1], format: "2023-09-01 16:00:00"
$ api_version: num 1
$ user : list()
$ name : chr "encrypted_data"
$ local :List of 3
..$ dir : 'fs_path' chr "~/.cache/pins/s3-mydatabase/encrypted_data/20230901T160026Z-bd324"
..$ url : NULL
..$ version: chr "20230901T160026Z-bd324"
The error sounds like it is having a hard time reading the metadata, so I wanted to double-check that the metadata is there and in the correct format. For that CSV pin, do you see the data.txt file when you navigate to the S3 web page for this bucket? For example, here is what I see for a pin I have saved in S3:
I have navigated to a version folder, and then I see the data.txt YAML file plus the pin contents. Do you see something similar when you go to your S3 bucket? What do the contents of the YAML look like? It should be something like this:
file: really-pretty-numbers.json
file_size: 23
pin_hash: c3943ca5a9aab2df
type: json
title: 'really-pretty-numbers: a pinned integer vector'
description: ~
created: 20221103T022316Z
api_version: 1.0
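If you can get a local copy of data.txt, a rough way to sanity-check it is sketched below. This is only an illustration, not how pins itself reads metadata: it uses a tiny standard-library parser for flat `key: value` lines instead of a real YAML parser, and the helper names (`parse_flat_metadata`, `check_metadata`) and the list of expected fields are assumptions taken from the example above, not an official schema.

```python
from datetime import datetime

# Fields that appear in the data.txt example above (an assumption,
# not an official schema).
EXPECTED_FIELDS = [
    "file", "file_size", "pin_hash", "type",
    "title", "description", "created", "api_version",
]


def parse_flat_metadata(text):
    """Parse top-level `key: value` lines; this is not a full YAML parser."""
    meta = {}
    for line in text.splitlines():
        # Skip blank lines, nested entries, comments, and list items.
        if not line.strip() or line.startswith((" ", "#", "-")):
            continue
        # Split on the first colon only, so colons inside the title survive.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip("'\"")
    return meta


def check_metadata(text):
    """Return a list of problems found; an empty list means nothing obvious is wrong."""
    meta = parse_flat_metadata(text)
    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if name not in meta]
    if "created" in meta:
        try:
            datetime.strptime(meta["created"], "%Y%m%dT%H%M%SZ")
        except ValueError:
            problems.append(f"unexpected created format: {meta['created']}")
    return problems


sample = """file: really-pretty-numbers.json
file_size: 23
pin_hash: c3943ca5a9aab2df
type: json
title: 'really-pretty-numbers: a pinned integer vector'
description: ~
created: 20221103T022316Z
api_version: 1.0
"""

print(check_metadata(sample) or "metadata looks complete")
```

To check your own file, read it with `open(path).read()` and pass the text to `check_metadata`.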
Ohh, I thought by doing pin_meta() you would also get the information you need.
I checked the S3 bucket through the console and the file is there but I don’t have permission to see the txt file. I’m gonna ask the admin to give me access and I’ll get back to you :)
Ah, I definitely think that would be the problem if you don't have permissions to access the file. A user who wants to read a pin needs to have permissions to access the directory where the pin contents plus metadata are stored.
Hey there! I am back from OOO and hopping in to confirm @juliasilge-- you won't be able to see any pin metadata without access. Let us know if getting permissions sorted out fixes your problem 😄
Thank you both ;)
In the meantime, while I wait to get read access from the admin, I managed to copy the txt file to my local path. This is what the txt looks like:
file: test_data_starwars.csv
file_size: 7507
pin_hash: 735f8120e142c45f
type: csv
title: 'test_data_starwars: a pinned 87 x 11 data frame'
description: ~
created: 20230929T131554Z
api_version: 1.0
I have another question: how come I can read the pin with R but not with Python? Does R read the metadata in a different way than Python?
That data.txt file looks right. 👍
how come I can read the pin with R but not with Python?
I have been wondering the same thing!!! 🤯 It must be some difference in how the R package and Python package do authentication? In terms of reading the file itself:
R: the get_object() method on a paws.storage::s3() object
Python: the open() method
Hmmm.... I wonder if your PATH_TO_MY_BOARD is not correct? For example, if I have a bucket named pins-testing and then a pin inside the board called starwars-data-test, my board setup/call would look like ⬇️
import pins

board = pins.board_s3("pins-testing")
board.pin_read("starwars-data-test")
One way the error you see can manifest is if you've added too much information in your board creation, e.g.
board = pins.board_s3("pins-testing/starwars-data-test") # this will give a PinsError
board.pin_read("starwars-data-test")
One way to check whether you've added the path you expect is to run board.pin_list(): if you have added a path that is too deep into your bucket, you'll see a list of hashes (which are the versions), rather than the name of the pin.
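If it helps, that check can be roughly automated. The sketch below assumes version folder names follow the timestamp-plus-hash pattern seen earlier in this thread (e.g. 20230929T131826Z-735f8). The helper name `looks_like_versions` is made up for illustration and is not part of pins.

```python
import re

# Pattern inferred from the version names in this thread,
# e.g. "20230929T131826Z-735f8" (an assumption, not a pins guarantee).
VERSION_PATTERN = re.compile(r"^\d{8}T\d{6}Z-[0-9a-f]+$")


def looks_like_versions(names):
    """Return True if every listed name looks like a version folder, not a pin name."""
    names = list(names)
    return bool(names) and all(VERSION_PATTERN.match(name) for name in names)


# Stand-ins for what board.pin_list() might return in each situation.
too_deep = ["20230929T131826Z-735f8"]
expected = ["test_data_starwars", "encrypted_data"]

for listing in (too_deep, expected):
    if looks_like_versions(listing):
        print(listing, "-> board path is probably one level too deep")
    else:
        print(listing, "-> looks like pin names")
```

With a real board you would pass `board.pin_list()` in place of the hard-coded lists.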
First of all, thank you for looking at my question :)
I have the following issue. I'm able to access the board and also list which pins are inside:
But when I try to read a pin
I get the following error:
I also want to note that the board and the pin were initially created with R. Could that be an issue?