rstudio / pins-r

Pin, discover, and share resources
https://pins.rstudio.com

Failure to record all pin versions in data.txt #480

Closed: amashadihossein closed 3 years ago

amashadihossein commented 3 years ago

Note: both datasets remain available on S3 and are structured properly under the _version/ path, but data.txt only shows the latest. In other words, the issue is that data.txt does not keep an accurate record of all versions.

To repro:


# Step 1: start with default cache and pin into S3 cars[1:5, ]
#--------------------------------------------------------------------------------------------
library(pins)

board_register_s3(name = "d1",
                  bucket = "pinsbucket",
                  key = aws.signature::locate_credentials(profile = "xxxxx")$key,
                  secret = aws.signature::locate_credentials(profile = "xxxxx")$secret,
                  region = "us-east-2", versions = TRUE)

pins::pin(x = cars[1:5, ], name = "cars15", description = "first 5 rows of cars", board = "d1")

pins::pin_versions(name = "cars15", board = "d1")

# version
# 1 6968a7e

# Step 2: restart R session, define a new cache path and pin into S3 cars[2+(1:5), ]
#--------------------------------------------------------------------------------------------
library(pins)

board_register_s3(name = "d1",
                  bucket = "pinsbucket",
                  key = aws.signature::locate_credentials(profile = "xxxxx")$key,
                  secret = aws.signature::locate_credentials(profile = "xxxxx")$secret,
                  cache = "~/Desktop/pins",
                  region = "us-east-2", versions = TRUE)

pins::pin(x = cars[2 + (1:5), ], name = "cars15", description = "first 5 rows of cars", board = "d1")

pins::pin_versions(name = "cars15", board = "d1", cache = "~/Desktop/pins")

# version
# 1 0d034ce

# Step 3: restart R session, re-register as step 1 and try pin_versions and pin_get
#--------------------------------------------------------------------------------------------
library(pins)

board_register_s3(name = "d1",
                  bucket = "pinsbucket",
                  key = aws.signature::locate_credentials(profile = "xxxxx")$key,
                  secret = aws.signature::locate_credentials(profile = "xxxxx")$secret,
                  region = "us-east-2", versions = TRUE)

pins::pin_versions(name = "cars15", board = "d1")
# version
# 1 0d034ce

pins::pin_get(name = "cars15", board = "d1", version = "0d034ce")
# speed dist
# 3     7    4
# 4     7   22
# 5     8   16
# 6     9   10
# 7    10   18

pins::pin_get(name = "cars15", board = "d1", version = "6968a7e")
# Error in board_versions_expand(manifest$versions, version) : 
#   Version '6968a7e' is not valid, please select from pin_versions().

pins::pin_get(name = "cars15", board = "d1", version = "6968a7e", cache = FALSE)
# Error in board_versions_expand(manifest$versions, version) : 
#   Version '6968a7e' is not valid, please select from pin_versions().

# sessionInfo()
# R version 4.1.0 (2021-05-18)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 20.04.2 LTS
# 
# Matrix products: default
# BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
# LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
# 
# locale:
# [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
# [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
# [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
# 
# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
# [1] pins_0.4.5
# 
# loaded via a namespace (and not attached):
# [1] httr_1.4.2          compiler_4.1.0      backports_1.2.1     R6_2.5.0            magrittr_2.0.1     
# [6] tools_4.1.0         base64enc_0.1-3     yaml_2.2.1          curl_4.3.2          rappdirs_0.3.3     
# [11] aws.signature_0.6.0 filelock_1.0.2      jsonlite_1.7.2      digest_0.6.27       openssl_1.4.4      
# [16] askpass_1.1

hadley commented 3 years ago

The new board_s3() no longer maintains a data.txt because of exactly this problem.
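
For readers unfamiliar with the newer API mentioned here, the following is a minimal sketch of how versions behave in pins 1.0 and later. It assumes a local `board_temp()` as a stand-in for `board_s3()` so that it runs without credentials; the pin name and data are taken from the repro above, and the behaviour on S3 itself is not verified here. As far as I understand it, the newer boards list versions from what is actually stored for the pin rather than from a separately maintained data.txt.

```r
library(pins)

# Assumption: a temporary versioned board stands in for board_s3(),
# so the sketch runs locally without AWS credentials.
board <- board_temp(versioned = TRUE)

# Write two different versions of the same pin, mirroring steps 1 and 2.
pin_write(board, cars[1:5, ], name = "cars15", type = "rds")
Sys.sleep(1)  # keep the two version timestamps distinct
pin_write(board, cars[2 + (1:5), ], name = "cars15", type = "rds")

# Both versions are listed.
versions <- pin_versions(board, "cars15")
print(versions)

# Every listed version can be read back by its identifier.
pinned <- lapply(versions$version, function(v) {
  pin_read(board, "cars15", version = v)
})
```

With a real bucket, the only line that should need to change is the board constructor.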

amashadihossein commented 3 years ago

OK, great! Thanks @hadley! I am looking forward to updating to the new pins, and based on the anticipated changes I am planning to set aside some time to upgrade the package I have built once the new version is released. Meanwhile, a quick solution seems to be calling pin_versions prior to any pinning. I have not seen any case where this temporary solution hasn't worked, but I wanted to confirm: based on your understanding of the problem, does this sound right to you? If so, would it be included in legacy_datatxt, or is the idea of legacy_datatxt to preserve the current functionality as is? Thanks!
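
The workaround described above could be sketched as a small wrapper around the legacy API. `pin_with_refresh` is a hypothetical helper name, not part of pins, and why calling pin_versions first helps is not established in this thread, so treat this as an illustration of the idea rather than a confirmed fix.

```r
# Hypothetical helper (not part of pins): query versions, then pin.
pin_with_refresh <- function(x, name, board, ...) {
  # Ask the board for existing versions before pinning. According to the
  # report above, doing this first avoided the lost-version problem.
  # Errors (for example, the pin does not exist yet) are ignored.
  try(pins::pin_versions(name = name, board = board), silent = TRUE)

  # Pin as usual with the legacy API; extra arguments are passed through.
  pins::pin(x, name = name, board = board, ...)
}
```

In step 2 of the repro this would replace the direct call, for example `pin_with_refresh(cars[2 + (1:5), ], name = "cars15", board = "d1")`.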

hadley commented 3 years ago

I have no idea, sorry — I'd have to carefully re-read and analyse the existing code to understand why that works, and it's unlikely I'll have the time to do so.

amashadihossein commented 3 years ago

I understand. Thanks!
