ropenscilabs / deposits

R Client for access to multiple data repository services
https://docs.ropensci.org/deposits/

Improve update method #78

Closed mpadge closed 1 year ago

mpadge commented 1 year ago

The update method should work in a single direction only, local -> remote, and should automatically update all metadata and any local files which have changed. The deposit_update_frictionless() method can then be deprecated.
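As a rough illustration of what "any local files which have changed" could mean in practice, the sketch below compares local MD5 checksums against checksums reported by the host. It is not the package's implementation: `files_to_upload()` and the `remote_md5` argument are hypothetical names, and it assumes the host exposes one MD5 checksum per file name.

```r
# Hypothetical helper, not part of the deposits API.
# `remote_md5` is assumed to be a named character vector:
# names are file names on the host, values are their MD5 checksums.
files_to_upload <- function (dir, remote_md5) {
    local_files <- list.files (dir, full.names = TRUE)
    # Skip sub-directories, for which md5sum() would return NA.
    local_files <- local_files [!dir.exists (local_files)]
    local_md5 <- unname (tools::md5sum (local_files))
    names (local_md5) <- basename (local_files)

    # Files unknown to the host always need uploading.
    known <- names (local_md5) %in% names (remote_md5)
    changed <- !known
    # Known files need uploading only if their checksum differs.
    changed [known] <-
        local_md5 [known] != remote_md5 [names (local_md5) [known]]

    names (local_md5) [changed]
}
```

Files whose checksums match are left alone, which corresponds to the "is identical on host and will not be uploaded" messages in the output below.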

mpadge commented 1 year ago

Those commits get to this point:

library (deposits)
packageVersion ("deposits")
#> [1] '0.1.4.85'
dir.create (file.path (tempdir (), "data"))
path <- file.path (tempdir (), "data", "beaver1.csv")
write.csv (datasets::beaver1, path, row.names = FALSE)
metadata <- list (
    creator = list (list (name = "P. S. Reynolds")),
    created = "1994-01-01T00:00:00",
    title = "Time-series analyses of beaver body temperatures",
    description = "Original source of 'beaver' dataset."
)
cli <- depositsClient$new (service = "zenodo", sandbox = TRUE, metadata = metadata)
cli$deposit_new ()
#> ID of new deposit : 1200401
cli$deposit_upload_file (path)
#> Loading required namespace: frictionless
#> frictionless metadata file has been generated as '/tmp/RtmpzjB7aM/data/beaver1.csv'
#> Please make sure you have the right to access data from this Data Package for your intended use.
#> Follow applicable norms or requirements to credit the dataset and its authors.

cli$metadata$title
#> [1] "Time-series analyses of beaver body temperatures"
cli$hostdata$title
#> [1] "Time-series analyses of beaver body temperatures"
cli$metadata$description
#> [1] "Original source of 'beaver' dataset."
cli$hostdata$metadata$description
#> [1] "Original source of 'beaver' dataset."

f <- file.path (dirname (path), "datapackage.json")
x <- readLines (f)
i <- grep ("Original\\ssource", x)
x [i] <- gsub ("Original\\ssource", "A description", x [i])
i <- grep ("Time\\-series\\sanalyses", x)
x [i] <- gsub (
    "Time\\-series\\sanalyses\\sof\\sbeaver\\sbody\\stemperatures",
    "New Title",
    x [i]
)
writeLines (x, f)
cli$deposit_update (path = dirname (path))
#> Local file at [/tmp/RtmpzjB7aM/data/beaver1.csv] is identical on host and will not be uploaded.
#> Local file at [/tmp/RtmpzjB7aM/data/datapackage.json] has changed and will now be uploaded.

cli$metadata$title
#> [1] "New Title"
cli$hostdata$title
#> [1] "New Title"
cli$metadata$description
#> [1] "A description of 'beaver' dataset."
cli$hostdata$metadata$description
#> [1] "A description of 'beaver' dataset."

Created on 2023-05-09 with reprex v2.0.2

mpadge commented 1 year ago

Those commits now give this behaviour when metadata are changed via the client only, and not via the local "datapackage.json" file:

library (deposits)
packageVersion ("deposits")
#> [1] '0.1.4.88'
dir.create (file.path (tempdir (), "data"))
path <- file.path (tempdir (), "data", "beaver1.csv")
write.csv (datasets::beaver1, path, row.names = FALSE)
metadata <- list (
    creator = list (list (name = "P. S. Reynolds")),
    created = "1994-01-01T00:00:00",
    title = "Time-series analyses of beaver body temperatures",
    description = "Original source of 'beaver' dataset."
)
cli <- depositsClient$new (service = "zenodo", sandbox = TRUE, metadata = metadata)
cli$deposit_new ()
#> ID of new deposit : 1200421
cli$deposit_upload_file (path)
#> Loading required namespace: frictionless
#> frictionless metadata file has been generated as '/tmp/RtmpEcdBGo/data/beaver1.csv'
#> Please make sure you have the right to access data from this Data Package for your intended use.
#> Follow applicable norms or requirements to credit the dataset and its authors.

cli$metadata$description
#> [1] "Original source of 'beaver' dataset."
cli$hostdata$metadata$description
#> [1] "Original source of 'beaver' dataset."

cli$metadata$description <- "A description"
cli$deposit_update (path = path)
#> Local file at [/tmp/RtmpEcdBGo/data/beaver1.csv] is identical on host and will not be uploaded.
#> Warning: Metadata in client differs from values in '/tmp/RtmpEcdBGo/data/datapackage.json'
#> Please update that file and call
#> > cli$deposit_update(path = /tmp/RtmpEcdBGo/data/datapackage.json)

cli$metadata$description
#> [1] "A description"
cli$hostdata$metadata$description
#> [1] "A description"

Created on 2023-05-09 with reprex v2.0.2

So it all works: the metadata on the host are updated, but a warning is generated about needing to update "datapackage.json" as well.
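One way to act on that warning is to bring "datapackage.json" into line with the client before calling the update method again. The sketch below is only an illustration in base R, along the lines of the readLines/gsub approach in the first example above. `update_datapackage_text()` is a hypothetical helper, not a deposits function, and it replaces every literal occurrence of the old text in the file.

```r
# Hypothetical helper, not part of the deposits API.
# Replaces a literal string in the "datapackage.json" file within `dir`.
update_datapackage_text <- function (dir, old, new) {
    f <- file.path (dir, "datapackage.json")
    x <- readLines (f)
    # Fail loudly if the text to replace is absent.
    if (!any (grepl (old, x, fixed = TRUE))) {
        stop ("'", old, "' not found in ", f)
    }
    # fixed = TRUE treats `old` as plain text, not as a regular expression.
    writeLines (gsub (old, new, x, fixed = TRUE), f)
    invisible (f)
}

# Assumed usage with a live client `cli` and `path` as defined above:
# update_datapackage_text (
#     dirname (path),
#     "Original source of 'beaver' dataset.",
#     "A description"
# )
# cli$deposit_update (path = dirname (path))
```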

mpadge commented 1 year ago

Re-opening to remove the deposit_update_frictionless() method.

mpadge commented 1 year ago

Re-opening to get the update method to work before a remote deposit has been initiated.