DataONEorg / rdataone

R package for reading and writing data at DataONE data repositories
http://doi.org/10.5063/F1M61H5X

Added `as` argument to getObject #214

isteves closed 6 years ago

isteves commented 6 years ago

This allows users to specify the `as` value that is passed to httr::content.

Now to get a csv directly into RStudio, you can do this:

cn <- dataone::CNode("PROD")
mn <- dataone::getMNode(cn, "urn:node:ARCTIC")
x <- dataone::getObject(mn, "urn:uuid:bf33c6ba-16e6-40de-8f87-1b78ecefbae8", as = "parsed")

@gothub

gothub commented 6 years ago

@isteves when we discussed this PR, we saw that specifying as="parsed" for a NetCDF file caused an error, because the mime package that httr::content calls has a different MIME type for NetCDF than the one some DataONE objects can have. This could happen for other data files too, which is a problem because with the current method (before these changes), raw is always specified and will always work. It seems like we need some error handling in here, so that if that error happens we fall back to raw. Also, how does the arg as interact with the arg path that was checked in during a previous PR (and is now a conflict in the arg list, which I can fix when accepting the PR)?
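A minimal sketch of the fallback described above, assuming the response is decoded with httr::content. The helper name `content_with_fallback` and its `parser` argument are hypothetical and are not part of the dataone package; the `parser` argument exists only so the fallback logic can be exercised without a live request.

```r
# Hypothetical helper (not part of dataone): try the requested `as` value,
# and fall back to raw bytes if parsing fails.
content_with_fallback <- function(response, as = "raw", parser = httr::content) {
  # "raw" is the existing behaviour, so there is nothing to fall back from.
  if (identical(as, "raw")) {
    return(parser(response, as = "raw"))
  }
  tryCatch(
    parser(response, as = as),
    error = function(e) {
      # Tell the caller why they got bytes instead of a parsed object.
      warning(
        "Could not return content as '", as, "' (", conditionMessage(e),
        "); returning raw bytes instead.",
        call. = FALSE
      )
      parser(response, as = "raw")
    }
  )
}
```

With this shape, an unrecognised MIME type would produce a warning and the same raw vector the current method returns, rather than an error.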

mbjones commented 6 years ago

Great suggestion. This PR has similar issues in changing the MN API as PR #208. See the discussion there for details. Can we find a way to accomplish your goals but also stay true to the definition of getObject in the DataONE API?

isteves commented 6 years ago

@mbjones, I chatted with @gothub about it a bit in person and read up on the issue/Slack discussion. Since the as argument is just passed to httr::content(), what do you think of implementing it with ...? Thus, getObject would still mirror the API but you could still add in the additional argument.

In case it's unclear, this is what I mean by the use of ...: (In our case, the function would be getObject and the extra argument would be as)

sum_plus_one <- function(variables, ...) {
    sum(variables, na.rm = FALSE, ...) + 1
}

sum_plus_one(c(1:10, NA), na.rm = TRUE)

isteves commented 6 years ago

Update: I can't actually do what I suggested because of a "multiple actual arguments" error that comes up. Apparently, na.rm is a special case that doesn't throw that error...

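remove_a <- function(variables, ...) {
  gsub(pattern = "a", replacement = "", variables, ...)
}
remove_a("arctic")  # works: returns "rctic"
remove_a("arctic", replacement = "t")  # Error: formal argument "replacement" matched by multiple actual arguments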
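One common way around the "multiple actual arguments" error is to collect `...` into a list, let caller-supplied values override the defaults, and then make the call with do.call. The sketch below applies that to the toy example; `remove_a_safe` is a hypothetical name, and whether this pattern suits getObject is an open question.

```r
# Hypothetical variant of remove_a: defaults can be overridden through `...`
# without triggering "matched by multiple actual arguments".
remove_a_safe <- function(variables, ...) {
  defaults <- list(pattern = "a", replacement = "")
  # Named arguments supplied by the caller replace the matching defaults.
  args <- utils::modifyList(defaults, list(...))
  do.call(gsub, c(args, list(x = variables)))
}

remove_a_safe("arctic")                     # "rctic"
remove_a_safe("arctic", replacement = "t")  # "trctic"
```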

Still thinking about other ways of implementing this/writeDisk... I understand why the default output should match the API exactly, but I don't fully understand why extra "convenience" arguments can't be added. An alternative to as could be to write a parse function based on httr:::parse_auto, which could then be used with a DataObject. Not sure what the appropriate place for that would be (D1Client.R? datapack? datamgmt?)
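To make the parse-function alternative concrete, here is a rough sketch of a standalone parser that dispatches on a MIME type string and otherwise returns the bytes untouched. The name `parse_bytes`, its arguments, and the two listed types are assumptions for illustration only; this does not reproduce httr:::parse_auto and is not an existing dataone or datapack function.

```r
# Hypothetical standalone parser: getObject keeps returning raw bytes,
# and parsing becomes a separate, optional step.
parse_bytes <- function(bytes, type) {
  parsers <- list(
    "text/csv"   = function(b) utils::read.csv(text = rawToChar(b), stringsAsFactors = FALSE),
    "text/plain" = function(b) rawToChar(b)
  )
  # Without a single usable type string, hand the bytes back unchanged.
  if (!is.character(type) || length(type) != 1 || is.na(type)) {
    return(bytes)
  }
  parser <- parsers[[type]]
  if (is.null(parser)) {
    return(bytes)  # unknown type (e.g. NetCDF): leave as raw
  }
  parser(bytes)
}

csv_bytes <- charToRaw("a,b\n1,2\n")
parse_bytes(csv_bytes, "text/csv")            # a data.frame with columns a and b
parse_bytes(csv_bytes, "application/netcdf")  # the raw vector, unchanged
```

Keeping the parser separate would leave the getObject signature aligned with the DataONE API while still offering the convenience.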