ropensci / neotoma

Programmatic R interface to the Neotoma Paleoecological Database.
https://docs.ropensci.org/neotoma
Other
30 stars 16 forks source link

S3 methods not using arguments consistent with their generic #117

Closed gavinsimpson closed 9 years ago

gavinsimpson commented 10 years ago

When you do R CMD check on the neotoma tarball lots of warnings ar generated along the lines of:

* checking S3 generic/method consistency ... WARNING
get_dataset:
  function(x, ...)
get_dataset.default:
  function(siteid, datasettype, piid, altmin, altmax, loc, gpid,
           taxonids, taxonname, ageold, ageyoung, ageof, subdate)

This is because the S3 methods have been incorrectly set up. An S3 method must have the same arguments as the generic. A method can have additional arguments but it must have the arguments of the generic. In the case above, the get_dataset.default method has first argument siteid where as the generic has argument x. The method is also lacking the ... argument.

The obvious fix is to rewrite the definition of the generic as:

`get_dataset` <- function(siteid, ...)

and the method as

`get_dataset.default` <- function(siteid, datasettype, piid, altmin, altmax, loc, gpid,
           taxonids, taxonname, ageold, ageyoung, ageof, subdate, ...)

This needs doing for all (or most) of the new S3 methods added recently.

SimonGoring commented 10 years ago

Okay, I think that makes sense, and it looks like it's not going to mess things up. I'm in the middle of using Bacon and doing some other stuff, but I'll fix that tonight.

gavinsimpson commented 10 years ago

Yeah, I think if you change the generics to match the first argument of the method then you shouldn't have to change much if anything in the method function body other than to add ... to the arguments.

The only thing to think about is whether you need this method at all. Are you intending a get_dataset.foo() e.g. calling get_dataset(bar) where bar is of class "foo", to do something? In other words, are you envisaging having get_dataset method for objects other than a siteid? If so, siteid would be the wrong choice of argument name and you should probably fix the existing methods to use x or obj as the first argument. It really depends what the intentions were in having these methods/generics.

SimonGoring commented 10 years ago

Yes, the idea is that if you use get_download to get the full data for a site you should be able to use either get_dataset or get_site to pull the information straight out without having to go back up through the API. In the opposite case I wanted to prevent having people use something like sapply to pull the datasetIDs out of the dataset objects. I think this just makes things easier on the end user.

Sorry, so I don't want just siteid to pass in. I want either a siteid to work, or a dataset or download. If either of those go in .then they should invoke the 'special' methods. I guess it doesn't matter what they go in as.

gavinsimpson commented 10 years ago

Well no, it doesn't matter what argument they go in as, but it would make for odd documentation if you have to write:

siteid: either a character string containing the required site ID, or an object of class "download" or "dataset".

It would be easier to just call this first argument x and say that x is an R object containing a character string containing the required site ID, or an object of class "download" or "dataset". Dispatch happens on the first argument and it is a more naturally named argument if it is x.

SimonGoring commented 10 years ago

Okay, that's perfect. That's why I had the ... as the first argument, because I didn't want to have the conflict with the siteid variable. The only issue is that the term siteid is used in the API call, but I can just rename x to siteid in the default method if it's used. Easy peasy.

SimonGoring commented 9 years ago

Okay, I seem to be having some problems here. The issue seems to be that if I use the model:

get_dataset <- function(x, ...)

The S3 warning goes away but I get a \usage warning when I check, that says that the @params for the other parameters (other than x, like datasettype) aren't used. THat's fine, I understand why that is, but if I remove them then the help for get_dataset isn't really useful because then it only shows the x parameter.

Is there a soultion to this? Is it okay to ignore the \usage warning for uploading to CRAN?

gavinsimpson commented 9 years ago

No, you can't ignore it. The problem is that you aren't documenting the S3 methods for the generic in the same Rd file. At the moment you are generating an .Rd file for get_dataset(), in which you are documenting some or all the arguments (@params) used in other methods. But you don't document those methods in that .Rd file.

The way I do this is:

#' function to do foo
#'
#' This function does something foo-like
#'
#' @params x an R object of class \code{"bar"}
#' @params ... arguments passed to other methods
#' @export
#' @rdname foo
foo <- function(x, ...) {
    UseMethod(x)
}

#' @params y some other argument to this method
#' @rdname foo
#' @export
foo.foobar <- function(x, y, ...) {
   ## funky code that does something
}

If the methods use the same arguments as each other, then it will be simpler to document them in the roxygen block for the generic; you don't want to repeat yourself for each method and you probably don't want them all spread out either if the same argument applies to several methods.

SimonGoring commented 9 years ago

Okay, I get this, that's what I've been trying to do, but then when I do:

?get_dataset

I only get the help for the generic, right? I'd rather see the generic help file have all the help that the default does. Does this make sense?

SimonGoring commented 9 years ago

Okay, I think I've figured this out. I just need more complete documentation in my files. Working on it. Thanks @gavinsimpson, sorry for the bother.

SimonGoring commented 9 years ago

Finished documenting and rewriting help sections for roxygen2 blocks in the file. Pushed as commit https://github.com/ropensci/neotoma/commit/11d9a5c659ad9299e4d89a0f6e7f7d81e81bae27, everything should be cleaned up now.

gavinsimpson commented 9 years ago

@SimonGoring Has all this fixing of the S3 methods just broken the paper?

marion <- get_site(sitename = 'Marion Lake%')

fails now because the first argument name is x in neotoma 1.1-0

SimonGoring commented 9 years ago

Ugh, yes. Is it possible to add sitename along with x and have a call without a declared x go to the default method? I can try to figure this out myself, but suggestions would be welcome. The other alternative is to go over to OQ and ask them to make the changes for us.

SimonGoring commented 9 years ago

Okay, I've fixed this in commit https://github.com/ropensci/neotoma/commit/3eeeb1815885dd6fb01726aa35022bb957e9ca21, by sticking sitename back in and then adding some checks around UseMethod. Not sure if this is the best way to do it, but it works and provides a somewhat useful warning.

gavinsimpson commented 9 years ago

Uggh, that's nasty. Why not have sitename mean either:

  1. a character string containing the site name/identifier, or
  2. an object from which the site name(s) can be determined.

Then get rid of x altogether. This will mean you'll need to do a copy-replace on x for sitename and undo some of the changes in 3eeeb1815885dd6fb01726aa35022bb957e9ca21 but it would make for a nicer interface given that we are now in print with the sitename argument.

SimonGoring commented 9 years ago

Yes, I think that's what I've got to do. I tried to fix it, but it's just too gross to mess around with,

gavinsimpson commented 9 years ago

@SimonGoring Do you want me to fix it back along the lines I suggested?

SimonGoring commented 9 years ago

I have the fix done and I'm going to push it (it checks fine locally), but I wanted to tackle the sysdata issue at the same time, and then my windows machine forced me to do a restart. I'm going to finish it off in about a 1/2 hour.

On Wed, Feb 25, 2015 at 11:10 AM, Gavin Simpson notifications@github.com wrote:

@SimonGoring https://github.com/SimonGoring Do you want me to fix it back along the lines I suggested?

— Reply to this email directly or view it on GitHub https://github.com/ropensci/neotoma/issues/117#issuecomment-76004807.

gavinsimpson commented 9 years ago

:cry: One feature/fix at a time :-) Atomise your commits/pushes otherwise you'll keep on having trouble with merge conflicts or other contributors will hold off doing work as they won't know what features of the codebase you may or may not have worked on ;-)

SimonGoring commented 9 years ago

I know. I'm still working on building the habits necessary for good collaborative coding :)

On Wed, Feb 25, 2015 at 11:24 AM, Gavin Simpson notifications@github.com wrote:

[image: :cry:] One feature/fix at a time :-) Atomise your commits/pushes otherwise you'll keep on having trouble with merge conflicts or other contributors will hold off doing work as they won't know what features of the codebase you may or may not have worked on ;-)

— Reply to this email directly or view it on GitHub https://github.com/ropensci/neotoma/issues/117#issuecomment-76009528.

karthik commented 9 years ago

One feature/fix at a time :-) Atomise your commits/pushes otherwise you'll keep on having trouble with merge conflicts or other contributors will hold off doing work as they won't know what features of the codebase you may or may not have worked on ;-)

:+1: :100: Otherwise it's really hard to keep up!

SimonGoring commented 9 years ago

Is there an emoji for hanging your head in shame?

On Wed, Feb 25, 2015 at 12:00 PM, Karthik Ram notifications@github.com wrote:

One feature/fix at a time :-) Atomise your commits/pushes otherwise you'll keep on having trouble with merge conflicts or other contributors will hold off doing work as they won't know what features of the codebase you may or may not have worked on ;-)

[image: :+1:] [image: :100:] Otherwise it's really hard to keep up!

— Reply to this email directly or view it on GitHub https://github.com/ropensci/neotoma/issues/117#issuecomment-76018427.

SimonGoring commented 9 years ago

This is (finally) fixed in commit 32ba77bc2a1d03019e4709697075aa89fe354c20

gavinsimpson commented 9 years ago

@SimonGoring I don't know why R CMD check isn't complaining, but get_site is still wrong. The key bit of information is (taken from Writing R Extensions):

A method must have all the arguments of the generic, including … if the generic does.

You have #L45

get_site <- function(sitename, altmin, altmax, loc, gpid, ...)

but only #57

get_site.default <- function(sitename, ...)

This is in contravention of the above rule. The only surprising thing is that R CMD check isn't screaming Warnings at you.

You either need to have all the methods have at a minimum the following arguments:

get_site.foo <- function(sitename, altmin, altmax, loc, gpid, ...)

or just have the generic be

get_site <- function(sitename, ...)

and put the extra arguments only in the definitions of the methods that need or can work with those argument.

iqis commented 5 years ago

I believe it's a good practice to leave the parameter of a generic function as ...