openpharma / roxytypes

typed parameter definitions for `roxygen2`
https://openpharma.github.io/roxytypes/
Other
4 stars 1 forks source link

support for multiple types as options #9

Closed danielinteractive closed 1 year ago

danielinteractive commented 1 year ago

It would be great if we could write:

@typed argument: class1, class2
   specifies bla.

and it would be rendered like

@param argument (`class1` or `class2`)\cr specifies bla.

e.g.

dgkf commented 1 year ago

Yeah, I thought about this at one point. I think I'd prefer to have some literal syntax, like if you did:

@typed argument:: literally just this or that
  specifies bal.

becomes

@param argument (literally just this or that)\cr specifies bla.

Otherwise it seems like a design problem that will always have edge cases that need to be handled, and probably in ways that are dependent on the use case or style.

danielinteractive commented 1 year ago

Thanks @dgkf - I thought about this a bit more. I think the literal syntax has the disadvantages of 1) not allowing for the backticks or other consistent formatting of the types, and 2) needs manual standard for mentioning the multiple classes.

Would it help for the design and implementation when we think about a new roclet, say @multityped? I would say that separating the types by commas is pretty failsafe, because R won't accept classes with commas in their names, so there should be no confusion about the delimiters.

So we could have then:

@multityped argument: class1, class2
   specifies bla.

and there could be separate options for multityped rendering, namely multi_type and multi_sep which could again be specified via the DESCRIPTION file:

Config/roxytypes: list(format = "(`{type}`): {description}", multi_type = "`{type}`", multi_sep = "or")
dgkf commented 1 year ago

tldr; in the process of thinking through all the things that could go wrong, I'm pretty set on the literal syntax direction for this


I have a few concerns.

  1. I would prefer not to go down the path of having distinct tags per format Even if it's expressive, it's not discoverable and would alienate new developers with undue complexity.
  2. Constraints quickly get out of control What if you want it to accept classA or classB, but only if classB has a certain non-null field? Or if it also accepts classC, but only if it can be converted into a classA or classB by an as.*() function? Or what if they're relational, where two parameters need to be the same class? Trying to provide syntax for all these cases would be effectively inventing a type signature syntax, which gets complicated fast.
  3. Parsing commas is non-trivial
    Unfortunately, R allows just about anything as a class name, including "class" or this, that. I don't think there's any way to add syntax in a way that doesn't conflict with valid S3 class names (which are just arbitrary strings). What this indicates, though, is that a literal syntax should both allow literal types, but also literally providing its own literal value.

    eg, if "quotes" were used for literals, it should be possible to do @typed arg: '"classA"' to express the actual class name, "classA".

I think anything beyond the simplest case is best left to the package author's discretion. The complexity here gets out of hand very quickly. Personally, I think the literal syntax (maybe using "quotes"?) would be best and would solve even the most complex cases easily.

Using a quoted literal string, it would just look like

@typed arg: "`classA` or `classB`"
  does a thing.

In addition, I think this could be further generalized so that package authors can get as wild and crazy with syntax as they'd like by allowing a format function instead of just a string. Something like:

@typed arg: "classA, classB"
  does a thing.
Config/roxytypes: list(format = function(arg, type, description) {
  # if a type is wrapped in quotes, handle it ourselves and split it into a list
  if (grepl('".*"', type)) type <- strsplit(gsub('^"|"$', "", type), ", ")[[1]]

  # then format it so that its written as types `a`, `b` or `c`
  type_str <- paste0("`", type, "`")
  type_str <- glue::glue_collapse(type_str, sep = ", ", last = " or ")

  paste0("(", type_str, "): ", description) 
})

What I like about both of these solutions is that that they're expressive with minimal new complexity to the existing syntax. To start, I think the literal syntax is the way to go.

danielinteractive commented 1 year ago

Thanks @dgkf for thinking further about it! For me the last solution there looks good, could we maybe add a little simple helper function to the roxytypes package so users don't have to copy this code around? Or could this be in the own package and still work?

dgkf commented 1 year ago

I went to work on this today... and turns out I've already beat myself to it!

Turns out this should already work: https://github.com/openpharma/roxytypes/blob/f95be2e4dda0540218c8a2d00358bf2710e5c099/R/tag_typed.R#L55-L61

And a default where you can see that anything starting with a [, ` or " is considered a literal: https://github.com/openpharma/roxytypes/blob/f95be2e4dda0540218c8a2d00358bf2710e5c099/R/tag_typed.R#L90-L96

But, neither feature is well documented, so I'll work on improving that.