Closed krivit closed 3 years ago
I love love love this idea. Slight preference for sctrl
as every keyboard uses the ctrl
abbreviation on the control key.
I don't get the details of the implementation, but I sure love the proposed functionality.
Apologies if I'm grouchy and please correct me if I'm missing something, but if we have the same function, say sctrl()
or statnet_opts()
(my candidate) defined and exported by multiple packages we will end-up with namespace conflicts and always only one will take precedence -- the one from the package that was attached last?
Hello, Michal. You are correct about how R treats conflicts (no need to apologize!), as far as I know. Unless there is some way to alter this behavior within RStudio, the idea is that ALL of the sctrl / statnet_opts functions will take ALL arguments of ALL statnet packages. This would presumably involve some type of customized roxygen-like preprocessing of all packages, unless there's a different / better way.
One possibility: Only put the sctrl / statnet_opts function in the statnet package. Each individual package would keep its own control function, which we can preserve for backwards compatibility, but then the statnet package's single control function would be updated (automatically, one hopes) any time any of the component packages is updated.
None of this should be necessary. The only thing sctrl()
should do is take its arguments (both the ones it names explicitly and the ...
, stuff them in a list, and return the list. It should be something very close to
sctrl <- function(MCMC.arg1=13, MCMC.arg2="blah", ...){
control <- list(...)
formal.args<-formals(sys.function())
formal.args[["..."]]<-NULL
for(arg in names(formal.args))
control[arg]<-list(get(arg))
control
}
with different packages having different argument lists but the same bodies. That means that no matter which package's sctrl()
actually gets called, all arguments end up in the list.
But, hopefully, RStudio's, ESS's, and other editors' tab completion mechanisms would "see" all of the sctrl()
functions defined and would pool all of the packages' arguments together as completion options.
Edit: The logic of sorting out which argument goes where is a part of #13.
I took a look at RStudio source code for code completion and as far as I can tell the mechanism is a collection of
options()
, knitr::opts_*$get()
, and selected "dplyr" verbs (e.g. group_by()
will hint column names from a data frame that it receives, even if via %>%
.I believe we can achieve the completion for statnet control options if we leverage one of those built-in mechanisms.
Some ideas of varying degree of speculativity:
sctrl()
defined in "statnet.common" that has an argument for every option used throughout the whole Statnet suite.
statnet_opts
. Every statnet package will Import "statnet.common" and when loaded (.onLoad
) will add its options to that environment via sth like assign("optionName", optionValue, envir = statnet.common::statnet_opts)
. The options can be queried with statnet.common::statnet_opts$optionName
.
c()
ombines those lists (this could perhaps be cached)$
But, hopefully, RStudio's, ESS's, and other editors' tab completion mechanisms would "see" all of the
sctrl()
functions defined and would pool all of the packages' arguments together as completion options.
I don't think this will work. If I attach two packages defining functions with identical names RStudio hints the arguments only for the function from the package attached last. I tried with igraph::get.vertex.attribute()
and network::get.vertex.attribute()
.
WRT Pavel's "hopefully" comment above: I played around with this earlier today and unfortunately it appears that RStudio for one only "sees" the arguments of the most recently loaded version of a given function.
Michal's suggestions 2 and 3 are intriguing, and I also think the simplistic #1 is OK. WRT different defaults, one aspect of the conversation we were having last week is that each package's arguments would be unique (e.g., ending with "ego" if they're for ergm.ego) to avoid such collisions.
Indeed, I've also done some tests, and while base R and ESS pool arguments, RStudio is too smart for its own good.
too smart
well, R in general will default to the most recent package when there are function name clashes.
The relevant RStudio code is at https://github.com/rstudio/rstudio/blob/master/src/cpp/session/modules/SessionRCompletions.R
Interestingly all those functions are exposed in RStudio. Call ls(pos = "tools:rstudio", all=TRUE)
. I wonder if we can use them to enhance the completion .onAttach
ing statnet packages. I don't know of any package that would do that.
Edit: ... and of course it would not help ESS, Rgui nor Vim users.
In principle, it's possible to replace any function in any package after loading by unlocking the binding. (See, for example, https://github.com/krivit/msgtime , which will never make it to CRAN because it overwrites functions in the base
package.) So, here is another approach:
statnet.common
, define and export three functions: sctrl()
(and I think it should be kept as short as possible) and e.g., add_sctrl(args, pkg)
.sctl()
's body is to just take all the arguments (including the ...
s) and stuff them into a list, as in https://github.com/statnet/statnet.common/issues/14#issuecomment-754273862 .add_sctrl()
unlocks the sctl()
's binding, updates its arguments and default list with its inputs (via formals() <-
), then locks the binding. It also remembers which pkg
added those arguments.setHook()
is used to remove arguments from packages when unloaded (see https://github.com/statnet/statnet.common/blob/ce32a0f3f2f0e113902653cbd0bb97e420432771/R/locator.R#L28-L29 , for example).ergm()
) then uses its .onLoad()
to call e.g., add_sctrl(alist(MCMC....), "ergm")
to update sctrl()
.I don't know whether that will suffice for RStudio. It should be straightforward to check.
Edit: Also, this approach should not run afoul of CRAN, since it's only statnet.common
altering itself.
setHook()
is used to remove arguments from packages when unloaded (see https://github.com/statnet/statnet.common/blob/ce32a0f3f2f0e113902653cbd0bb97e420432771/R/locator.R#L28-L29 , for example).
Oooh this is nice with setHook()
!
Did a version of "statnet.common" with these make to CRAN?
Edit: Also, this approach should not run afoul of CRAN, since it's only
statnet.common
altering itself.
Good point.
In general this is not far from my speculation no. 3.
Did a version of "statnet.common" with these make to CRAN?
Yes, and before that, the code was in ergm
, which also did.
That looks definitely interesting and convincing, at least on paper. I wonder if RStudio will catch-up with the updates to sctl()
arguments and whether CRAN checks will scream at unlocking namespaces.
OK, the following seems to update completion in RStudio, ESS, and R CLI for Linux:
sctrl(A1=1, A2=2, ...)
A3=3
to it:
library(pkgA)
# Grab the package environment. IMPORTANT: Modifying the package
# namespace will *not* update the function.
pkgAenv <- as.environment("package:pkgA")
sctrl() # A1 and A2 arguments (and ...).
tmp <- sctrl formals(tmp) <- c(formals(sctrl),alist(A3=3))
unlockBinding("sctrl", pkgAenv) assign("sctrl",tmp, pkgAenv) lockBinding("sctrl", pkgAenv)
sctrl()
<sup>Created on 2021-01-05 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0)</sup>
Right. So far so good. One thing I'm not sure yet is whether the way sctrl()
gets updated above will be visible by other packages that import it. If I recall correctly importing makes essentially a copy of the exported function in the namespace of the importing package, and this happens onLoad
. If it is so, dependencies of "statnet.common" cannot import the function (because the copy may get outdated) and always call it as statnet.common::sctrl()
.
Yet another possibility that we did not scrutinize (apologies if you did elsewhere) is to just use options()
and a flat option names prefixed with package name, e.g. ergm_init.method
and so on (I used _
to separate package name from option name). Then we get all the advantages of flat option lists and code completion. What needs addressing is storing and getting the defaults.
Standard way of doing that is when an option is needed one uses getOption("optionName", default_value)
but it is cumbersome because default_value
needs to be hardcoded in every place the option is needed. That's not DRY.
One solution is that a package sets options via onLoad
or onAttach
. I'm not a big fan of it (which I believe we discussed in the context of "egor").
Second one is to have a list of defaults within every package and a function get_statnet_option()
something like:
# In 'ergm' namespace (so ergm::option_defaults)
ergm.option_defaults <- list(
main.method = "MCMLE",
more.dots.in.option.name = "v"
)
# In 'ergm.ego' namespace (so ergm.ego::option_defaults)
ergm.ego.option_defaults <- list(
main.method = "Stepping"
)
get_statnet_option <- function(optname) {
o <- getOption(optname)
if(is.null(o)) {
pkg <- regmatches(optname, regexec("^[^_]+(?=\\_)", optname, perl=TRUE))
defname <- sub("^[^_]+_(.*)$", "\\1", optname, perl=TRUE)
# For the sake of example:
get(paste0(pkg, ".option_defaults"))[[defname]]
# ... but should be:
# getFromNamespace(pkg, "option_defaults")[[defname]]
} else {
return(o)
}
}
# Need a good naming pattern for options as we have dots in package names...
# Above `_` separates package name from option name.
# Usage:
get_statnet_option("ergm_main.method")
#> [1] "MCMLE"
get_statnet_option("ergm.ego_main.method")
#> [1] "Stepping"
# User sets an option
op <- options(ergm_main.method = "fancy method")
get_statnet_option("ergm_main.method")
#> [1] "fancy method"
options(op)
Edit (pressed Update to early): The downside is that we do not have exclusive control of option names, but with package name prefixes clashes are rather unlikely.
It looks like it can be worked around by a line in .onLoad()
that overwrites the local sctrl
with whatever is in statnet.common::sctrl
at that moment. I've set up 3 packages, pkgA_1.0.tar.gz, pkgB_1.0.tar.gz, and pkgC_1.0.tar.gz, with B and C depending on A but not on each other.
Now,
library(pkgB)
sctrl()
#> $A1
#> [1] 1
#>
#> $A2
#> [1] 2
#>
#> $B1
#> [1] 3
#>
#> $B2
#> [1] 4
library(pkgC)
#>
#> Attaching package: 'pkgC'
#> The following object is masked from 'package:pkgB':
#>
#> sctrl
sctrl()
#> $A1
#> [1] 1
#>
#> $A2
#> [1] 2
#>
#> $B1
#> [1] 3
#>
#> $B2
#> [1] 4
#>
#> $C1
#> [1] 5
#>
#> $C2
#> [1] 6
Created on 2021-01-05 by the reprex package (v0.3.0)
So far so good. However,
library(pkgB)
loadNamespace("pkgC")
#> <environment: namespace:pkgC>
sctrl()
#> $A1
#> [1] 1
#>
#> $A2
#> [1] 2
#>
#> $B1
#> [1] 3
#>
#> $B2
#> [1] 4
Created on 2021-01-05 by the reprex package (v0.3.0)
So, just loading the namespace without masking wouldn't do the trick.
I guess the solution might be to have statnet.common
updating function to signal back to the clients that they need to resynchronise their copies of sctrl()
. I guess add_sctrl()
could have another argument for a callback function to do that and also export a function that makes the process simple.
OK, with the other two solved (I think), I can take a crack at this.
OK, I think I've got it. Same deal as before: pkgA_1.0.tar.gz, pkgB_1.0.tar.gz, pkgC_1.0.tar.gz.
I am able to complete under ESS, R CLI, and Rstudio. unloadNamespace()
-ing the package might not be sufficient to remove its arguments from tab completion, however.
It should also be possible to write a simple utility that takes a bunch of control functions and constructs a list of their arguments for passing to update_sctrl()
.
@krivit I'm trying these test packages out and can't seem to get it to work. Which one is supposed to mimic statnet.common?
FYI I've created https://github.com/statnet/statnet-options with @krivit's test packages to play around with the solutions.
Using the repo above I'm seeing:
source("install.R", echo=TRUE)
#>
#> > install.packages("pkgA", lib="lib", repos = NULL, type = "source")
#>
#> > install.packages("pkgB", lib="lib", repos = NULL, type = "source")
#>
#> > install.packages("pkgC", lib="lib", repos = NULL, type = "source")
withr::with_libpaths(
"lib",
{
library(pkgB)
sctrl()
}
)
#> list()
This is the correct behaviour. stctrl()
should return an empty list. Take a look at the named arguments of the function (via args(stctrl)
) or type stctrl(
into the R interface of your choice and see what arguments it suggests.
From https://github.com/statnet/statnet.common/issues/14#issuecomment-754328252 I was expecting the list of default values (?)
That was an earlier prototype. The current one only assigns non-missing arguments.
And is this the intention? I thought we want a list of all the options.
We want a list of arguments that eventually get gathered into a control.*()
hierarchy (implemented in #13) and then evaluated as a call (implemented in #12).
So sctrl()
will be a layer on top of control.*()
, not a replacement?
Yes. It would be used in place of list()
to provide control arguments.
Well, the good news is that the functionality is there. The bad news is that it raises a whole bunch of check warnings, only some of which are avoidable. I think it might be worth e-mailing R package development list to check with CRAN that it's kosher.
Try it out in the sctrl_test
branch in public ergm
.
That created some unexpected R CMD check
warnings in ergm
:
* checking R code for possible problems ... NOTE
sctrl: no visible global function definition for ‘control.san’
sctrl : <anonymous>: no visible global function definition for
‘network.size’
sctrl: no visible global function definition for ‘control.logLik.ergm’
Undefined global functions or variables:
control.logLik.ergm control.san network.size
[...]
* checking Rd \usage sections ... WARNING
Undocumented arguments in documentation object 'ergm-reexports'
‘drop’ ‘init’ ‘init.method’ ‘main.method’ ‘force.main’ ‘main.hessian’
‘checkpoint’ ‘resume’ ‘MPLE.max.dyad.types’ ‘MPLE.samplesize’
[...]
‘SAN.prop.weights’ ‘SAN.prop.args’ ‘SAN.packagenames’
‘SAN.ignore.finite.offsets’ ‘myname’ ‘clinfo’
The first problem is that the default arguments involve function calls that are visible to ergm
but not visible to statnet.common
. I think this can be fixed by either not storing default arguments (which some completion tools might be able to use) or pruning those that are not atomic.
The second problem is that when ergm
is loaded, sctrl()
does in fact have those arguments, so the check expects them to be documented. One option is to just dump them into the help file without description and tell the user to look up the corresponding control functions. In principle, we could even do something like organise them by which control functions they belong to, but I am afraid that someone (i.e., not me) will have to edit that manually.
I'm wondering whether option (1) from https://github.com/statnet/statnet.common/issues/14#issuecomment-754273962 would actually attain the same goals and at the same time be much simpler and R CMD check
-compliant out of the box. I have a feeling that this sctrl()
solution involves a lot of hacking and buys us very little.
I don't like the idea of having to update statnet.common
every time that any package that uses the control.*()
UI is updated.
I don't have a problem with that update. It's what all the packages depend on...
Updated. It looks like:
sctrl()
arguments to documentation is forward, since R CMD check
tells you exactly what they are, at which point it's a matter of copy-paste and search-and-replace.I think the current setup (in statnet.common
and ergm@sctrl_test
) is workable. Can y'all test it and tell me if everything works as intended?
Now implemented in statnet.common
, ergm@sctrl_test
, and ergm.ego@sctrl_test
. Could someone please test them out?
@sgoodreau , @martinamorris , @mbojan , before I merge these changes into ergm
and ergm.ego
, and apply them to other packages, I would like to confirm that they work for at least one person other than myself.
Installed and tested. My observations involve a bit of role-playing an unsuspecting user.
Loaded ergm
:
sctrl()
is re-exported from statnet.common
and there is no true documentation of sctrl()
in ergm
)drop
argument is. So I call ?sctrl
which opens the man page of re-exported functions (no information about the arguments). In the bottom of the page we write that sctrl()
is in fact statnet.common::sctrl()
. So I ask ?statnet.common::sctrl
which opens the man page entitled "... utility to facilitate argument completion ...". Still no information about the argument documentation nor any link where it is to be found.Restarted R, loaded statnet.common
:
...
as it is the only argument of sctrl()
R knows about.In summary: I think sctrl()
implements a clever idea. I am however not exactly convinced that it buys us (and the users) the convenience proportional to the added complexity, counter-intuitiveness and the problems above. In a more standard approach a single statnet.common::sctrl()
would be defined with all the arguments ever needed by any Statnet package using the control options. It will have to be updated if new control options are needed to any of the "child" packages.
@mbojan , thanks for testing it. There might be a way to make ?sctrl
more informative, by rendering the pages dynamically, with the https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Dynamic-pages API. That is, R code can be executed and results inserted into the Rd file at rendering time. The package generics
uses that to provide a real-time list of all methods implemented for a particular generic.
Similarly, since we are already tracking which package introduced which arguments, we can produce a real-time "mapping" of arguments to packages (and, with a slight API change, to control functions).
@mbojan , here you go. Now, ?sctrl
has a dynamically updated list of parameters indexed by their packages and functions. If you only load statnet.common
, then there won't be any parameters, since there aren't any control functions loaded.
It does not seem to work for me @krivit . I installed both ergm
and statnet.common
from the HEAD
s, library()
ied ergm
and on ?sctrl
the "Currently loaded" section is empty. Moreover, after library(statnet.common)
things did not change and I'm getting:
sctrl(drop=FALSE)
## Error in .Primitive("missing")(NULL) : invalid use of 'missing'
BTW I noticed that ergm
stopped working if not attached (perhaps a separate issue?):
library(network)
#> network: Classes for Relational Data
#> Version 1.16.1 created on 2020-10-06.
#> copyright (c) 2005, Carter T. Butts, University of California-Irvine
#> Mark S. Handcock, University of California -- Los Angeles
#> David R. Hunter, Penn State University
#> Martina Morris, University of Washington
#> Skye Bender-deMoll, University of Washington
#> For citation information, type citation("network").
#> Type help("network-package") to get started.
data(faux.magnolia.high, package="ergm")
fit <- ergm::ergm(faux.magnolia.high ~ edges)
#> Error in get(myctrlname, mode = "function"): object 'control.::' of mode 'function' was not found
@mbojan , thanks for trying it out!
.Primitive("missing")
thing has been fixed.sctrl()
before loading packages such as ergm
or ergm.ego
, which export control.*()
functions that let sctrl()
populate both its own arglist and its help. In particular, testing it against network
won't do anything. I've added a warning for unrecognised arguments.sctrl()
-compatible code in ergm
and ergm.ego
is in the sctrl_test
branches.Can you please try the following?
install_github("statnet/statnet.common@master")
install_github("statnet/ergm@sctrl_test")
install_github("statnet/ergm.ego@sctrl_test")
data(faux.magnolia.high, package="ergm")
fit <- ergm::ergm(faux.magnolia.high ~ edges)
library(ergm)
help("sctrl",package="statnet.common") # reexport help has not yet been updated.
library(ergm.ego)
help("sctrl",package="statnet.common") # more args
Yep all works now :rocket:
@martinamorris , @sgoodreau , @drh20drh20 , can I get one more confirmation that this works before I merge?
Thanks for the step-by-step instructions in https://github.com/statnet/statnet.common/issues/14#issuecomment-778678558. Without them, this thread has become so esoteric that I would not have been able to provide the requested confirmation.
The code in https://github.com/statnet/statnet.common/issues/14#issuecomment-778678558 works as advertised. In addition, I get super-helpful completion suggestions in RStudio when, for instance, I load both ergm
and ergm.ego
and then begin a line with ergm(control=contro
then hit the tab key. Very nice work!
One caveat: I do not find this description
in the sctrl
documentation helpful, and I doubt many novice users would either:
In and of itself, ‘sctrl’ copies its named arguments into a list.
However, its argument list is updated dynamically as packages are
loaded, as are those of its reexports from other packages. This is
done using an API provided by helper functions. (See ‘API?sctrl’.)
Would it instead be possible to explain the big picture, i.e., what sctrl is trying to do, and (if possible in plain language) how?
@drh20drh20 , absolutely. Someone should certainly do that. Any volunteers? ;-)
Before I merge the changes into the other packages and close this ticket, I want to make absolutely sure that sctrl()
is the name we want, because once we release, we are going to be stuck with it. The requirements:
sctrl
sounds ugly to me; can't say why.)Some other candidates:
sctrl()
: Statnet ConTRoLsctl()
: Statnet ConTroLsnctrl()
: StatNet ConTRoLstctrl()
: STatnet ConTRoLcontrol()
scontrol()
: Statnet CONTROLstcontrol()
: STatnet CONTROLsncontrol()
: StatNet CONTROLscl()
: Statnet ControL or Statnet Control Liststcl()
: STatnet ControL or STatnet Control Listsncl()
: StatNet ControL or StatNet Control Listsnc()
: StatNet Controlsnlist()
: StatNet List
Edit: Updated some mnemonics.
Edit 2: Added snlist
.
@mbojan , @martinamorris , @sgoodreau , @drh20drh20 , @chad-klumb , @CarterButts , @handcock
for brevity, i still like 1
in order of alternatives:
8 (most like "control", sn for statnet) 3 (brevity, sn for statnet)
i'm curious about 5 (control
). obviously nice in terms of simplicity. does this make it somehow an "S3" method (good)? is that possible? or are we masking some other base R functionality (bad)?
One valuable capability that was lost with implementation of #12---that
list()
can be used instead ofcontrol.*()
---is tab completion for the arguments of thecontrol.*()
functions. This will be even worse if #13 is implemented.The proposal is to work around this is as follows:
sctl()
(for Statnet ConTroL).sctl()
that takes all of the arguments of all of thecontrol.*()
functions defined in the package.sctl(
", they'll get a list of everything any of the current functions will accept.Any preferences and suggestions as to the function name?