Closed by markean 1 year ago
Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.
:rocket:
Editor check started
:wave:
git hash: 1b9bbebd
Package License: GPL (>= 2)
srr

This package is in the following category:
:heavy_check_mark: All applicable standards [v0.1.0] have been documented in this package (86 complied with; 30 N/A standards)
Click to see the report of author-reported standards compliance of the package, with links to associated lines of code. It can be re-generated locally by running the `srr_report()` function from within a local clone of the repository.
The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.
|type |package | ncalls|
|:----------|:--------------|------:|
|internal |base | 146|
|internal |melt | 66|
|imports |methods | 83|
|imports |stats | 66|
|imports |graphics | 8|
|imports |Rcpp | NA|
|suggests |covr | NA|
|suggests |ggplot2 | NA|
|suggests |knitr | NA|
|suggests |microbenchmark | NA|
|suggests |rmarkdown | NA|
|suggests |spelling | NA|
|suggests |testthat | NA|
|suggests |withr | NA|
|linking_to |BH | NA|
|linking_to |dqrng | NA|
|linking_to |Rcpp | NA|
|linking_to |RcppEigen | NA|
Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(
base

list (15), q (14), attr (13), warning (13), numeric (12), c (7), nrow (7), integer (6), length (5), print (5), as.integer (4), match.call (4), names (4), ncol (4), tryCatch (4), do.call (3), if (3), mean (3), as.vector (2), cbind (2), logical (2), match (2), rownames (2), as.matrix (1), colMeans (1), is.null (1), matrix (1), NROW (1), rbind (1), sqrt (1), sum (1), t (1), table (1)

methods

setGeneric (36), setMethod (36), setClass (11)

melt

error (13), el_control (8), validate_optim (4), calibrate (3), compute_EL (3), validate_weights (3), compute_generic_EL (2), get_max_threads (2), test_GLM (2), test_LM (2), validate_family (2), compute_bootstrap_calibration (1), compute_confidence_intervals (1), compute_confidence_region (1), compute_ELD (1), el_eval (1), el_glm (1), el_lm (1), el_mean (1), el_sd (1), get_rank (1), test_hypothesis (1), test_multiple_hypotheses (1), validate_alpha (1), validate_b (1), validate_calibrate (1), validate_cv (1), validate_hypotheses (1), validate_hypothesis (1), validate_keep_data (1), validate_level (1), validate_lhs (1), validate_x (1)

stats

formula (14), df (8), offset (6), optim (6), na.action (5), weights (5), family (4), pchisq (3), step (3), contrasts (2), model.response (2), model.weights (2), sd (2), glm.fit (1), pf (1), qchisq (1), qf (1)

graphics

par (8)
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
The package has:

- code in C++ (53% in 15 files) and R (47% in 29 files)
- 1 author
- 2 vignettes
- 1 internal data file
- 4 imported packages
- 59 exported functions (median 4 lines of code)
- 120 non-exported functions in R (median 7 lines of code)
- 91 C++ functions (median 14 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages. The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html). The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.
|measure | value| percentile|noteworthy |
|:------------------------|-----:|----------:|:----------|
|files_R | 29| 88.8| |
|files_src | 15| 95.4| |
|files_vignettes | 2| 85.7| |
|files_tests | 17| 95.3| |
|loc_R | 1629| 80.0| |
|loc_src | 1827| 74.0| |
|loc_vignettes | 129| 33.7| |
|loc_tests | 1243| 89.7| |
|num_vignettes | 2| 89.2| |
|data_size_total | 1311| 61.2| |
|data_size_median | 1311| 65.8| |
|n_fns_r | 179| 88.1| |
|n_fns_r_exported | 59| 90.2| |
|n_fns_r_not_exported | 120| 87.0| |
|n_fns_src | 91| 78.1| |
|n_fns_per_file_r | 3| 55.1| |
|n_fns_per_file_src | 6| 55.6| |
|num_params_per_fn | 2| 11.9| |
|loc_per_fn_r | 7| 16.0| |
|loc_per_fn_r_exp | 4| 4.3|TRUE |
|loc_per_fn_r_not_exp | 7| 18.0| |
|loc_per_fn_src | 14| 46.6| |
|rel_whitespace_R | 8| 59.8| |
|rel_whitespace_src | 5| 49.5| |
|rel_whitespace_vignettes | 8| 6.5| |
|rel_whitespace_tests | 8| 72.2| |
|doclines_per_fn_exp | 34| 40.2| |
|doclines_per_fn_not_exp | 0| 0.0|TRUE |
|fn_call_network_size | 253| 90.8| |

---
Click to see the interactive network visualisation of calls between objects in package
#### 3. `goodpractice` and other checks

#### 3a. Continuous Integration Badges

[![check-standard.yaml](https://github.com/markean/melt/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/markean/melt/actions)

**GitHub Workflow Results**

| id|name |conclusion |sha | run_number|date |
|----------:|:--------------------------|:----------|:------|----------:|:----------|
| 2673111985|pages build and deployment |success |6ac77c | 46|2022-07-14 |
| 2673079345|pkgcheck |failure |1b9bbe | 51|2022-07-14 |
| 2673079344|pkgdown |success |1b9bbe | 49|2022-07-14 |
| 2673079342|R-CMD-check |success |1b9bbe | 71|2022-07-14 |
| 2673079354|test-coverage |success |1b9bbe | 153|2022-07-14 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/)

R CMD check generated the following note:

1. checking installed package size ... NOTE
   installed size is 77.8Mb
   sub-directories of 1Mb or more: libs 77.2Mb

R CMD check generated the following check_fail:

1. rcmdcheck_reasonable_installed_size

#### Test coverage with [covr](https://covr.r-lib.org/)

Package coverage: 99.78

#### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp)

The following function has cyclocomplexity >= 15:

function | cyclocomplexity
--- | ---
el_glm | 18

#### Static code analyses with [lintr](https://github.com/jimhester/lintr)

[lintr](https://github.com/jimhester/lintr) found the following 5 potential issues:

message | number of times
--- | ---
Avoid library() and require() calls in packages | 4
Lines should not be more than 80 characters. | 1
|package |version |
|:--------|:---------|
|pkgstats |0.1.1.3 |
|pkgcheck |0.0.3.76 |
|srr |0.0.1.167 |
This package is in top shape and may be passed on to a handling editor.
Hi @markean - I just wanted to provide a brief update here. I apologize for the delay in next steps here; it's not for lack of excitement to kick off this review! I hope to have a handling editor ready to assign next week.
Thanks for the update @emilyriederer!
@ropensci-review-bot assign @Paula-Moraga as editor
Assigned! @Paula-Moraga is now the editor
@markean , I'm delighted to share that @Paula-Moraga will serve as editor for this package. She is away this week but is available to commence next week. We will be in more touch soon!
@emilyriederer, thanks for the news!
Hi @markean. I am pleased to handle this package. I will look for reviewers.
@ropensci-review-bot seeking reviewers
I'm sorry @Paula-Moraga, I'm afraid I can't do that. That's something only editors are allowed to do.
Hi @Paula-Moraga. Thank you for your time and effort. Please let me know if there is anything I can help with in the review process. @emilyriederer Could you update the label?
Done! I'll ask the bot team on Slack if they know why @Paula-Moraga couldn't do it herself since she is, in fact, the editor!
@ropensci-review-bot assign @pchausse as reviewer
@pchausse added to the reviewers list. Review due date is 2022-08-25. Thanks @pchausse for accepting to review! Please refer to our reviewer guide.
rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.
@pchausse: If you haven't done so, please fill this form for us to update our reviewers records.
@ropensci-review-bot set due date for @pchausse to 2022-09-05
Review due date for @pchausse is now 05-September-2022
Hi @pchausse. Thank you so much for reviewing our package. Please let me know if you have any questions. A draft version of the pdf vignette in the vignettes folder might be of help.
@ropensci-review-bot assign @awstringer1 as reviewer
@awstringer1 added to the reviewers list. Review due date is 2022-08-29. Thanks @awstringer1 for accepting to review! Please refer to our reviewer guide.
rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.
@awstringer1: If you haven't done so, please fill this form for us to update our reviewers records.
Hi @awstringer1! Thank you for reviewing the package. Please let me know if you have any questions. A draft version of the pdf vignette in the vignettes folder might be of help.
@pchausse and @awstringer1: Owing to an external request, a newer version of the package will be released to CRAN in August with some minor updates. It will not interfere with the review process, but I will tag it after the release. The reviewer comments will then be reflected in the package in a subsequent release.
I am not sure if this is something the authors did, or something the rOpenSci template does, but the code line references in the "srr report" do not seem to match up with the locations of the tags in the roxygen documentation. They are a few lines off. This is mostly fine, but sometimes I cannot find the relevant tag, e.g. for "Standards in on line#306 of file R/AllGenerics.R:". Can the authors comment? Thank you.
@awstringer1 That's because the repo has been updated since the report linked to above was generated. As stated in the automated checks, you can - and in this case should - generate a local, up-to-date version of the report by running `srr::srr_report(path)`, where `path` is the location of your local clone of melt. By default that will open the full report in your web browser, in which all reported line numbers should match the current state of the repo.
Result of running `srr_report()` on a fresh install of the development version of melt on my machine:
```r
> srr::srr_report(file.path(.libPaths(), 'melt'))
ℹ Loading melt
Writing 'NAMESPACE'
Writing 'NAMESPACE'
Loading required namespace: rmarkdown
Error in flist[sapply(blocks, inherits, "try-error")] :
  invalid subscript type 'list'
In addition: Warning messages:
1: Failed to load at least one DLL: must specify DLL via a “DLLInfo” object. See getLoadedDLLs()
2: In setup_ns_exports(path, export_all, export_imports) :
  Objects listed as exports, but not present in namespace: el_control, el_eval, el_glm, el_lm, el_mean, el_sd
```
I got a different error when trying on the CRAN version. Session information:

```r
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.5
```
@awstringer1 The `path` should be your local clone of the package, not the installed path. I can generate the report with `srr::srr_report(path)` after cloning. I guess that is what @mpadge meant above.
@pchausse and @awstringer1: v1.7.0 has been released to CRAN. No changes were made with respect to the srr tags. The code in the repository will be updated after the reviewers' comments. Thank you.
Here is my review, I hope it is helpful!
The following standards currently deemed non-applicable (through tags of `@srrstatsNA`) could potentially be applied to future versions of this software: (Please specify)
Please also comment on any standards which you consider either particularly well, or insufficiently, documented.
For packages aiming for silver or gold badges:
The package seems to have been developed with future general use in mind. The authors have provided an interface that is clearly intended to be used beyond the examples considered in their associated preprint. A large number of the standards described in the rOpenSci review materials seem to have been considered by the authors when designing their software.
Overall, the authors appear to have convincingly argued that the package meets the rOpenSci standards. However, many statements in the authors' `@srrstats` documentation tags contain incomplete references to the package documentation, which precludes a more thorough review. See below for a non-exhaustive list of examples. I have the following comments about the authors' justifications:
"@srrstats {RE1.4} Some of the core distributional assumptions are documented.": this is potentially confusing to users. Either all of the regularity conditions should be clearly documented, or a link to a specific section in a paper where these conditions are clearly documented should be given. Although arguably common in the literature, the phrase "under some regularity conditions" is incomplete and not meaningful, and I suggest its removal.
"@srrstats {G2.1, G2.1a} Assertions on types of inputs are clarified throughout the package documentation.": it would be helpful to the reviewers to give specific references to the package documentation. There are a number of similar "throughout the package documentation" statements that render it challenging to assess compliance with standards, beyond a general sense of such.
Possible typo: "na.cation()" in el_glm.R, line 64.
"@srrstats {RE1.0, RE1.1} Formula interface is used to the `formula` argument, and how it is converted to a matrix input is documented as well.": a reference to the function and section of documentation where this is located would better facilitate review.
"@srrstats {RE4.11} `logLik()` method extracts the empirical log-likelihood value which can be interpreted as a measure of goodness-of-fit.": more detail could be provided about how this is to be used as a measure of goodness of fit, perhaps with a link to where this is discussed in the package documentation or the preprint?
"@srrstatsNA {G1.5} The package does not have any associated publication.": is it appropriate to cite the preprint here (this might be a question for the Editor)?
_"@srrstatsNA {RE2.3} It is beyond the scope of estimating functions that the package considers. There seems to be no simple way to pass this information to the internal routines. Users can use `el_eval()` to manually construct estimating functions with the offset included, but any further method is not applicable to the output of `el_eval()`."_: this appears to be in response to standard RE2.3, which states: "RE2.3 Where applicable, Regression Software should enable data to be centred (for example, through converting to zero-mean equivalent values; or to z-scores) or offset (for example, to zero-intercept equivalent values) via additional parameters, with the effects of any such parameters clearly documented and tested." The present package uses `lm` and `glm`, as well as a standard formula interface, all of which facilitate centering and offsets, so it is not clear why these tasks are out of scope for the present package.
The package includes all the following forms of documentation:
I do not see such a statement. I think this statement could be helpful: a short paragraph describing why empirical likelihood methods are useful in linear and generalized linear models could be informative to the user, since (a) empirical likelihood is a somewhat "specialized" topic and (b) there are other methods available for doing hypothesis tests in linear and generalized linear models, and such methods also rely on limiting Chi-square distributional arguments. It is at least not trivially clear why empirical likelihood is useful and why we'd want a package for it, so I think being very up front about this could better motivate the package.
I see no such statement.
The documentation is almost there, but I am going to leave this unchecked for now. I think adding more detailed "Details" sections to all core functions, even if this involves a lot of copying and pasting, is a good idea; see [https://r-pkgs.org/man.html](https://r-pkgs.org/man.html). For example, `?el_lm` contains very detailed documentation, but `?el_mean` does not. It is helpful to users to have everything on one page, even if the material is repeated across many functions' documentation.
The core algorithms are implemented in C++. I will do my best to provide appropriate comments here, but I am not an expert on C++.
EL.cpp line 103: is there justification for this step-halving mechanism? It is only applied once within the loop, so how can it ensure that the function value is increased? Would a `while` loop be more appropriate, or perhaps the use of an algorithm that guarantees the desired property? Related: how often does the Newton step here not increase the function value in practice? If often, perhaps Newton's method is unsuitable here, and a trust-region procedure could be considered?
EL.cpp line 181: is explicitly creating a projection matrix necessary? The combination of this being a dense matrix and relying on explicitly calling `inverse` (which may have performance and stability implications) leads me to question whether operations involving this matrix could instead be coded more efficiently. For example, if we have P = I - X(X X^T)^-1 X^T, and you want to compute a matrix/vector product Pv, you could write Pv = v - Xb, where (XX^T)b = X^Tv and b is obtained by solving this system. This avoids the need to ever compute and store an explicit inverse.
What is the purpose of the PsuedoLog class? Commenting this code (lines 347--394) could help facilitate a more thorough review. This looks like a generic mathematical operation; is there no pre-existing implementation to rely on?
The testing appears thorough, I have no comments.
The visualizations provided seem to serve their stated purpose. No comments.
External: the package is designed for the specific task of computing empirical likelihood ratio tests and intervals for the parameters of linear and generalized linear models. The package is set up to take input similar to `lm` and `glm` and then call the associated `.fit` function from core R. Some suggestions for broadening scope:

- Could methods for objects of class `lm` and `glm` (and so on) be provided, so that the package provides its output to users who have already fit their models using these alternative methods, and hence can compare them?
- The output of `el_lm` is an object of class `LM`, a new class from the melt package. Could this output be made to inherit from the core R `lm` class, to use the associated functionality with which most R users are familiar? Or, relatedly, could the package provide a straightforward manner by which to fit an `lm` object from a fitted `LM` object? Again, thinking about a user who wishes to compare the two. Similar comments for `glm`.

Internal: see comments about algorithms above. In general the use of C++ for the numerical analysis seems appropriate, notwithstanding the specific comments above.
Estimated hours spent reviewing: 8
Hi @awstringer1, thank you for your review and feedback. We have made many changes to the main branch based on your comments, and the package is in better shape than before. Please find the responses below.
- "@srrstats {RE1.4} Some of the core distributional assumptions are documented.": this is potentially confusing to users. Either all of the regularity conditions should be clearly documented, or a link to a specific section in a paper where these conditions are clearly documented should be given. Although arguably common in the literature, the phrase "under some regularity conditions" is incomplete and not meaningful, and I suggest its removal.
The phrase is removed. Interested users are directed to the references for more details.
- "@srrstats {G2.1, G2.1a} Assertions on types of inputs are clarified throughout the package documentation.": it would be helpful to the reviewers to give specific references to the package documentation. There are a number of similar "throughout the package documentation" statements that render it challenging to assess compliance with standards, beyond a general sense of such.
It is documented in the `@param` sections of el_mean.R, where the relevant srr tags are attached as well. These tags can also be located in any other user-facing functions of the package. All the user-facing functions state the required types of inputs explicitly.
- Possible typo: "na.cation()" in el_glm.R, line 64.

Corrected to `na.action()`.
- "@srrstats {RE1.0, RE1.1} Formula interface is used to the `formula` argument, and how it is converted to a matrix input is documented as well.": a reference to the function, and section of documentation where this is located would better facilitate review.

It is documented in the `@details` section of el_lm.R, where the relevant srr tags are attached as well. An explanation of how the `formula` input relates to the `model.frame` and `model.matrix` objects is not provided, since there is nothing special about the usage of `formula` in the package. Interested users can refer to the documentation of `model.frame()` or `model.matrix()` in the stats package.
- "@srrstats {RE4.11} `logLik()` method extracts the empirical log-likelihood value which can be interpreted as a measure of goodness-of-fit.": more detail could be provided about how this is to be used as a measure of goodness of fit, perhaps a link to where this is discussed in the package documentation or the preprint?

The srr tag has been moved to logL-methods.R. `logL()` and `logLR()` extract the empirical log-likelihood (ratio) at a specified parameter value. The empirical likelihood problem can be viewed as allocating probabilities to each observation while minimizing a goodness-of-fit criterion (the Cressie–Read discrepancy) subject to constraints. Hence, the computed empirical log-likelihood (ratio) values can also be interpreted as a goodness-of-fit measure. `@references` sections are added to the package documentation in AllGenerics.R (lines 404--406, 441--443).
- _"@srrstatsNA {RE2.3} It is beyond the scope of estimating functions that the package considers. There seems to be no simple way to pass this information to the internal routines. Users can use `el_eval()` to manually construct estimating functions with the offset included, but any further method is not applicable to the output of `el_eval()`."_: this appears to be in response to standard RE2.3, which states: "RE2.3 Where applicable, Regression Software should enable data to be centred (for example, through converting to zero-mean equivalent values; or to z-scores) or offset (for example, to zero-intercept equivalent values) via additional parameters, with the effects of any such parameters clearly documented and tested." The present package uses `lm` and `glm`, as well as a standard formula interface, all of which facilitate centering and offsets, so it is not clear why these tasks are out of scope for the present package.

`el_lm()` and `el_glm()` now accept an optional `offset` argument. With offsets, the estimating functions were over-identified in the previous package design, and over-identified estimating equations are beyond the scope of the package. All the related C++ functions have been rewritten to accommodate offsets. The functionality with `offset` has been tested internally, and it works with the other methods of the package. However, it is still at an experimental stage, and further changes could be made in a future release.
- [ ] A statement of need clearly stating problems the software is designed to solve and its target audience in README
I do not see such a statement. I think this statement could be helpful: a short paragraph describing why empirical likelihood methods are useful in linear and generalized linear models could be informative to the user, since (a) empirical likelihood is a somewhat "specialized" topic and (b) there are other methods available for doing hypothesis tests in linear and generalized linear models, and such methods also rely on limiting Chi-square distributional arguments. It is at least not trivially clear why empirical likelihood is useful and why we'd want a package for it, so I think being very up front about this could better motivate the package.
The Overview section of the README file has been expanded to address the comment.
- [ ] Community guidelines including contribution guidelines in the README or CONTRIBUTING
I see no such statement.
CONTRIBUTING is in the .github folder, since it is a GitHub-specific file.
The documentation is sufficient to enable general use of the package beyond one specific use case
The documentation is almost there, but I am going to leave this unchecked for now. I think adding more detailed "Details" in all core functions, even if this involves a lot of copy and pasting, is a good idea; see [https://r-pkgs.org/man.html](https://r-pkgs.org/man.html). For example, `?el_lm` contains very detailed documentation, but `?el_mean` does not. It is helpful to users to have everything on one page, even if the material is repeated across many functions' documentation.

A `@details` section has been added to el_mean.R.
EL.cpp line 103: is there justification for this step-halving mechanism? It is only applied once within the loop, so how can it ensure that the function value is increased? Would a `while` loop be more appropriate, or perhaps use of an algorithm that guarantees the desired property? Related: how often does the Newton step here not increase the function value in practice? If often, perhaps Newton's method is unsuitable here, and a Trust Region procedure could be considered?

Much of the code for the computation is based on Owen (2001, chapter 12). Here, `set_el()` performs different tasks depending on whether the convex hull constraint is satisfied or not. When the convex hull constraint is satisfied, the objective function is strictly concave without any constraint. Thus, the Newton–Raphson method converges quadratically to the unique solution. Step-halving is a safeguard against possible numerical instability.

When the convex hull constraint is violated, the PsuedoLog class (explained below) increases the function value until it reaches a set threshold, or until the algorithm hits the maximum iteration number. Since the optimization problem is infeasible in this case, a while loop for the step-halving would often run indefinitely. The underlying principle is that we do not check the convex hull constraint explicitly for the EL class. Indirectly inspecting the constraint from the output is much faster and more efficient.

Currently, the package only considers infinitely differentiable estimating functions, and evaluating empirical likelihood itself with the Newton–Raphson method is not a challenging task. Other slower but safer methods, such as quasi-Newton or Levenberg–Marquardt, can be considered in the future when needed.
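For intuition, the kind of Newton iteration with a step-halving safeguard under discussion can be sketched for the simplest case, the mean. This is an illustration following Owen's textbook formulation, not the package's C++ code; the data values and the helper name `el_logratio_mean` are made up for the example. The dual objective g(lam) = sum(log(1 + lam * z_i)), z_i = x_i - mu, is strictly concave when mu lies inside the convex hull, so the halving loop only guards against overshooting:

```python
import math

# Illustrative sketch (not melt's implementation): Newton-Raphson with a
# step-halving safeguard for the one-dimensional EL dual problem at the
# mean.  Returns the empirical log-likelihood ratio statistic 2*g(lam*).

def el_logratio_mean(x, mu, tol=1e-10, max_iter=50):
    z = [xi - mu for xi in x]

    def evaluate(lam):
        # Returns (g, g', g''), or None if lam leaves the domain of log.
        g = gp = gpp = 0.0
        for zi in z:
            t = 1.0 + lam * zi
            if t <= 0.0:
                return None
            g += math.log(t)
            gp += zi / t
            gpp -= (zi / t) ** 2
        return g, gp, gpp

    lam = 0.0
    g, gp, gpp = evaluate(lam)
    for _ in range(max_iter):
        if abs(gp) < tol:
            break
        direction = -gp / gpp              # Newton ascent step (gpp < 0)
        step = 1.0
        cand = evaluate(lam + step * direction)
        while cand is None or cand[0] < g - 1e-14:
            step *= 0.5                    # step-halving safeguard
            cand = evaluate(lam + step * direction)
        lam += step * direction
        g, gp, gpp = cand
    return 2.0 * g

x = [0.2, 1.1, -0.4, 0.9, 0.3, 1.5, -0.1, 0.8]
assert abs(el_logratio_mean(x, sum(x) / len(x))) < 1e-8  # ratio is 1 at the sample mean
assert el_logratio_mean(x, 0.0) > 0.0                    # positive away from it
```

When mu lies outside the convex hull the inner `while` loop would indeed never find an acceptable full ascent step, which matches the response's point about the halving loop running indefinitely in the infeasible case.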
EL.cpp line 181: is explicitly creating a projection matrix necessary? The combination of this being a dense matrix and relying on explicitly calling `inverse` (which may have performance and stability implications) leads me to question whether operations involving this matrix could instead be coded more efficiently. For example, if we have P = I - X(X X^T)^-1 X^T, and you want to compute a matrix/vector product Pv, you could write Pv = v - Xb, where (XX^T)b = X^Tv and b is obtained by solving this system. This avoids the need to ever compute and store an explicit inverse.

The new `proj()` in helpers.cpp (lines 31--35) performs the same projection operation. It does not compute an explicit inverse. Below is a benchmark result of the same projection operation with a random 5 by 50 LHS matrix. The version without the inverse is much faster, and the gap will grow as the dimension of the matrix increases.
```
Unit: microseconds
        expr   min    lq     mean  median     uq      max neval cld
     inverse 9.491 9.906 11.25626 10.7715 12.307   45.142 10000   b
 w/o inverse 3.073 3.293  4.49282  3.5470  4.102 7289.781 10000  a
```
What is the purpose of the PsuedoLog class? Commenting this code (lines 347--394) could help facilitate a more thorough review. This looks like a generic mathematical operation; is there no pre-existing implementation to rely on?
A brief comment is added to EL.h (lines 129--137). Numerical computation of empirical log-likelihood ratio statistics relies heavily on a pseudo-log function. It makes the associated optimization problem easier to solve without affecting the solution when the convex hull constraint is satisfied. When the condition is violated, the pseudo-log function also prevents the statistic from being undefined (Inf) so that the algorithm can fail gracefully. The PsuedoLog class has the relevant member variables and static member functions needed for the computation. We have not found any pre-existing routines for these specific tasks.
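For readers unfamiliar with the device, here is a generic sketch of a pseudo-log of the kind described in Owen (2001); melt's exact threshold and constants may differ. Below a cutoff eps, log(x) is replaced by the quadratic that matches log's value and first two derivatives at eps, so the function stays finite for all inputs:

```python
import math

# Generic pseudo-log sketch (illustrative; not melt's exact constants).
# For x >= eps it is log(x); below eps it is the quadratic extension
# matching value, first and second derivatives at eps, so it is finite
# everywhere and twice continuously differentiable.

def pseudo_log(x, eps):
    if x >= eps:
        return math.log(x)
    return math.log(eps) - 1.5 + 2.0 * x / eps - x * x / (2.0 * eps * eps)

eps = 0.1
assert pseudo_log(0.5, eps) == math.log(0.5)    # unchanged above eps
assert math.isfinite(pseudo_log(-2.0, eps))     # finite where log is not
# One-sided first derivatives agree at eps (finite-difference check).
h = 1e-6
d_above = (math.log(eps + h) - math.log(eps)) / h
d_below = (pseudo_log(eps, eps) - pseudo_log(eps - h, eps)) / h
assert abs(d_above - d_below) < 1e-3
```

Because arguments of the form 1 + lam * z_i can become non-positive during optimization, this substitution is what lets the solver keep iterating and "fail gracefully" outside the convex hull, as described above.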
Could methods for objects of class `lm` and `glm` (and so on) be provided, so that the package provides its output to users who have already fit their models using these alternative methods, and hence compare them?

Providing empirical likelihood methods for the existing `lm` or `glm` classes directly would require dedicated functions like `summary_el()` or `confint_el()`. Otherwise, they would mask existing ones like `summary()` or `confint()`, which would not be desirable. Although `el_lm()` and `el_glm()` purposefully mimic the syntax and output of `lm()` and `glm()`, these functions perform significance tests, validate inputs, and prepare objects for the subsequent methods provided by the package. Since the options provided by `el_lm()` and `el_glm()` differ from those of `lm()` and `glm()`, we are not currently considering extending methods to the existing S3 classes.
The output of `el_lm` is an object of class `LM`, a new class from the melt package. Could this output be made to inherit from the core R `lm` class, to use the associated functionality with which most R users are familiar? Or related, could the package provide a straightforward manner by which to fit an `lm` object from a fitted `LM` object? Again, thinking about a user who wishes to compare the two. Similar comments for `glm`.

The idea of inheriting from the `lm` class was considered at an early stage of development, but it was abandoned for the following reasons:

- Many generics and methods that apply to `lm` are not applicable to `LM` because of the characteristics of empirical likelihood methods. These include, for example, `cooks.distance()`, `dfbeta()`, `extractAIC()`, `anova()`, `influence()`, `rstudent()`, and `vcov()`.
- For linear and generalized linear models, some methods would overlap with `lm` if `LM` inherited from `lm`, meaning that they would produce the same output. Examples include `coef()`, `resid()`, and `predict()`. Although these could be added, we strive to keep the package focused on testing and inference, not on estimation or prediction.
- Too many generics and methods are applicable to `lm`: over 30 from the base and stats packages alone.

We plan to add some of the methods to `LM` directly in a future release.
Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.
We are grateful for your insightful comments and suggestions on the package. Your name and role were added to the updated DESCRIPTION file. We tagged you in the NEWS file for the update on the projection operation. We would also like to include @pchausse when we receive comments from both reviewers.
:calendar: @awstringer1 you have 2 days left before the due date for your review (2022-08-29).
Many thanks @awstringer1 for your review! You can of course ignore the previous bot message :)
@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/550#issuecomment-1218524923 time 8
Logged review for awstringer1 (hours: 8)
Many thanks @markean for answering @awstringer1's comments so quickly. @awstringer1 can tell you whether he is satisfied with them or has additional comments. @pchausse will submit his review soon. Thanks all!
melt-review-chausse-exported.pdf
Here is my review. I was trying to make some of my points with R code, but I think it does not render here. I will also attach the pdf.
My review is based on Version 1.7.0 from CRAN.
[X] This package complies with a sufficient number of standards for a silver badge
[X] This grade of badge is the same as what the authors wanted to achieve
The following standards currently deemed non-applicable (through tags
of @srrstatsNA
) could potentially be applied to future versions of
this software:
G1.6 Software should include code necessary to compare performance
claims with alternative implementations in other R packages.: Other
packages compute empirical likelihood tests and confidence
intervals. We would like to see how the melt
package compares with
the others. For example, my momentfit (or gmm) package produces
the same type of confidence intervals or confidence areas for linear
models and the mean. The package
emplik now includes
regression models.
This is clearly not a package suited for a unique application. The algorithm to solve restricted empirical likelihood models is efficient and can be applied to many more models not yet implemented by the package. For example, the author seems to discard the idea of applying
the method to over-identified models (from the NOT TO DO section of
CONTRIBUTING.md), but it could easily be implemented since constrained
empirical likelihood is in fact over-identified. I could also see
other packages being based on melt
for specific applications.
"@srrstats {RE1.4} Some of the core distributional assumptions are
documented.": The list of regularity conditions is not that long. We
do see some distribution assumptions in the EL
documentation where
the EL method is presented for general estimating equations
$E[g(X,\theta)]=0$, but it would be important to add at least the
assumption on the variance of $g(X,\theta)$. Users using only
el_mean
or el_sd
should know what distribution assumptions are
required. For these two functions, the detail section is empty. At
least, we should see what the estimating equations are and then be
given a link to the EL documentation for more details.
"@srrstats {RE4.11} logLik()
method extracts the empirical
log-likelihood value which can be interpreted as a measure of
goodness-of-fit": It is really not clear how to interpret the
value. Can we have a more detailed explanation? First, it is the
empirical likelihood value of what? For example, if we estimate a
linear model:
set.seed(112233)
x <- rnorm(40); y <- rnorm(40)
fit <- el_lm(y~x)
print(logLik(fit), digits=10)
This is the log-likelihood of what? If I understand correctly (not clear from the documentation), it is not the one from the model that restricts the slope coefficient to be 0, which is what we have in the slot logl, or if we sum the log of the implied probabilities:
print(fit@logl, digits=10)
print(sum(fit@logp), digits=10)
After more testing, it seems to be the one from the unrestricted model for which the implied probabilities are all equal to $1/n$:
print(40*log(1/40), digits=10)
I think this needs to be explained in more detail.
"@srrstats {RE6.0} plot method is available for an ELD object that is returned by the eld() method.": The authors may consider adding some diagnostic-type plot methods for the EL (or any other) class as we have for lm objects. It is great to have a plot for eld() to detect outliers, but it would be even better I think if the plot was applied directly to the EL class. For example, a Cook's distance exists separately for lm, but we can plot it directly from lm objects by selecting which=4:
## clearly not the best method, but you see the idea
setMethod("plot", "EL", function(x, which = 1, ...) {
  if (which == 1)
    plot(eld(x))
  else
    cat("other diagnostic plots if any\n")
})
plot(fit, which = 1)
This is a very well designed package. The outputs of the different estimating or testing methods are easy to read and the algorithms seem to be quite efficient. The following are mostly suggestions that in my view could improve the package.
The package includes all the following forms of documentation:
I think we should see a more detailed presentation of why one would need to use the melt package to perform hypothesis tests or to construct confidence intervals. For example, when does the empirical likelihood start to produce confidence intervals with better coverage than lm does for linear regression? In the example section of el_lm, we see an application with iid and normally distributed variables. In this case, even when the sample size is very small, a least squares confidence interval has exact coverage while EL does not. So why should we use the melt package?
There is room for improvement here.
optim slot

We see that a slot optim exists in the LM object created by el_lm through the CEL-class. But it is not clear from the documentation what these values are.
set.seed(5649)
df <- data.frame(y = rnorm(50), x = rnorm(50))
fit <- el_lm(y ~ x, df)
slotNames(fit)
From the CEL-class documentation, it says that par
is $\hat{\theta}$
(subject to the constraints which we do not have here) and lambda
is
$\hat{\lambda}$. But it is not:
coef(fit)
fit@optim$par
fit@optim$lambda
What is par? What is lambda? Since the model is just-identified, the latter should be a vector of zeroes when there are no constraints. Are they the estimates when we restrict all coefficients (except for the intercept) to be zero? Can that be added to the documentation? If we explore further, we learn that it seems to be the case:
set.seed(5649)
df2 <- data.frame(y = rnorm(50), x = rnorm(50), z = rnorm(50))
fit2 <- el_lm(y ~ x+z, df2)
fit2@optim$par
I think it is important information that may be useful for some users.
In the el_lm
and el_glm
(maybe others?) it is written in the
details that the test for the null
$H_0:~\theta_1=\theta_2=\cdots=\theta_{p-1}=0$ is in optim and the
and the
test parameter by parameter is in sigTests
, but there is no test
result in optim
. It only includes the parameters, the lambdas, the
number of iterations and the convergence code. Also, why is it
important to return the implied probabilities (the logp
slot) and
the lambdas for the joint test and not for the individual ones?
The slots are documented in EL, but it is not easy to get it for other
objects. For example, in the el_glm
documentation, the Value is "An
object of class GLM". If we click in it, we only see that it inherits
from LM. If we click on it, we see some slots, but not all. It
inherits from CEL, so we click on it and only the optim
slot is
defined. We finally get the description of all slots by clicking on
EL. It is not an easy process for someone like me who only works in terminals. It requires calling help() several times. It would be helpful if the slots were defined for all objects that inherit from EL.
Also, it is not always clear what the slots are. In EL, we see "A
numeric vector of the log probabilities obtained from empirical
likelihood" for the logp
slot. But which probabilities? I figured
out for el_lm
that it is for the restricted model when all
parameters are 0 except the intercept. It should be specified. It is
the same for logl
, loglr
, etc. This is something we see across the
different help files. It is another reason the author should define
all slots for all functions that return an EL object.
I would expect a little more details about the EL method. It is well
presented in the EL documentation for general estimating equations
$g(X,\theta)$, but I think it would be helpful for the method to also
be explained in el_lm
, el_glm
, el_sd
, el_mean
and even
el_eval
. The EL method does not need to be repeated, but at least we
should see the specific estimating equation used for each model and a
link to EL for the details. For example, it is not clear to me which
estimating equation is used for the standard deviation or for
generalized linear models.
It is not clear what the purpose of el_eval is. It says that it is for a general estimating equation, but the input is a matrix, so it is an estimating equation without parameters. The exact same result can be obtained with el_mean: the following two produce identical
optim
, logp
, logl
and loglr
:
x <- matrix(rnorm(200), 50, 4)
fit <- el_eval(x)
fit2 <- el_mean(x,rep(0,4))
So, why does el_eval not simply call el_mean? If the output has to be a list, it can be returned as such.
As a suggestion, the package can be greatly improved by adding more general nonlinear regression models. el_glm is a nonlinear model, so it should not be too difficult to include nonlinear regression like in nls or even more general cases. With a method to estimate general nonlinear models, it becomes straightforward to test nonlinear restrictions, if they are limited to restrictions that can be written as $H_0:~\theta_i = f(\theta_{-i})$. This is not too difficult to implement because it only implies replacing $\theta_i$ by $f(\theta_{-i})$ in the estimating equation function $g(X,\theta)$.
Many recent applications rely on nonlinear estimating equations that are often over-identified. It is the case of all methods recently developed in the missing value literature (or the causal effect literature, which is just a special case).
confint method

It seems that all methods for the EL, QGLM, and SD classes are almost identical and can be reduced to a single method applied to a union class. I can see some benefits from avoiding repeating the same code in several methods, especially if changes need to be applied to them.
Why is getMethodEL(object) used to get the type of method for the classes EL and QGLM and not for the class SD? Is it because the getMethodEL method does not exist for this class? For the latter, the type is explicitly added (see line 214 of confint-methods.R).
elt method

It would be nice to see the hypothesis being tested when the result is printed (see for example the linearHypothesis function from the car package).
I don't see any issue with the different tests.
No issue here other than the above suggestion about the plot
method.
lm, glm, and nls. We can then easily compare the results.
Estimated hours spent reviewing: 16
Thanks so much @pchausse for your review!
@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/550#issuecomment-1235901756 time 16
Logged review for pchausse (hours: 16)
Hi @pchausse, thank you for your review and feedback. Your comments have greatly improved the documentation. Some notable updates on the code: a new logProb() function extracts the logp slot; plot() is now directly applicable to the EL class; elt() allows a symbolic description for specifying a hypothesis; and print() shows the tested hypothesis when applied to the ELT class. The repository currently has v1.7.0.9600 of the package. The CRAN version will be updated after the review process. Please find the responses below.
G1.6 Software should include code necessary to compare performance claims with alternative implementations in other R packages.: Other packages compute empirical likelihood tests and confidence intervals. We would like to see how the melt package compares with the others. For example, my momentfit (or gmm) package produces the same type of confidence intervals or confidence areas for linear models and the mean. The package emplik now includes regression models.
Since we do not make performance claims compared to other packages, please check the benchmark result below for reference only. We constructed EL confidence intervals for the mean with 100 observations using the melt, momentfit, and emplik packages. Default settings were used for each package.
library(microbenchmark)
library(melt)
library(emplik)
library(momentfit)
set.seed(134095)
x <- rnorm(100)
fit <- el_mean(x, 0)
out <- microbenchmark(
ci_melt = confint(fit),
ci_momentfit = confint(x, gelType = "EL"),
ci_emplik = findUL(fun = function(theta, x) {
el.test(x, mu = theta)
}, MLE = mean(x), x = x),
times = 1000
)
print(out, unit = "relative")
Unit: relative
expr min lq mean median uq max neval cld
ci_melt 1.00000 1.000000 1.000000 1.000000 1.000000 1.000000 1000 a
ci_momentfit 6.18093 5.738152 5.672995 5.549923 5.304345 5.693226 1000 b
ci_emplik 33.49088 29.969031 31.332941 30.790802 31.826105 69.604799 1000 c
"@srrstats {RE1.4} Some of the core distributional assumptions are documented.": The list of regularity conditions is not that long. We do see some distribution assumptions in the EL documentation where the EL method is presented for general estimating equations $E[g(X, \theta)] = 0$, but it would be important to add at least the assumption on the variance of $g(X, \theta)$. Users using only el_mean or el_sd should know what distribution assumptions are required. For these two functions, the detail section is empty. At least, we should see what the estimating equations are and then be given a link to the EL documentation for more details.
The updated documentation of el_mean() and el_sd() states in the Details section the optimization problems with the estimating functions. Assumptions on the variance are stated as well. The See Also section gives a link to the EL documentation. Links to relevant documentation have been added to all the other user-facing exported functions as well.
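For readers of this thread, here is a brief sketch of the estimating functions presumably involved (the authoritative statements are in the updated package documentation; $\mu_0$ below denotes the hypothesized mean supplied to el_sd()). For el_mean() with parameter $\theta$, $g(X_i, \theta) = X_i - \theta$; for el_sd() with standard deviation $\sigma$, a natural choice is $g(X_i, \sigma) = (X_i - \mu_0)^2 - \sigma^2$. In both cases EL maximizes

$$
\prod_{i=1}^{n} n p_i \quad \text{subject to} \quad \sum_{i=1}^{n} p_i = 1, \qquad \sum_{i=1}^{n} p_i\, g(X_i, \theta) = 0,
$$

under the finite-variance condition $0 < \operatorname{Var}[g(X, \theta)] < \infty$.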
"@srrstats {RE4.11}
logLik()
method extracts the empirical log-likelihood value which can be interpreted as a measure of goodness-of-fit": It is really not clear how to interpret the value. Can we have a more detailed explanation?
logLik()
only gives $-n\log n$ with $p_i = 1/n$ for all $i$. We have expanded the Details
section of logLik()
(line 436 of AllGenerics.R) with more details. logLik()
was originally designed to offer other functionality, but its role has been superseded by logL()
. The srr
tag has also been moved to logL-methods.R.
"@srrstats {RE6.0} plot method is available for an ELD object that is returned by the eld() method.": The authors may consider adding some diagnostic-type plot methods for the EL (or any other) class as we have for lm objects. It is great to have a plot for eld() to detect outliers, but it would be even better I think if the plot was applied directly to the EL class. For example, a Cook's distance exists separately for lm, but we can plot it directly from lm objects by selecting which=4:
plot() is now applicable to EL objects directly (line 35 of plot-methods.R). It implicitly calls eld() since the package currently provides no other diagnostic plot. More options like which can be added in a future version.
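As a usage sketch of this change (hypothetical data; we assume the updated method simply dispatches to eld(), as stated above):

```r
library(melt)
set.seed(63)
fit <- el_mean(rnorm(100), par = 0)
plot(fit)  # now equivalent to plot(eld(fit))
```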
I think we should see a more detailed presentation of why one would need to use the melt package to perform hypothesis tests or to construct confidence intervals. For example, when does the empirical likelihood start to produce confidence intervals with better coverage than lm does for linear regression? In the example section of el_lm, we see an application with iid and normally distributed variables. In this case, even when the sample size is very small, a least squares confidence interval has exact coverage while EL does not. So why should we use the melt package?
Please check the updated README file. We have added more details in the Overview
section. The new Main functions
section introduces some available functions with short descriptions. More examples have also been added in the Usage
section. The examples of el_lm()
now rely on a new internal dataset thiamethoxam
.
optim slot

What is par? What is lambda? Since the model is just-identified, the latter should be a vector of zeroes when there are no constraints. Are they the estimates when we restrict all coefficients (except for the intercept) to be zero? Can that be added to the documentation?
The updated Details
sections of LM
, GLM
, and QGLM
contain more information on the optim
slot and its components. The following slots are all related to the overall test: optim
, logp
, logl
, loglr
, statistic
, df
, and pval
.
In the el_lm and el_glm (maybe others?) it is written in the details that the test for the null $H_0 : \theta_1 = \theta_2 = \cdots = \theta_{p-1} = 0$ is in optim and the test parameter by parameter is in sigTests, but there is no test result in optim. It only includes the parameters, the lambdas, the number of iterations and the convergence code. Also, why is it important to return the implied probabilities (the logp slot) and the lambdas for the joint test and not for the individual ones?
Besides optim
, the overall test result needs to involve logp
, logl
, loglr
, statistic
, df
, and pval
. Since they are all derived from optim
, we stated that the overall test result is in optim
. In order to avoid confusion, we have removed the original sentence "The test results are returned as optim
and sigTests
, respectively." from el_lm()
and el_glm()
. The updated documentation of el_lm()
and el_glm()
now focuses on what it does rather than specific slots of its return value which are not explained.
We can include the full results (optim, logp, logl, loglr, statistic, df, and pval) for each significance test, but it would give the sigTests slot a nested structure with a possibly large size. Currently, sigTest() applied to SummaryLM returns the estimates, the test statistics, and the $p$-values. elt() can be used if the user wants more information on a specific test.
Additionally, we have added a new accessor method logProb()
that extracts logp
.
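A minimal usage sketch of the new accessor (hypothetical data; the sum-to-one check reflects the defining constraint on the implied probabilities):

```r
library(melt)
set.seed(24)
fit <- el_mean(rnorm(50), par = 0)
head(logProb(fit))                    # log implied probabilities
all.equal(sum(exp(logProb(fit))), 1)  # probabilities sum to one
```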
The slots are documented in EL, but it is not easy to get it for other objects. For example, in the el_glm documentation, the Value is "An object of class GLM". If we click in it, we only see that it inherits from LM. If we click on it, we see some slots, but not all. It inherits from CEL, so we click on it and only the optim slot is defined. We finally get the description of all slots by clicking on EL. It is not an easy process for someone like me who only works in terminals. It requires calling help() several times. It would be helpful if the slots were defined for all objects that inherit from EL.

Also, it is not always clear what the slots are. In EL, we see "A numeric vector of the log probabilities obtained from empirical likelihood" for the logp slot. But which probabilities? I figured out for el_lm that it is for the restricted model when all parameters are 0 except the intercept. It should be specified. It is the same for logl, loglr, etc. This is something we see across the different help files. It is another reason the author should define all slots for all functions that return an EL object.
The updated documentation (in AllClasses.R) lists all slots for all classes with some class-specific details.
I would expect a little more details about the EL method. It is well presented in the EL documentation for general estimating equations $g(X, \theta)$, but I think it would be helpful for the method to also be explained in el_lm, el_glm, el_sd, el_mean and even el_eval. The EL method does not need to be repeated, but at least we should see the specific estimating equation used for each model and a link to EL for the details. For example, it is not clear to me which estimating equation is used for the standard deviation or for generalized linear models.
Each model-building function (including el_eval()
) now states estimating equations and the associated EL problem in the Details
section.
el_eval() is a function that can be used to compute EL when the user has a specific estimating function not covered by the package. For a just-identified estimating function $g$ with a specified parameter $\theta$, the user needs to prepare the input matrix g computed with the data. el_eval() only solves the optimization problem and returns the result. Since the main functions, such as elt() and elmt(), cannot be used with an arbitrary estimating equation, the return value is a list with the same components as EL.
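To illustrate the intended workflow, a hedged sketch: jointly testing $E[X] = 0$ and $E[X^2] = 1$ with a user-built estimating matrix (hypothetical data; the component names follow the EL slots discussed in this thread):

```r
library(melt)
set.seed(87)
x <- rnorm(100)
## g(X_i, theta0) evaluated at the hypothesized theta0 = (0, 1)
g <- cbind(x - 0, x^2 - 1)
fit <- el_eval(g)
fit$statistic  # chi-squared statistic (2 degrees of freedom here)
fit$pval       # asymptotic p-value
```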
As a suggestion, the package can be greatly improved by adding more general nonlinear regression models. el_glm is a nonlinear model, so it should not be too difficult to include nonlinear regression like in nls or even more general cases. With a method to estimate general nonlinear models, it becomes straightforward to test nonlinear restrictions, if they are limited to restrictions that can be written as $H_0: \theta_i = f(\theta_{-i})$. This is not too difficult to implement because it only implies replacing $\theta_i$ by $f(\theta_{-i})$ in the estimating equation function $g(X, \theta)$.
Although we agree that nls is probably the most natural way to extend the package, the development is currently focused on adding families and link functions for el_glm(). The next release version (v1.8.0) will ship with the identity link option for the quasipoisson family, which was added recently in the C++ files. So it will take several more releases for el_glm() to provide most of the options in our roadmap.
Based on your comments above, nls
methods could be implemented by improving el_eval()
or making new functions such as el_nls()
or el_g()
. These functions should take a formula
or function
argument for estimating functions and an additional argument for gradient functions. Then, in principle, we could construct confidence intervals or perform hypothesis testing. The main concern would be performance. The package exclusively relies on the Eigen C++ library and OpenMP directives for parallel computing, and user-defined R functions have no simple thread-safe way of accessing them. We will revisit this issue when the development of el_glm()
is finished.
confint method

It seems that all methods for the EL, QGLM and SD classes are almost identical and can be reduced to a single method applied to a union class. I can see some benefits from avoiding repeating the same code in several methods, especially if changes need to be applied to them. Why is getMethodEL(object) used to get the type of method for the classes EL and QGLM and not for the class SD? Is it because the getMethodEL method does not exist for this class? For the latter, the type is explicitly added (see line 214 of confint-methods.R)
confint()
methods for QGLM
and SD
classes have been removed. One method for EL
class can be applied to all other classes using a new internal generic getEstimates()
(line 663 of AllGenerics.R). We have applied the same procedure to confreg()
and elmt()
methods and reduced the amount of duplicate code.
elt method

It would be nice to see the hypothesis being tested when the result is printed (see for example the linearHypothesis function from the car package).
The print() method now prints the tested hypothesis. A new internal function describe_hypothesis() in describe_hypothesis.R translates the hypothesis matrix into a character vector for printing. Additionally, elt() now accepts a character vector for the argument lhs, allowing a symbolic description of a hypothesis. The updated documentation of elt.R shows different ways to specify a hypothesis with new examples.
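As an illustration (the exact grammar for lhs is defined in the updated elt() documentation; the call below is a hypothetical sketch, not verified syntax):

```r
library(melt)
data("mtcars")
fit <- el_lm(mpg ~ wt + hp, data = mtcars)
## hypothetical symbolic description of H0: the wt and hp coefficients are equal
elt(fit, lhs = "wt - hp == 0")
```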
Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.
Please check the updated DESCRIPTION file. We have also added you and @awstringer1 in the Acknowledgments
section of the pdf vignette in vignettes/article.pdf
.
Many thanks @markean for working on the comments so quickly! @awstringer1 and @pchausse, can you please take a look and say whether your comments were well addressed or you have additional questions? Thanks!
Thanks to @markean for the response. I was waiting until both reviews and responses were in, to avoid redundancy. All my major comments have been addressed.
My only minor comment is about the optimization. I don't think it is true that "the objective function is strictly concave without any constraint. Thus, the Newton–Raphson method converges quadratically to the unique solution." – see Nocedal and Wright's Numerical Optimization, Theorem 3.5, page 46. There are non-trivial initial conditions, namely that the starting value has to be "close enough" to the optimum, else it will diverge. If Newton with step-halving has worked so far, then that's good enough for me (and perhaps there is some additional theory here about why it works in this context that I do not know), but if it ever becomes a practical problem in the future, the authors could consider trust region optimization. See Chapter 4 in that same book, or the trust or trustOptim R packages. I have found this very useful in cases where Newton was not working satisfactorily.
Thanks @markean for your response. All my comments have also been addressed.
Thanks again, @pchausse and @awstringer1.
As for the optimization, it can be seen from Art Owen's Empirical Likelihood (2001, page 63) that the objective function is twice continuously differentiable with the Lipschitz continuous Hessian by using a pseudo-logarithm function. The initial $\lambda$ is set to the zero vector. This is related to the theoretical result that $\lambda(\theta_0) = o_p(n^{-1/2})$, where $\theta_0$ is the true value. Although there have been some suggestions for using a data-dependent initial value for $\lambda$ (e.g., Hjort, McKeague, and van Keilegom 2009, page 1104), we have not found non-convergence issues during the unit tests when the convex hull constraint is satisfied. We will keep track of the issue and consider other alternatives as we add more estimating functions and methods, including the Trust Region procedure.
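For concreteness, the pseudo-logarithm in Owen (2001) replaces $\log z$ below a threshold $\varepsilon = 1/n$ by its second-order Taylor expansion, so the objective remains twice continuously differentiable everywhere; a sketch:

$$
\log_\star(z) =
\begin{cases}
\log z, & z \ge \varepsilon, \\[4pt]
\log \varepsilon - \dfrac{3}{2} + \dfrac{2z}{\varepsilon} - \dfrac{z^2}{2\varepsilon^2}, & z < \varepsilon,
\end{cases}
\qquad \varepsilon = \frac{1}{n}.
$$

The two branches and their first and second derivatives agree at $z = \varepsilon$, which yields the Lipschitz-continuous Hessian mentioned above.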
Hi @Paula-Moraga. Can we proceed with this at your discretion if the reviewers have no further comments?
Yes! Thanks so much @markean @awstringer1 @pchausse for your time and work. Very happy to approve the melt package!
@ropensci-review-bot approve melt
Approved! Thanks @markean for submitting and @pchausse, @awstringer1 for your reviews! :grin:
To-dos:
- @ropensci-review-bot invite me to ropensci/<package-name> which will re-send an invitation.
- @ropensci-review-bot finalize transfer of <package-name> where <package-name> is the repo/package name. This will give you admin access back.
- If you already have a pkgdown website and are ok relying only on rOpenSci central docs building and branding, you can replace your pkgdown website with a redirecting page at https://docs.ropensci.org/package_name.
- Add the docs URL to the URL field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar, https://github.com/ropensci/foobar
- Run codemetar::write_codemeta() in the root of your package.
- Your package can be installed with install.packages("<package-name>", repos = "https://ropensci.r-universe.dev") thanks to R-universe.

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent).
Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @ropensci/blog-editors in your reply. They will get in touch about timing and can answer any questions.
We maintain an online book with our best practice and tips; this chapter starts the 3rd section, which is about guidance for after onboarding (with advice on releases, package marketing, GitHub grooming); the guide also features CRAN gotchas. Please tell us what could be improved.
Last but not least, you can volunteer as a reviewer via filling a short form.
Thank you @Paula-Moraga!
Date accepted: 2022-09-20
Submitting Author Name: Eunseop Kim
Submitting Author Github Handle: @markean
Repository: https://github.com/markean/melt
Version submitted: 1.6.0.9000
Submission type: Stats
Badge grade: silver
Editor: @Paula-Moraga
Reviewers: @pchausse, @awstringer1
Due date for @pchausse: 2022-09-05
Due date for @awstringer1: 2022-08-29
Archive: TBD
Version accepted: TBD
Language: en
Pre-submission Inquiry
General Information
Who is the target audience and what are scientific applications of this package?
Paste your responses to our General Standard G1.1 here, describing whether your software is: The package attempts the first implementation of the nested bilevel optimization approach within R to compute constrained empirical likelihood. The inner layer Newton-Raphson method for empirical likelihood is written in C++, enabling faster computation than other routines written in R.
(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research? Yes.
Badging
What grade of badge are you aiming for? (bronze, silver, gold) Silver.
If aiming for silver or gold, describe which of the four aspects listed in the Guide for Authors chapter the package fulfils (at least one aspect for silver; three for gold) 1) Compliance with a good number of standards beyond those identified as minimally necessary. 2) Have a demonstrated generality of usage beyond one single envisioned use case. In my opinion, the generality comes from the applicability of empirical likelihood methods to (generalized) linear models and the ability to test linear hypotheses of choice.
Technical checks
Confirm each of the following by checking the box.
- I have run autotest checks on the package, and ensured no tests fail.
- The srr_stats_pre_submit() function confirms this package may be submitted.
- The pkgcheck() function confirms this package may be submitted - alternatively, please explain reasons for any checks which your package is unable to pass.

This package:
Publication options
Code of conduct