Capybara #645

Open pachadotdev opened 1 month ago

pachadotdev commented 1 month ago

Submitting Author Name: Mauricio Vargas Sepulveda
Submitting Author Github Handle: @pachadotdev
Repository:
Version submitted:
Submission type: Standard
Editor: TBD
Reviewers: TBD

Archive: TBD
Version accepted: TBD
Language: en

Package: capybara
Type: Package
Title: Fast and Memory Efficient Fitting of Linear Models With High-Dimensional
    Fixed Effects
Version: 0.5.1
Authors@R: c(
        given = "Mauricio",
        family = "Vargas Sepulveda",
        role = c("aut", "cre"),
        email = "",
        comment = c(ORCID = "0000-0003-1017-7574"))
    testthat (>= 3.0.0),
Depends: R(>= 3.5.0)
Description: Fast and user-friendly estimation of generalized linear models with
    multiple fixed effects and cluster the standard errors. The method to obtain
    the estimated fixed-effects coefficients is based on Stammann (2018) 
    <> and Gaure (2013)
License: Apache License (>= 2)
LazyData: true
RoxygenNote: 7.3.1
Encoding: UTF-8
NeedsCompilation: yes
LinkingTo: cpp11, cpp11armadillo
VignetteBuilder: knitr
Config/testthat/edition: 3
Roxygen: list(markdown = TRUE, roclets = c("namespace", "rd", "srr::srr_stats_roclet"))


This package helps estimate linear models with many fixed effects and implements an efficient algorithm that saves time and memory. In addition, it features a novel wrapper for Armadillo (C++).

People (mostly) in the social sciences who need multiple controls in their models. This is especially useful in Economics and International Relations.

fixest and alpaca. This package makes different design choices and has fewer dependencies. It also adds multiple unit tests.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

MEE Options

- [ ] The package is novel and will be of interest to the broad readership of the journal.
- [ ] The manuscript describing the package is no longer than 3000 words.
- [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
- (*Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

ropensci-review-bot commented 1 month ago

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

ropensci-review-bot commented 1 month ago


Editor check started


mpadge commented 3 weeks ago

@pachadotdev Checks failed because of the missing comma in your DESCRIPTION file which I see you've fixed in your latest commit. Ask the bot to check package again and it should work.

pachadotdev commented 3 weeks ago

@ropensci-review-bot check package

ropensci-review-bot commented 3 weeks ago

Thanks, about to send the query.

ropensci-review-bot commented 3 weeks ago


Editor check started


ropensci-review-bot commented 3 weeks ago

Checks for capybara (v0.5.1)

git hash: 55b6c2d8

Important: All failing checks above must be addressed prior to proceeding

(Checks marked with :eyes: may be optionally addressed.)

Package License: Apache License (>= 2)

1. rOpenSci Statistical Standards (srr package)

This package is in the following category:

:heavy_multiplication_x: This package still has TODO standards and can not be submitted

All applicable standards [v0.2.0] have been documented in this package (68 complied with; 1 N/A standards)

Statistical standards should be documented in most package files, yet are mostly only documented in one file.

Click to see the report of author-reported standards compliance of the package with links to associated lines of code, which can be re-generated locally by running the srr_report() function from within a local clone of the repository.

2. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

|type |package | ncalls|
|:----------|:--------------|------:|
|internal |base | 210|
|internal |capybara | 111|
|internal |utils | 47|
|internal |grDevices | 13|
|internal |graphics | 2|
|imports |stats | 115|
|imports |rlang | 22|
|imports |magrittr | 6|
|imports |dplyr | 4|
|imports |MASS | 4|
|imports |Formula | 1|
|suggests |fixest | NA|
|suggests |knitr | NA|
|suggests |rmarkdown | NA|
|suggests |testthat | NA|
|suggests |tidyr | NA|
|linking_to |cpp11 | NA|
|linking_to |cpp11armadillo | NA|

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.


base

length (13), for (11), list (11), (10), abs (9), beta (9), attr (8), drop (8), max (8), c (7), mapply (7), names (7), sum (7), nrow (6), ncol (5), matrix (4), trace (4), as.logical (3), cbind (3), class (3), diag (3), getOption (3), lapply (3), rep (3), sqrt (3), structure (3), try (3), with (3), all (2), apply (2), as.matrix (2), integer (2), is.finite (2), letters (2), levels (2), mean (2), nchar (2), numeric (2), paste0 (2), replace (2), sample (2), unlist (2), vapply (2), (1), as.list (1), as.vector (1), colnames (1), colSums (1), data.frame (1), inherits (1), min (1), order (1), rownames (1), summary (1), suppressWarnings (1), unname (1)

stats

family (36), formula (25), nobs (14), model.matrix (10), deviance (4), terms (4), Gamma (3), offset (3), pnorm (3), vcov (3), weights (3), coefficients (2), D (2), C (1), poisson (1), predict (1)

capybara

get_index_list_ (8), crossprod_ (5), feglm_fit_ (3), group_sums_ (3), nobs_ (3), partial_mu_eta_ (3), solve_beta_ (3), solve_y_ (3), center_variables_ (2), check_factor_ (2), feglm (2), gamma_ (2), get_alpha_ (2), getScoreMatrix (2), group_sums_cov_ (2), init_theta_ (2), inv_ (2), solve_bias_ (2), solve_eta_ (2), solve_eta2_ (2), sqrt_ (2), temp_var_ (2), apes (1), augment.feglm (1), bias_corr (1), check_control_ (1), check_data_ (1), check_family_ (1), check_formula_ (1), check_linear_dependence_ (1), check_response_ (1), check_weights_ (1), coef.apes (1), coef.feglm (1), coef.felm (1), coef.summary.apes (1), coef.summary.feglm (1), coef.summary.felm (1), drop_by_link_type_ (1), feglm_control (1), feglm_offset_ (1), felm (1), felm_fit_ (1), fenegbin (1), fepoisson (1), fitted.feglm (1), fitted.felm (1), fixed_effects (1), glance.feglm (1), glance.felm (1), group_sums_spectral_ (1), group_sums_var_ (1), model_frame_ (1), model_response_ (1), pairwise_cor_ (1), predict.feglm (1), predict.felm (1), print.apes (1), print.feglm (1), print.felm (1), print.summary.apes (1), print.summary.feglm (1), print.summary.felm (1), rank_ (1), sandwich_ (1), start_guesses_ (1), summary_estimates_ (1), summary_family_ (1), summary_fisher_ (1), summary_formula_ (1), summary.apes (1), summary.feglm (1), summary.felm (1), update_nu_ (1)

utils

data (46), combn (1)

rlang

sym (22)

grDevices

cm (13)

magrittr

%>% (6)

dplyr

all_of (2), select (2)

MASS

negative.binomial (2), (2)

graphics

pie (2)

Formula

Formula (1)

3. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

- code in C++ (26% in 6 files) and R (74% in 23 files)
- 1 authors
- 1 vignette
- 1 internal data file
- 7 imported packages
- 14 exported functions (median 22 lines of code)
- 163 non-exported functions in R (median 6 lines of code)
- 45 C++ functions (median 5 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages

The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the `checks_to_markdown()` function

The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

|measure | value| percentile|noteworthy |
|:------------------------|------:|----------:|:----------|
|files_R | 23| 84.5| |
|files_src | 6| 90.2| |
|files_vignettes | 2| 85.7| |
|files_tests | 6| 84.4| |
|loc_R | 1554| 78.9| |
|loc_src | 549| 48.8| |
|loc_vignettes | 81| 19.3| |
|loc_tests | 228| 58.3| |
|num_vignettes | 1| 64.8| |
|data_size_total | 285103| 89.1| |
|data_size_median | 285103| 95.3|TRUE |
|n_fns_r | 177| 88.0| |
|n_fns_r_exported | 14| 56.3| |
|n_fns_r_not_exported | 163| 91.1| |
|n_fns_src | 45| 64.7| |
|n_fns_per_file_r | 4| 63.6| |
|n_fns_per_file_src | 7| 66.4| |
|num_params_per_fn | 3| 33.6| |
|loc_per_fn_r | 7| 16.0| |
|loc_per_fn_r_exp | 22| 52.1| |
|loc_per_fn_r_not_exp | 6| 13.8| |
|loc_per_fn_src | 5| 5.0|TRUE |
|rel_whitespace_R | 20| 80.6| |
|rel_whitespace_src | 24| 55.8| |
|rel_whitespace_vignettes | 23| 12.6| |
|rel_whitespace_tests | 22| 56.4| |
|doclines_per_fn_exp | 34| 41.6| |
|doclines_per_fn_not_exp | 0| 0.0|TRUE |
|fn_call_network_size | 145| 84.8| |

---

3a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

4. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges

R-CMD-check.yaml

**GitHub Workflow Results**

| id|name |conclusion |sha | run_number|date |
|----------:|:--------------------------|:----------|:------|----------:|:----------|
| 9340075153|format_check |failure |55b6c2 | 35|2024-06-02 |
| 9340075080|pages build and deployment |success |55b6c2 | 57|2024-06-02 |
| 9340075152|R-CMD-check |success |55b6c2 | 55|2024-06-02 |
| 9340075157|test-coverage |success |55b6c2 | 34|2024-06-02 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with rcmdcheck

R CMD check generated the following note:

1. checking installed package size ... NOTE
   installed size is 9.0Mb
   sub-directories of 1Mb or more:
   libs 8.2Mb

R CMD check generated the following check_fail:

1. rcmdcheck_reasonable_installed_size

#### Test coverage with covr

Package coverage: 37.48

The following files are not completely covered by tests:

file | coverage
--- | ---
R/apes.R | 24.26%
R/bias_corr.R | 0%
R/felm.R | 0%
R/fenegbin.R | 0%
R/fixed_effects.R | 0%
R/generics_augment.R | 0%
R/generics_coef.R | 0%
R/generics_fitted.R | 0%
R/generics_glance.R | 0%
R/generics_print.R | 0%
R/generics_summary.R | 39.66%
R/generics_tidy.R | 0%
R/generics_vcov.R | 52.63%
R/helpers.R | 63.19%
R/internals.R | 54.82%
src/02_get_alpha.cpp | 0%
src/03_group_sums.cpp | 0%

#### Cyclocomplexity with cyclocomp

The following functions have cyclocomplexity >= 15:

function | cyclocomplexity
--- | ---
apes | 38
feglm_fit_ | 26
feglm_offset_ | 21
bias_corr | 19
vcov.feglm | 18
fenegbin | 15

#### Static code analyses with lintr

lintr found the following 124 potential issues:

message | number of times
--- | ---
Avoid 1:ncol(...) expressions, use seq_len. | 2
Avoid library() and require() calls in packages | 1
Lines should not be more than 80 characters. This line is 103 characters. | 1
Lines should not be more than 80 characters. This line is 104 characters. | 1
Lines should not be more than 80 characters. This line is 105 characters. | 1
Lines should not be more than 80 characters. This line is 106 characters. | 2
Lines should not be more than 80 characters. This line is 108 characters. | 1
Lines should not be more than 80 characters. This line is 110 characters. | 2
Lines should not be more than 80 characters. This line is 113 characters. | 1
Lines should not be more than 80 characters. This line is 114 characters. | 2
Lines should not be more than 80 characters. This line is 115 characters. | 1
Lines should not be more than 80 characters. This line is 116 characters. | 1
Lines should not be more than 80 characters. This line is 117 characters. | 2
Lines should not be more than 80 characters. This line is 121 characters. | 1
Lines should not be more than 80 characters. This line is 122 characters. | 2
Lines should not be more than 80 characters. This line is 126 characters. | 1
Lines should not be more than 80 characters. This line is 128 characters. | 1
Lines should not be more than 80 characters. This line is 129 characters. | 1
Lines should not be more than 80 characters. This line is 131 characters. | 2
Lines should not be more than 80 characters. This line is 132 characters. | 1
Lines should not be more than 80 characters. This line is 133 characters. | 1
Lines should not be more than 80 characters. This line is 134 characters. | 1
Lines should not be more than 80 characters. This line is 136 characters. | 1
Lines should not be more than 80 characters. This line is 141 characters. | 1
Lines should not be more than 80 characters. This line is 142 characters. | 1
Lines should not be more than 80 characters. This line is 144 characters. | 1
Lines should not be more than 80 characters. This line is 145 characters. | 1
Lines should not be more than 80 characters. This line is 147 characters. | 1
Lines should not be more than 80 characters. This line is 151 characters. | 1
Lines should not be more than 80 characters. This line is 154 characters. | 1
Lines should not be more than 80 characters. This line is 155 characters. | 1
Lines should not be more than 80 characters. This line is 156 characters. | 1
Lines should not be more than 80 characters. This line is 161 characters. | 2
Lines should not be more than 80 characters. This line is 163 characters. | 1
Lines should not be more than 80 characters. This line is 165 characters. | 2
Lines should not be more than 80 characters. This line is 166 characters. | 1
Lines should not be more than 80 characters. This line is 169 characters. | 1
Lines should not be more than 80 characters. This line is 170 characters. | 1
Lines should not be more than 80 characters. This line is 174 characters. | 3
Lines should not be more than 80 characters. This line is 176 characters. | 1
Lines should not be more than 80 characters. This line is 178 characters. | 1
Lines should not be more than 80 characters. This line is 184 characters. | 4
Lines should not be more than 80 characters. This line is 185 characters. | 1
Lines should not be more than 80 characters. This line is 193 characters. | 1
Lines should not be more than 80 characters. This line is 195 characters. | 2
Lines should not be more than 80 characters. This line is 199 characters. | 1
Lines should not be more than 80 characters. This line is 214 characters. | 1
Lines should not be more than 80 characters. This line is 217 characters. | 1
Lines should not be more than 80 characters. This line is 219 characters. | 1
Lines should not be more than 80 characters. This line is 220 characters. | 1
Lines should not be more than 80 characters. This line is 223 characters. | 1
Lines should not be more than 80 characters. This line is 225 characters. | 1
Lines should not be more than 80 characters. This line is 229 characters. | 1
Lines should not be more than 80 characters. This line is 230 characters. | 1
Lines should not be more than 80 characters. This line is 232 characters. | 1
Lines should not be more than 80 characters. This line is 244 characters. | 1
Lines should not be more than 80 characters. This line is 247 characters. | 1
Lines should not be more than 80 characters. This line is 250 characters. | 1
Lines should not be more than 80 characters. This line is 257 characters. | 1
Lines should not be more than 80 characters. This line is 261 characters. | 1
Lines should not be more than 80 characters. This line is 269 characters. | 2
Lines should not be more than 80 characters. This line is 277 characters. | 2
Lines should not be more than 80 characters. This line is 279 characters. | 1
Lines should not be more than 80 characters. This line is 282 characters. | 1
Lines should not be more than 80 characters. This line is 287 characters. | 1
Lines should not be more than 80 characters. This line is 288 characters. | 1
Lines should not be more than 80 characters. This line is 291 characters. | 1
Lines should not be more than 80 characters. This line is 293 characters. | 1
Lines should not be more than 80 characters. This line is 305 characters. | 1
Lines should not be more than 80 characters. This line is 318 characters. | 1
Lines should not be more than 80 characters. This line is 320 characters. | 2
Lines should not be more than 80 characters. This line is 333 characters. | 1
Lines should not be more than 80 characters. This line is 334 characters. | 1
Lines should not be more than 80 characters. This line is 339 characters. | 1
Lines should not be more than 80 characters. This line is 341 characters. | 1
Lines should not be more than 80 characters. This line is 360 characters. | 1
Lines should not be more than 80 characters. This line is 363 characters. | 1
Lines should not be more than 80 characters. This line is 385 characters. | 1
Lines should not be more than 80 characters. This line is 389 characters. | 1
Lines should not be more than 80 characters. This line is 396 characters. | 1
Lines should not be more than 80 characters. This line is 441 characters. | 1
Lines should not be more than 80 characters. This line is 81 characters. | 2
Lines should not be more than 80 characters. This line is 82 characters. | 3
Lines should not be more than 80 characters. This line is 83 characters. | 1
Lines should not be more than 80 characters. This line is 84 characters. | 1
Lines should not be more than 80 characters. This line is 85 characters. | 1
Lines should not be more than 80 characters. This line is 86 characters. | 2
Lines should not be more than 80 characters. This line is 87 characters. | 1
Lines should not be more than 80 characters. This line is 88 characters. | 1
Lines should not be more than 80 characters. This line is 90 characters. | 1
Lines should not be more than 80 characters. This line is 91 characters. | 1
Lines should not be more than 80 characters. This line is 92 characters. | 5
Lines should not be more than 80 characters. This line is 94 characters. | 2
Lines should not be more than 80 characters. This line is 95 characters. | 2
Lines should not be more than 80 characters. This line is 96 characters. | 1
Lines should not be more than 80 characters. This line is 98 characters. | 1
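
The `1:ncol(...)` message in the lintr results refers to a common base R pitfall. As a minimal sketch, assuming a made-up zero-column matrix `m` (an illustration only, not an object from capybara), the difference looks like this:

```r
# Hypothetical example: `m` is an illustrative matrix with zero columns,
# not an object taken from the package under review.
m <- matrix(numeric(0), nrow = 3, ncol = 0)

# 1:ncol(m) counts down from 1 to 0 instead of giving an empty sequence,
# so a for loop over it would run twice with invalid column indices.
bad <- 1:ncol(m)

# seq_len() returns an empty integer vector, so a loop body is skipped.
good <- seq_len(ncol(m))

print(bad)   # 1 0
print(good)  # integer(0)
```

Whether either flagged call in the package can actually receive a zero-column input is not stated in the report; the sketch only shows why lintr prefers `seq_len()`.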

5. Other Checks

Details of other checks (click to open)

:heavy_multiplication_x: The following 4 function names are duplicated in other packages:

- `bias_corr` from bife
- `feglm` from alpaca
- `felm` from lfe
- `fixed_effects` from baggr, gratia, gravity

Package Versions

|package |version |
|:--------|:--------|
|pkgstats | |
|pkgcheck | |
|srr | |

Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

jooolia commented 3 weeks ago

Dear @pachadotdev , Thank you for your submission.

You have checked "Regression and Supervised Learning", which would entail a "Statistical software review" (a different issue template to be opened, but that is ok). Would you be willing to go through the statistical software review as described here and implement the standards:

Thanks, Julia

pachadotdev commented 3 weeks ago

> Dear @pachadotdev , Thank you for your submission.
>
> You have checked "Regression and Supervised Learning" which would entail a "Statistical software review" (a different issue template to be opened, but that is ok). Would you be willing to go through the statistical software review as described here and implementing the standards:
>
> Thanks, Julia

thanks a lot !

indeed, when I opened the issue that was the only available template on my end

I already implemented

pachadotdev commented 1 week ago

@jooolia Hi, I cannot find the statistical review template.

jooolia commented 1 day ago

Hi @pachadotdev , sorry for the delayed response. You have the option of several different types of templates when you open an issue; the statistical one is the last choice (the template is viewable here: ). Does this help?

Thanks, Julia