ropensci / software-review

rOpenSci Software Peer Review.
286 stars 104 forks source link

concstats #559

Closed schneiderpy closed 1 year ago

schneiderpy commented 1 year ago

Date accepted: 2023-03-17

Submitting Author Name: Name Submitting Author Github Handle: !--author1-->@schneiderpy<!--end-author1-- Other Package Authors Github handles: (comma separated, delete if none) Repository: https://github.com/schneiderpy/concstats Version submitted: Submission type: Stats Badge grade: silver Editor: !--editor-->@Paula-Moraga<!--end-editor-- Reviewers: @christopherkenny, @SebastianWojcik86

Due date for @christopherkenny: 2023-02-26 Due date for @SebastianWojcik86: 2023-02-28

Archive: TBD Version accepted: TBD Language: en

Type: Package
Package: concstats
Title: Market Structure, Concentration and Inequality Measures
Version: 0.1.4
Authors@R: 
    person(given = "Andreas",
           family = "Schneider",
           role = c("aut", "cre"),
           email = "schneiderconsultingpy@gmail.com",
           comment = c(ORCID = "https://orcid.org/0000-0001-5630-1097"))
Description: Based on individual market shares of all participants in a
    market or space, the package offers a set of different structural and
    concentration measures frequently - and not so frequently - used in research
    and in practice. Measures can be calculated in groups or individually.
    The calculated measure or the resulting vector in table format should help
    practitioners make more informed decisions. Methods used in this package are
    from:
    1.  Chang, E. J., Guerra, S. M., de Souza Peñaloza, R. A. & Tabak, B. M.
    (2005) "Banking concentration: the Brazilian case".
    2.  Cobham, A. and A. Summer (2013). "Is It All About the Tails? The
    Palma Measure of Income Inequality".
    3.  García Alba Iduñate, P. (1994). "Un Indice de dominancia para el
    analisis de la estructura de los mercados".
    4.  Ginevicius, R. and S. Cirba (2009). "Additive measurement of market
    concentration" <doi:10.3846/1611-1699.2009.10.191-198>.
    5.  Herfindahl, O. C. (1950), "Concentration in the steel industry
    (PhD thesis)".
    6.  Hirschmann, A. O. (1945), "National power and structure of foreign trade".
    7.  Melnik, A., O. Shy, and R. Stenbacka (2008), "Assessing market dominance"
    <doi:10.1016/j.jebo.2008.03.010>.
    8.  Palma, J. G. (2006). "Globalizing Inequality: 'Centrifugal' and
    'Centripetal' Forces at Work".
    9.  Shannon, C. E. (1948). "A Mathematical Theory of Communication".
License: MIT + file LICENSE
URL: https://github.com/schneiderpy/concstats,
    https//schneiderpy.github.io/constats (website)
BugReports: https://github.com/schneiderpy/concstats/issues
Depends:
    R (>= 2.10)
Imports:
    dplyr (>= 1.0.7),
    readr
Suggests: 
    rmarkdown,
    knitr,
    devtools,
    kableExtra,
    ggplot2,
    testthat (>= 3.0.0),
    covr
VignetteBuilder:
    knitr,
    rmarkdown
Encoding: UTF-8
Language: en-US
LazyData: true
Roxygen: list(markdown = TRUE, roclets = c ("namespace", "rd", "srr::srr_stats_roclet"))
RoxygenNote: 7.2.1
Config/testthat/edition: 3

Scope

Pre-submission Inquiry

General Information

Some functions are already implemented in other R packages. The non-exhaustive summary below is by no means a description of each package.

However, almost none of these packages offer finite sample correction, with the exception of the ineq package. Other functions are new implementations in R, e.g. Dominance Index, Palma ratio, Stenbacka Index, GRS measure, and the dual of the Herfindahl Hirschman Index.

Badging

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

Code of conduct

ropensci-review-bot commented 1 year ago

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

ropensci-review-bot commented 1 year ago

:rocket:

Editor check started

:wave:

ropensci-review-bot commented 1 year ago

Checks for concstats (v0.1.4)

git hash: d3e10c1a

Package License: MIT + file LICENSE


1. rOpenSci Statistical Standards (srr package)

This package is in the following category:

:heavy_check_mark: All applicable standards [v0.1.0] have been documented in this package (56 complied with; 46 N/A standards)

Click to see the report of author-reported standards compliance of the package with links to associated lines of code, which can be re-generated locally by running the srr_report() function from within a local clone of the repository.


2. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate. |type |package | ncalls| |:----------|:----------|------:| |internal |base | 91| |internal |concstats | 43| |internal |stats | 5| |imports |dplyr | NA| |imports |readr | NA| |suggests |rmarkdown | NA| |suggests |knitr | NA| |suggests |devtools | NA| |suggests |kableExtra | NA| |suggests |ggplot2 | NA| |suggests |testthat | NA| |suggests |covr | NA| |linking_to |NA | NA| Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats()', and examining the 'external_calls' table.

base

as.numeric (28), sum (28), c (12), data.frame (7), cut (4), format (4), seq (4), log (2), length (1), seq_len (1)

concstats

concstats_firm (4), concstats_entropy (3), concstats_hhi (3), concstats_nrs_eq (3), concstats_palma (3), concstats_dom (2), concstats_gini (2), concstats_grs (2), concstats_hhi_d (2), concstats_hhi_min (2), concstats_simpson (2), concstats_sten (2), concstats_top (2), concstats_top3 (2), concstats_top5 (2), concstats_all_comp (1), concstats_all_inequ (1), concstats_all_mstruct (1), concstats_comp (1), concstats_concstats (1), concstats_inequ (1), concstats_mstruct (1)

stats

quantile (4), na.omit (1)

**NOTE:** No imported packages appear to have associated function calls; please ensure with author that these 'Imports' are listed appropriately.


3. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has: - code in R (100% in 6 files) and - 1 authors - 2 vignettes - 1 internal data file - 2 imported packages - 22 exported functions (median 33 lines of code) - 22 non-exported functions in R (median 44 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-----:|----------:|:----------| |files_R | 6| 40.3| | |files_vignettes | 2| 85.7| | |files_tests | 5| 81.7| | |loc_R | 704| 58.2| | |loc_vignettes | 254| 58.2| | |loc_tests | 1266| 90.0| | |num_vignettes | 2| 89.2| | |data_size_total | 1301| 61.1| | |data_size_median | 1301| 65.7| | |n_fns_r | 44| 52.5| | |n_fns_r_exported | 22| 70.1| | |n_fns_r_not_exported | 22| 43.6| | |n_fns_per_file_r | 6| 71.6| | |num_params_per_fn | 2| 11.9| | |loc_per_fn_r | 38| 84.2| | |loc_per_fn_r_exp | 34| 67.2| | |loc_per_fn_r_not_exp | 44| 88.4| | |rel_whitespace_R | 17| 58.1| | |rel_whitespace_vignettes | 23| 44.1| | |rel_whitespace_tests | 14| 83.6| | |doclines_per_fn_exp | 63| 75.1| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 33| 57.5| | ---

3a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


4. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges [![R-CMD-check](https://github.com/schneiderpy/concstats/workflows/R-CMD-check/badge.svg)](https://github.com/schneiderpy/concstats/actions) [![pkgcheck](https://github.com/schneiderpy/concstats/workflows/pkgcheck/badge.svg)](https://github.com/schneiderpy/concstats/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 3329638517|pages build and deployment |success |d3e10c | 53|2022-10-26 | | 3329638614|pkgcheck |success |d3e10c | 25|2022-10-26 | | 3329638610|pkgdown |success |d3e10c | 64|2022-10-26 | | 3329638632|R-CMD-check |success |d3e10c | 59|2022-10-26 | | 3329638620|test-coverage |success |d3e10c | 6|2022-10-26 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) rcmdcheck found no errors, warnings, or notes #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 99.19 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) The following functions have cyclocomplexity >= 15: function | cyclocomplexity --- | --- concstats_entropy | 18 concstats_gini | 18 concstats_hhi | 18 concstats_simpson | 18 concstats_sten | 17 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 8 potential issues: message | number of times --- | --- Avoid library() and require() calls in packages | 8


Package Versions

|package |version | |:--------|:---------| |pkgstats |0.1.1.54 | |pkgcheck |0.1.0.26 | |srr |0.0.1.180 |


Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

mpadge commented 1 year ago

@schneiderpy Our usual procedure is to first consider suitability in a pre-submission enquiry. You've opened #558 for that, so we'll close this until the conversation is resolved there. Once that is done, and if the pre-submission is approved, we can re-open this issue and move on from there. Thanks.

adamhsparks commented 1 year ago

Thanks, @schneiderpy, as @mpadge noted, usually we like to finalise the pre-submission inquiry before opening a request for a review, but everything seems to be in top shape, so I've reopened this and will start looking for a handling editor.

annakrystalli commented 1 year ago

Hello @schneiderpy 👋

Just wanted to give you a short update. We are still trying to find an appropriate editor to handle your submission and hoping to have found someone soon. Apologies for the delay!

schneiderpy commented 1 year ago

Thank you Anna for the update ... (by the way, I could not find the package in the latest newsletter (Software Review Section))

annakrystalli commented 1 year ago

Hello @schneiderpy. It's because it's still at review stage 0. It will appear in the newsletter when an editor is assigned and it reaches stage 1/editorial checks 😊

schneiderpy commented 1 year ago

Hello Anna,

this is just a short email to find out if there is a chance the my package "concstats" will get reviewed (someday)?

Best regards and an excellent 2023 Andreas Schneider

[image: Mailtrack] https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality11& Remitente notificado con Mailtrack https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality11& 20/01/23, 12:27:20

El mar, 22 nov 2022 a las 10:25, Anna Krystalli @.***>) escribió:

Hello @schneiderpy https://github.com/schneiderpy. It's because it's still at review stage 0. It will appear in the newsletter when an editor is assigned and it reaches stage 1/editorial checks 😊

— Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/559#issuecomment-1323675434, or unsubscribe https://github.com/notifications/unsubscribe-auth/ASTA3FZ4ALF7F3LLSLPUJXDWJTCTHANCNFSM6AAAAAARQQEFRQ . You are receiving this because you were mentioned.Message ID: @.***>

annakrystalli commented 1 year ago

Dear @schneiderpy

Please accept our apologies for the delay and rest assured we are doing our best to find an appropriate editor. Hopefully we will be able to move the review forward by the end of the week.

maurolepore commented 1 year ago

Dear @schneiderpy,

Today starts my rotation as EiC -- meaning I now assume the role of @annakrystalli.

I'm watched the conversation here and among editors and it seems that the search for a handling editor continues. Thanks for your patience.

schneiderpy commented 1 year ago

Thank you Mauro

[image: Mailtrack] https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality11& Remitente notificado con Mailtrack https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality11& 01/02/23, 17:59:41

El mié, 1 feb 2023 a las 11:09, Mauro Lepore @.***>) escribió:

Dear @schneiderpy https://github.com/schneiderpy,

Today starts my rotation as EiC -- meaning I now assume the role of @annakrystalli https://github.com/annakrystalli.

I'm watched the conversation here and among editors and it seems that the search for a handling editor continues. Thanks for your patience.

— Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/559#issuecomment-1412121409, or unsubscribe https://github.com/notifications/unsubscribe-auth/ASTA3F2NKAFJZBOBOFTEJL3WVJVCTANCNFSM6AAAAAARQQEFRQ . You are receiving this because you were mentioned.Message ID: @.***>

maurolepore commented 1 year ago

Dear @schneiderpy,

I'm pleased to announce that @Paula-Moraga will be the handling editor.

Thanks @schneiderpy for your patience, @Paula-Moraga for accepting the role, and @mpadge for connecting all of us.

maurolepore commented 1 year ago

@ropensci-review-bot assign @Paula-Moraga as editor

ropensci-review-bot commented 1 year ago

Assigned! @Paula-Moraga is now the editor

Paula-Moraga commented 1 year ago

Hi @schneiderpy, I am pleased to be the editor of this package. I will start looking for reviewers.

Paula-Moraga commented 1 year ago

@ropensci-review-bot seeking reviewers

ropensci-review-bot commented 1 year ago

Please add this badge to the README of your package repository:

[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/559_status.svg)](https://github.com/ropensci/software-review/issues/559)

Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news

schneiderpy commented 1 year ago

Dear Mauro Thank you for the good news. And of course, thank you to @Paula-Moraga and @mark padgham @.***>

Best regards

[image: Mailtrack] https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality11& Remitente notificado con Mailtrack https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality11& 02/02/23, 17:57:48

El jue, 2 feb 2023 a las 6:55, Mauro Lepore @.***>) escribió:

Dear @schneiderpy https://github.com/schneiderpy,

I'm pleased to announce that @Paula-Moraga https://github.com/Paula-Moraga will be the handling editor.

Thanks @schneiderpy https://github.com/schneiderpy for your patience, @Paula-Moraga https://github.com/Paula-Moraga for accepting the role, and @mpadge https://github.com/mpadge for connecting all of us.

— Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/559#issuecomment-1413451806, or unsubscribe https://github.com/notifications/unsubscribe-auth/ASTA3F25WGZCL3J5XYQWBNLWVOACHANCNFSM6AAAAAARQQEFRQ . You are receiving this because you were mentioned.Message ID: @.***>

Paula-Moraga commented 1 year ago

@ropensci-review-bot assign @christopherkenny as reviewer

ropensci-review-bot commented 1 year ago

@christopherkenny added to the reviewers list. Review due date is 2023-02-26. Thanks @christopherkenny for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

ropensci-review-bot commented 1 year ago

@christopherkenny: If you haven't done so, please fill this form for us to update our reviewers records.

Paula-Moraga commented 1 year ago

@ropensci-review-bot assign @SebastianWojcik86 as reviewer

ropensci-review-bot commented 1 year ago

@SebastianWojcik86 added to the reviewers list. Review due date is 2023-02-28. Thanks @SebastianWojcik86 for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

ropensci-review-bot commented 1 year ago

@SebastianWojcik86: If you haven't done so, please fill this form for us to update our reviewers records.

SebastianWojcik86 commented 1 year ago

Dear all. It is my first review. I wanted to do it in an exhaustive way and hope it shall be valuable for the Author.

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing: 12


Review Comments

General Standards for Statistical Software

G1.0 Statistical Software should list at least one primary reference from published academic literature.

In the list of references in the DESCRIPTION file I have not found Stenbacka index and Simpson. Please indicate where I can find it.

G1.1 Statistical Software should document whether the algorithm(s) it implements are...

Not stated in README file. I guess it should be "The first implementation within R of an algorithm which has previously been implemented in other languages or contexts". I have seen a similar package in python.

All other standards are met or irrelevant for the package.

Exploratory Data Analysis and Summary Statistics

All standards are met or irrelevant for the package.

Methodology

An argument unbiased explanation as 'Logical. Argument specifying whether or not a finite sample correction should be applied. The default is FALSE.' is misleading. Some measures can be normalized since their value - under equal shares assumption - depends on the number of market participants. Please correct the description of argument unbiased towards idea of normalization of the measure.

There is an inconsistency between required form of 'x' vector. For entropy, shares are calculated from 'x' while for HH index, input is in a form of shares. It should be unified.

Correct me, if I am wrong. Simpson index is defined in the package as a complement of HH index, that is $S=1-HH$. Under equal shares assumption it gives $S=1-\frac{1}{n}$. Following normalization of HH index, normalized Simpson index should be of the form $$S_{norm}=\frac{S-1+\frac{1}{n}}{1-\frac{1}{n}}=\frac{\frac{1}{n}-HH}{1-\frac{1}{n}}.$$ It is not in a line with definition of normalized Simpson index in the package. Check simple calculations

library(concstats)
# a vector of market shares
share <- c(0.35, 0.4, 0.05, 0.1, 0.06, 0.04)
# raw Simpson index
simpson(share)
# normalized Simpson index from the package
simpson(share,unbiased = T)
# normalized Simpson index? Make it sense?
(simpson(share)-(1-1/length(share)))/(1-1/length(share))

Function grs is implemented in a slightly different way then the way indicated in the reference. GRS does not to be restricted to positive shares. And so, sum(x > 0) should be replaced by length(x) or something similar. Such replacement has an impact on the result.

Function dom has no normalization option. Correctly?

Man files

Vignettes

In a Vignette 'Introduction to Concstats'

schneiderpy commented 1 year ago

Thank you @SebastianWojcik86 for reviewing the package. I have already seen some useful comments and will implement these as soon as I can.

I have added the Ref to the README file.

Best regards

[image: Mailtrack] https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality11& Remitente notificado con Mailtrack https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality11& 08/02/23, 19:51:10

El mar, 7 feb 2023 a las 10:12, SebastianWojcik86 @.***>) escribió:

Dear all. It is my first review. I wanted to do it in an exhaustive way and hope it shall be valuable for the Author. Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest https://devguide.ropensci.org/policies.html#coi for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s): demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions
  • Examples: (that run successfully locally) for all exported functions
  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via @.***).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing: 12

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments General Standards for Statistical Software

G1.0 Statistical Software should list at least one primary reference from published academic literature.

In the list of references in the DESCRIPTION file I have not found Stenbacka index and Simpson. Please indicate where I can find it.

G1.1 Statistical Software should document whether the algorithm(s) it implements are...

Not stated in README file. I guess it should be "The first implementation within R of an algorithm which has previously been implemented in other languages or contexts". I have seen a similar package in python.

All other standards are met or irrelevant for the package. Exploratory Data Analysis and Summary Statistics

All standards are met or irrelevant for the package. Methodology

An argument unbiased explanation as 'Logical. Argument specifying whether or not a finite sample correction should be applied. The default is FALSE.' is misleading. Some measures can be normalized since their value - under equal shares assumption - depends on the number of market participants. Please correct the description of argument unbiased towards idea of normalization of the measure.

There is an inconsistency between required form of 'x' vector. For entropy, shares are calculated from 'x' while for HH index, input is in a form of shares. It should be unified.

Correct me, if I am wrong. Simpson index is defined in the package as a complement of HH index, that is $S=1-HH$. Under equal shares assumption it gives $S=1-\frac{1}{n}$. Following normalization of HH index, normalized Simpson index should be of the form

$$S_{norm}=\frac{S-1+\frac{1}{n}}{1-\frac{1}{n}}=\frac{\frac{1}{n}-HH}{1-\frac{1}{n}}.$$ It is not in a line with definition of normalized Simpson index in the package. Check simple calculations

library(concstats)

a vector of market shares

share <- c(0.35, 0.4, 0.05, 0.1, 0.06, 0.04)

raw Simpson index

simpson(share)

normalized Simpson index from the package

simpson(share,unbiased = T)

normalized Simpson index? Make it sense?

(simpson(share)-(1-1/length(share)))/(1-1/length(share))

Function grs is implemented in a slightly different way then the way indicated in the reference. GRS does not to be restricted to positive shares. And so, sum(x > 0) should be replaced by length(x) or something similar. Such replacement has an impact on the result.

Function dom has no normalization option. Correctly? Man files

  • no citations included
  • for Stenbacka index, I have not found a relevant reference. Is it defined just for two biggest market shares?. I guess that one is fine: Melnik, A., Shy, Oz, Stenbacka, R., (2008), “Assessing market dominance”, Journal of Economic Behavior and Organization 68, pp. 63-72.
  • the most of measures can be described briefly with single formula and understood by majority package users. I encourage the Author to enrich help pages with such description using TEX e.g. eqn and deqn environments. It looks nice :) $$HHI(x)=\sum{i=1}^{n}\left(\frac{x{i}}{\sum{i=1}^{n}x{i}}\right)^2$$
  • description 'computes the numbers equivalent' of the function nrs_eq is very unclear
  • in my opinion examples should be written to print outputs and informative, that is

library(concstats)

a vector of market shares

share <- c(0.35, 0.4, 0.05, 0.1, 0.06, 0.04)

the number of firms with market share

mstruct(share, type = "firm")

Calculate top market share individually

top(share)

Calculate the market structure group measures

mstruct(share, type = "all")

instead

library(concstats)

a vector of market shares

share <- c(0.35, 0.4, 0.05, 0.1, 0.06, 0.04)

the number of firms with market share

share_firm <- mstruct(share, type = "firm")

Calculate top market share individually

share_top <- top(share)

Calculate the market structure group measures

share_mstruct <- mstruct(share, type = "all")

Vignettes

In a Vignette 'Introduction to Concstats'

  • 'Installation' paragraph is redundant since the vignette 'Getting started' covers that issue
  • there is no code producing a plot titled 'Credit cooperatives (type) A'.

— Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/559#issuecomment-1420752820, or unsubscribe https://github.com/notifications/unsubscribe-auth/ASTA3F2EPN4IVMHRCKMH45DWWJC3BANCNFSM6AAAAAARQQEFRQ . You are receiving this because you were mentioned.Message ID: @.***>

schneiderpy commented 1 year ago

G1.0 Statistical Software should list at least one primary reference from published academic literature.

In the list of references in the DESCRIPTION file I have not found Stenbacka index and Simpson. Please indicate where I can find it.

The following REF has already been in the DESCRIPTION file. Melnik, A., Shy, Oz, Stenbacka, R., (2008), “Assessing market dominance”, Journal of Economic Behavior and Organization 68, pp. 63-72. I have just added Simpson

G1.1 Statistical Software should document whether the algorithm(s) it implements are...

Not stated in README file. I guess it should be "The first implementation within R of an algorithm which has previously been implemented in other languages or contexts". I have seen a similar package in python.

I need some advise on this. The Guide states that it should be documented. The README file has a "Prior Art" section. Some functions (e.g. HHI) are already implemented in other packages. For others (e.g. dual of HHI, Stenbacka index .. ) it is a new implementation in R. Since this is a R package I honestly did not check Python, but I also came accross the "concentrationMetrics" package in Python. Therefore I have added the following: "Some popular measures, e.g. Gini or the Herfindahl-Hirschman index have also been implemented in Python."

An argument unbiased explanation as 'Logical. Argument specifying whether or not a finite sample correction should be applied. The default is FALSE.' is misleading. Some measures can be normalized since their value - under equal shares assumption - depends on the number of market participants. Please correct the description of argument unbiased towards idea of normalization of the measure.

I have changed all "unbiased" to normalized now. "specifying whether or not a normalized value is

' required. Ranges from {0, 1} and often used for comparison over time. Must

' be either TRUE or FALSE."

There is an inconsistency between required form of 'x' vector. For entropy, shares are calculated from 'x' while for HH index, input is in a form of shares. It should be unified.

I have unified the required form to "x".

Correct me, if I am wrong. Simpson index is defined in the package as a complement of HH index, that is . Under equal shares assumption it gives. Following normalization of HH index, normalized Simpson index should be of the form ......

I need to check this. Furthermore, it should say "Gini-Simpson index"

Function grs is implemented in a slightly different way then the way indicated in the reference. GRS does not to be restricted to positive shares. And so, sum(x > 0) should be replaced by length(x) or something similar. Such replacement has an impact on the result.

I have to check this too. However, in this context, negative shares make no sense ...

Function dom has no normalization option. Correctly?

This is correct. The Dominnce index has no normalization option (removed this from the script). However, in theory, it is possible, since the dom index is based on the HHI.

Man files no citations included

All man files have now citations and references.

for Stenbacka index, I have not found a relevant reference. Is it defined just for two biggest market shares?. I guess that one is fine: Melnik, A., Shy, Oz, Stenbacka, R., (2008), “Assessing market dominance”, Journal of Economic Behavior and Organization 68, pp. 63-72.

That's correct.

he most of measures can be described briefly with single formula and understood by majority package users. I encourage the Author to enrich help pages with such description using TEX e.g. eqn and deqn environments. It looks nice :)

I need some advise on this point. I have "checked" some other packages, in particular stats packages. However, it seems that including formulae is not "best practice". Probably due to the references where one can "dive" deeper. I have no problem to include some well known formula, e.g. HHI index and others.

description 'computes the numbers equivalent' of the function nrs_eq is very unclear.

Changed to: "\code{concstats_nrs_eq} computes the reciprocal of the HHI, which indicates the equivalent number of firms of the same size"

in my opinion examples should be written to print outputs and informative, that is ... changed to this form (the results are not stored in a named object) library(concstats)

a vector of market shares

share <- c(0.35, 0.4, 0.05, 0.1, 0.06, 0.04)

the number of firms with market share

mstruct(share, type = "firm")

Calculate top market share individually

top(share)

Calculate the market structure group measures

mstruct(share, type = "all")

Vignettes In a Vignette 'Introduction to Concstats'

'Installation' paragraph is redundant since the vignette 'Getting started' covers that issue

That is true. I have removed the "installation" from the "Intro" vignette, and kept it in the "Getting started" vignette.

here is no code producing a plot titled 'Credit cooperatives (type) A'.

The vignette now shows the code of the plot.

SebastianWojcik86 commented 1 year ago

@schneiderpy implemented a majority of my suggests, minority of my suggests is under consideration or revision. I am very glad that my review was taken into account in such scope.

schneiderpy commented 1 year ago

Correct me, if I am wrong. Simpson index is defined in the package as a complement of HH index, that is

. Under equal shares assumption it gives. Following normalization of HH index, normalized Simpson index should be of the form ......

I need to check this. Furthermore, it should say "Gini-Simpson index"

This issue has been addressed now using the "Gini-Simpson index" also known as Gini's diversity index, or Gini impurity in ML. Since this index ranges from {0, 1} there is no normalization option included.

schneiderpy commented 1 year ago

Hello @SebastianWojcik86

Function grs is implemented in a slightly different way then the way indicated in the reference. GRS does not to be restricted to positive shares. And so, sum(x > 0) should be replaced by length(x) or something similar. Such replacement has an impact on the result.

The equation for the "grs measure" has been updated to grs <- as.numeric(sum((length(x) ^ 2 x[1] + 0.3 x ^ 2) / (length(x) ^ 2 + length(x) 0.3 x[1] x) x)) This is to get the implementation of the equation right. However, the implementation of length(x) has no impact on the results, since, in this context, negative shares make no sense, and therefore the input is still restricted to x > 0.

With that said, I have replied to all comments of the reviewer @SebastianWojcik86. Most of them have been implemented to improve the package. Again, thank you @SebastianWojcik86

ropensci-review-bot commented 1 year ago

:calendar: @christopherkenny you have 2 days left before the due date for your review (2023-02-26).

SebastianWojcik86 commented 1 year ago

Thank you @schneiderpy for taking my suggestions into account. I fully accept a current version of the package.

christopherkenny commented 1 year ago

Thank you for the opportunity to review this package. It is well designed and fairly comprehensive. I have many small suggestions and requests for clarification within documentation are detailed below. More importantly, I am concerned about the state of the tests used. Substantial revision is necessary to the testthat tests to ensure that the tests test functions from the package.

Package Review


Compliance with Standards

The following standards currently deemed non-applicable (through tags of @srrstatsNA) could potentially be applied to future versions of this software: None.

Please also comment on any standards which you consider either particularly well, or insufficiently, documented.

For packages aiming for silver or gold badges:

Absent test revision, the standard needed for silver appears to be already met, from my reading of the instructions.


General Review

There seem to be two sets of reviewer instructions, one in the devguide and one in the linked instructions from this issue. I first go through the check-boxes from the devguide and then more extensive comments follow using the linked instructions.

Documentation (devguide)

The package includes all the following forms of documentation:

Functionality (devguide)

Documentation

The package includes all the following forms of documentation:

Algorithms

Testing

Package Design


Estimated hours spent reviewing: 5

schneiderpy commented 1 year ago

Thank you @christopherkenny for reviewing this package with a lot of new input and (still) things to do. Some of these issues already poped up earlier (e.g. dependencies) others are new, and probably due to inexperience. Nevertheless, I'll do my best to address your issues and requests as soon as possible.

ropensci-review-bot commented 1 year ago

:calendar: @SebastianWojcik86 you have 2 days left before the due date for your review (2023-02-28).

Paula-Moraga commented 1 year ago

@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/559#issuecomment-1420752820 time 12

ropensci-review-bot commented 1 year ago

Logged review for SebastianWojcik86 (hours: 12)

Paula-Moraga commented 1 year ago

@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/559#issuecomment-1445204953 time 5

ropensci-review-bot commented 1 year ago

Logged review for christopherkenny (hours: 5)

Paula-Moraga commented 1 year ago

Many thanks @SebastianWojcik86 and @christopherkenny for your time and effort to review the package!

Many thanks @schneiderpy for addressing the comments by @SebastianWojcik86. Looking forward to see the new version implementing the suggestions by @christopherkenny.

ropensci-review-bot commented 1 year ago

@schneiderpy: please post your response with @ropensci-review-bot submit response <url to issue comment> if you haven't done so already (this is an automatic reminder).

Here's the author guide for response. https://devguide.ropensci.org/authors-guide.html

schneiderpy commented 1 year ago

@christopherkenny, apologize for the delay.

Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine. Detailed issues with this are explained below. All tests do pass, but some of the tests do not actually test the package.

Response: That was true. for more details see below.

Community guidelines including contribution guidelines in the README or CONTRIBUTING

It may be worth linking to CONTRIBUTING.md from README.md, as you do mention development briefly in the readme.

Response: There is now a link from README to CONTRIBUTING.md

Throughout the package (documentation, readme, and vignettes), you opt not to document functions following standard practices as fn(), but instead as fn. This is a fine discretionary choice and does not appear to be part of rOpenSci's guidelines for review. However, you do use pkgdown and roxygen2, which makes this a surprising choice, as the functions do not link to the documentation when you do not use fn() notation. It would be useful for users of the package if you allow for the automatic links to documentation to generate by using fn() over fn.

Response: I have changed this according to standard practice. All fs() have now links in the documentation files to make use of the pkgdown and roxygen2 functionality to create autmatic links.

You often make use of @rdname for related families of functions with overlapping applications. This looks well done, but there is a bit of repetition here. You re-document several @params on the same @rdname. For consistency here, the duplicated @params should likely be removed or the @rdname tag should be removed in favor of separate documentation pages

Response: I have considered your suggestion. All @rdname tags have now the name acording to the function, that is, all functions have now their own documentation page. I did this is favor of more readability and it gives me more flexability for future updates (e.g. one script for each function).

In concstats_comp() and concstats_inequ(), normalize should be further documented and potentially checked.

If normalize = NA, it gives a generic R warning. It seems that this should be checked along with the logical and length checks, as you are already checking those and providing a directly helpful error message. Further, normalize is summarily ignored if type = 'all'. This may be intended behavior, but should be documented or checked for. Otherwise, it confidently gives the wrong (or at least unexpected) answer when normalize = TRUE for the gini and entropy measures compared to when normalize = TRUE and type %in% c('gini', 'entropy') or type %in% c('hhi')

Response: Included checks/tests for normalized = NA. Functions like `concstatsall* and functions with type = "all" have now a normalized = normalized argument

Line 40 of R/comp.R references a function (concstats_all()) which does not exist. This should be revised to point to (presumably)concstats_all_comp().

Line 39 of R/inequ.R references a function (concstats_all()) which does not exist. This should be revised to point to (presumably)concstats_all_inequ().

Line 27 of R/mstruct.R references a function (concstats_all()) which does not exist. This should be revised to point to (presumably)concstats_all_mstruct().

Response: functions have been referenced as suggested, e.g. concstats_all_comp(), concstats_all_inequ(), concstats_all_mstruct

Documentation for concstats_concstats() would benefit from the list of measures that will be computed. It indicates "a set of different and selected structural, inequality, and concentration measures" will be computed, but indicating which those are would be useful to users.

Response: All measures included in this set are now mentioned in the documentation/details

x is documented as "@param x a numeric vector of length 1 with non-negative values." throughout R/comp.R, R/concstats.R, R/inequ.R, and R/mstruct.R. This seems incorrect and conflicts with the examples, which always show a numeric vector of positive length.

Response: This has been changed to "x A non-negative numeric vector."

Input classes and lengths are consistently checked. One point of question: When checking x, what is the intended behavior if length(x) == 1? In some cases, this works. In others, such as for concstats_grs(), it fails with an uninformative error.

Response: This has been removed/changed to sum(x) == 1

concstats_constats(), concstats_all_comp(), concstats_all_inequ(), and concstats_all_mstruct() each round the output to 2 decimal digits. This may be a reasonable choice, but it is not clear why 2 decimal digits is appropriate here, but not in the other functions. Should all outputs contain a rounding step or is this step unnecessary? One way to make this more consistent could be to make this optional, controlled by a new argument.

Response: All functions who return a table of the group functions now have an additional argument with default "digits = NULL" as an intermediary between precision and style.

In each of concstats_all_comp(), concstats_all_inequ(), and concstats_all_mstruct(), there is code which runs via invisible(utils::capture.output(...)). It is not clear what the purpose of this is, as there does not appear to be any print statements in the functions in the .... If this is intended to capture some known issue with printing elsewhere, it would seem best to deal with this directly by modifying inputs or providing warnings or errors where appropriate.

Response: Advise on this will be apreciated. The intencion is to compute the individual functions but not return the results one by one, instead "collect" them and return them all in table. I haven't found a better way to code this yet.

However, a concerning number of the tests in the package are inappropriate for testing the package. Hundreds of the tests are testing basic aspects of base R, not testing pieces of the package. Things like testing R's assignment operator, that NAs are NA, or that vectors sum properly. These tests should be removed.

Response: I do agree with respect to NAs, however, I kept tests to check if the vectors sums up to (near) 1, since, in my opinion, this is an essential characteristic of the vector. Probably you do agree that a relative market share of 90% or 130% make no sense. If I don't control for the input the functions will work, but the results will be erronous (due to copy-paste or typing errors). Furthermore, sometimes the user of the package is not the same person who prepared the raw data.

In many calls to expecterror() across all test files it appears that the parentheses are frequently not matched properly, so that what appears to be an argument for regexp is instead passed to isTRUE() or to concstats*(). That will always cause an error, so the test is again not testing aspects of the package, but instead that a base R function performs how it is intended. In many cases, this means that tests designated for particular standards with "@srrstats {G5.2, G5.2a, G5.2b, G5.8, G5.8a, G5.8b}" (e.g. in tests/testthat/test_comp.R l. 64-68) do not appear to test anything that would satisfy those standards for concstats. This is a pervasive problem in the tests that would need substantial revision.

Response: I have reviewed expect_error() (and other expectations) and the parentheses. Some of these were definitly wrong. Some have now this structure: act <- concstats_hhi(x) exp <- concstats_hhi(x4 / sum(x4)) expect_equal(act, exp, tolerance = .Machine$double.eps^0.25)

Others: expect_error(concstats_hhi(sum(x1), 1, tolerance = .Machine$double.eps^0.25)) to satisfy tests and "@srrstats" standards. In addition, in future updates each function will have its own file and tests (as recommended standard) with the intent to make this more readable and make testing easier.

In data-raw/creditcoops.R, on line 3 n() should be dplyr::n(), otherwise the demo data cannot be reproduced (assuming no packages loaded beyond devtools::load_all()).

Response: has been changed to dplyr::n()

Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Response: Of course, both reviewers have been added to the DESCRIPTION file

christopherkenny commented 1 year ago

Hi @schneiderpy, thanks for this. The changes look good to me. Thank you for your attention to the details here.

Re:

Response: Advise on this will be apreciated. The intencion is to compute the individual functions but not return the results one by one, instead "collect" them and return them all in table. I haven't found a better way to code this yet.

It would seem to me that you can just remove the invisible(utils::capture.output(...)) (and inner commas) as they will not return anything to the global environment. You control their return through the table later.

Some extra notes that do not need to be addressed, but may be helpful for future maintenance of the package:

schneiderpy commented 1 year ago

Hi @christopherkenny and thank you for the advise. I have already tested it and it works fine. I will consider the extra notes for future updates. Again, thank you for reviewing the package and the valuable comments and advise.

Paula-Moraga commented 1 year ago

Many thanks @schneiderpy, @christopherkenny and @SebastianWojcik86 for your time and work to improve the package. I am very pleased to approve it!

Paula-Moraga commented 1 year ago

@ropensci-review-bot approve concstats

ropensci-review-bot commented 1 year ago

Approved! Thanks @schneiderpy for submitting and @christopherkenny, @SebastianWojcik86 for your reviews! :grin:

To-dos:

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent).

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @ropensci/blog-editors in your reply. They will get in touch about timing and can answer any questions.

We maintain an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding (with advice on releases, package marketing, GitHub grooming); the guide also feature CRAN gotchas. Please tell us what could be improved.

Last but not least, you can volunteer as a reviewer via filling a short form.

schneiderpy commented 1 year ago

Thank you @Paula-Moraga for the good news! I could not find any useful instruction on how to "transfer the repo to rOpenSci's "ropensci" GitHub organization". Any advise would be apreciated to continue the process.

schneiderpy commented 1 year ago

@ropensci-review-bot finalize transfer of concstats