Closed whedon closed 4 years ago
Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @JonasMoss, @steffilazerte it looks like you're currently assigned to review this paper :tada:.
:star: Important :star:
If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿
To fix this do the following two things:
For a list of things I can do to help you, just type:
@whedon commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
👋 @JonasMoss @steffilazerte Thank you for agreeing to review this submission! Whedon generated a checklist and linked a reviewer guide above -- let me know if you have any questions.
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
Hi @KaiWenger, Hi @Janis410!
I have completed my first round of reviews and have some comments/suggestions for you. To make things clear, I've included a check box where I need feedback or action from you. This R package is solid and and well documented. I found the review straightforward and the examples easy to follow. Thanks!
Authorship
Functionality I installed the CRAN version. I had to install some other libraries first, but this kind of thing is common for my OS (Linux) and was easily fixed. The installation went smoothly and I ran into no problems testing the package.
Documentation
I was thoroughly impressed by the amount of documentation that has gone into the functions, especially with respect to referencing different techniques. This is an area that is often skipped or only given lip service and the level of detail here makes the memochange
package so much more usable!
memochange.R
and annotating the examples (KaiWenger/memochange#3)Software Paper I am a biologist and R programmer familiar with statistical analyses in general, but not with time series analyses nor economics/finance. I was pleasantly surprised to find that while I found the summary complex, it was still mostly understandable (although don't ask me to explain any of the details!). However, it is a complex write up and I'm curious what @JonasMoss thought.
code font
(i.e., use the ticks `)[see @robinson1995gaussian, ...]
(I think this should remove the double brackets around citations)Don't hesitate to let me know if anything is unclear or you would like more details.
DISCLAIMER: @mbobra I know nothing of the particular analyses being performed here. My review consists of reviewing whether the package runs into errors, has documentation, examples, and compiles correctly. I am able to reproduce the examples, but am unable to assess whether the results returned are, in fact, statistically sound/correct.
Hi @KaiWenger, @Janis410, and @steffilazerte!
First of all congratulations on a well-designed package! There are a few issues but all in all, this looks like a very solid piece of software.
Note: I'm finished looking at the code but haven't read through the paper proper yet. I'll do that tomorrow and probably post a new issue on your Github then.
I second all of @steffilazerte's suggestions. The most important is the addition of automated tests https://github.com/KaiWenger/memochange/issues/6 (or, at the very least some kind of testing). I also believe your imports have to be fixed, see https://github.com/KaiWenger/memochange/issues/4, and its companion problem of confusing imports in examples https://github.com/KaiWenger/memochange/issues/5.
Most of the documentation issues in https://github.com/KaiWenger/memochange/issues/7 and https://github.com/KaiWenger/memochange/issues/8 are pretty minor and should be easy to fix. I do have some suggestions here I believe would make your package better, but it's fine if you don't follow through on, such as making S3 objects for the return values of e.g CUSUMfixed
. These could be modeled after the objects in base R
, such as stats::t.test
. Such objects come in handy for many purposes, such as for extending your API, for adding generics such as plotting, for helping the user familiar with R
, et cetera. Still, I will absolutely respect your choice if you decide not to implement them.
The issues with the high-level documentation of the package are more serious than the issues with the nitty-gritty, at least in terms of attracting users! I think it takes a little too much effort to get a good understanding of the why of your package. It's easy to understand the what of your package; it can be used to test for changes in persistence and means. Still, I have a hard time understanding exactly which function I should pick if I ever want to do a test for a change in persistence, even after reviewing your package. I understand that this is too difficult to address directly in the package docs (I imagine it's an active field of research in its own right), but that is why I suggest you give a short overview of all your functions and compare them in a vignette. It would also be nice to have more \seealso{}
in the documentation. This would make it easier to browse the docs for the user. https://github.com/KaiWenger/memochange/issues/9.
As I alluded to, the lack of tests is a major issue. As the package now stands, I don't know if it does what it is supposed to. Ideally, the package would have two kinds of tests:
testthat
,The first one is most important as it highly increases the trust in the software.
I'll be following both my and @steffilazerte issues. If you have any questions or think I'm just horribly wrong about something, don't hesitate to talk there. Feel free to modify the issues to make e.g. checklists.
Agreed :)
I particularly like the suggestion:
- Tests replicated trusted results from scientific publications.
If examples of this sort are included, it will help demonstrate to users that the package does what it is supposed to do. If tests of this sort are included, it will help ensure that future changes don't break any existing functionality.
Hi again @KaiWenger, @Janis410!
Here is my paper review.
I think the paper is both clear and informative. It gives an understandable and short overview of the field of research and makes it easy to comprehend what the package does. It is written for a specialist audience though; it reads like a literature review at the beginning of a statistics article. (I would not able to review its quality as a literature review, as I am no specialist in time series analysis and know nothing about this specific area of time series analysis.)
Take a look at the review criteria, which states that the paper should include:
[A] summary describing the high-level functionality and purpose of the software for a diverse, non-specialist audience
Though I highly appreciate your paper (and I absolutely think you shouldn't delete it, you could include it with the package as a vignette for instance) I don't think the paper satisfies the point above.
Some suggestions for writing a paper more friendly to non-specialists is to:
As @steffilazerte mentions, you should write something about the existence or non-existence of other packages dealing with similar problems and how this package is different from them. I imagine this would include references to packages testing from breakpoints in stationary time series, for instance.
These are the comments I have with your summary. Please keep in mind that the paper must be short, so when I say that a point is unclear it might be better to remove it from the paper entirely than trying to explain it.
For an I(d), or long-memory, process with 0 < d < 1, shocks neither die out quickly nor persist infinitely
Put the "0 < d < 1" in front of I(d) or rewrite, i.e if 0 < d < 1 then an I(d) process is a long-memory process. I suggest you drop the terms long-memory, short-memory and difference-stationary as they not used again in the paper and are unlikely to help beginners.
Busetti & Taylor (2004) and Leybourne & Taylor (2004) suggest approaches for testing the null of constant I(0)
You have forgotten to use Latex on I(0).
However, both approaches show serious distortions if neither the null nor the alternative is true, i.e. the series is constant I(1).
This is a surprising "i.e.", as the series could have been I(d) for some d in (0,1): There is no earlier indication in the text that I(0)/I(1) are the only possible alternatives. Consider a rewrite or the use of "e.g.".
Under the null the test assumes constant I(d) behavior with 0 ≤ d < 3/2. The approach suggested by Martins & Rodrigues (2014) is even able to identify changes from −1/2 < d1 < 2 to −1/2 < d2 < 2 with d1 ̸= d2. H
Why would you use or care about other tests then? Does Martins & Rodrigues (2014)'s test have horrible power?
Structural changes cannot only occur in the persistence, but also in the mean of a persistent time series.
I suggest you rewrite this, as it can be read as "structural changes can never only occur in the persistence [...]".
However, these tests for a mean shift are invalidated for d > 0. The problem is that the limiting distributions of the standard tests are different under long memory. Therefore, they reject the null hypothesis of a constant mean with probability of one asymptotically.
Please change "probability of one " to "probability one".
I suggest you remove "invalidated" and write inconsistent instead, as the notion of consistent tests is more well-known and unique than "valid". You could also go with "the tests do not work when d>0" as the summary is intended for a non-specialist audience.
The part with "Therefore, they reject the null hypothesis of a constant mean with probability of one asymptotically." sounds fishy. I understand that limiting distributions can change from context to context, but that does not seem to imply the "therefore" above. For instance, a change from N(0,1) to N(0,1-\epsilon) would not have your claimed property. It's also conceivable the null hypothesis would be accepted with probability 1 in the limit. (I am talking in generality here; i.e all testing problems).
A large body of literature handles this topic for weakly dependent I(0) time series.
Could you describe the R
-packages made for this problem?
First, whether they are adapted on the CUSUM, sup-Wald, or Wilcoxon test. Second, which type of long-run variance estimator they apply in their test statistics. Early approaches utilize consistent estimates of the long-run variance (Horváth & Kokoszka (1997), Wang (2008), Dehling, Rooch, & Taqqu (2013)), while more recent contributions suggest self-normalized test statistics (Shao (2011), Iacone, Leybourne, & Taylor (2014), Betken (2016), Wenger & Leschinski (2019)). This may be the reason since it is found in Wenger et al. (2019) that self-normalized tests are robust against size distortions.
I believe this paragraph is far too technical for this paper:
First, whether they are adapted on the CUSUM, sup-Wald, or Wilcoxon test.
What are these?
Early approaches utilize consistent estimates of the long-run variance (Horváth & Kokoszka (1997), Wang (2008), Dehling, Rooch, & Taqqu (2013)), while more recent contributions suggest self-normalized test statistics (Shao (2011), Iacone, Leybourne, & Taylor (2014), Betken (2016), Wenger & Leschinski (2019)).
When you write "Early approaches utilize consistent estimates of the long-run variance " does that mean modern approaches use inconsistent estimates? What are self-normalized test statistics?
This may be the reason since it is found in Wenger et al. (2019) that self-normalized tests are robust against size distortions.
What is size distortions?
Change "Robinson, P.M." and others to "Robinson, P.M." In "Robinson, P. M., & others. (1995). Gaussian semiparametric estimation of long range dependence."
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
Hi @steffilazerte and @JonasMoss,
first of all: thank you very much for your helpful and constructive thoughts and suggestions! We carefully considered all of your suggestions and think that our package improved significantly. We commented on all of your issues in #1, #2, #3, #4, #5, #6, #7, #8, and #9. In summary, we incorporated your remarks as follows:
Statement_of_Author_Contributions.md
.package::function()
notation,
(b) we be more precise in explaining the arguments and return values of all functions,
(c) we explain the function's examples in more detail, and
(d) we changed the DESCRIPTION
and .Rbuildignore
files according to your suggestions.testthat
tests for all of our functions. We test for the most reasonable combinations of use cases and it is checked whether the implemeted tests produce reasonable results. Second, we reproduce trustable results from published scientific papers in the files /testing/Simulation_Change_in_Mean
and /testing/Simulation_Break_in_Persistence.Rmd.
Precisely, we reproduce Monte Carlo simulation studies as well as simulations of critical values of the papers that published the tests we implement. We think that both tests increase the trust in our software significantly (see also #3, #6).CONTRIBUTING.md
file to make clear that software contributions are welcome (#3).README
: we shortened it to be short and on the point, using little code and not much text. Furthermore, for the change-in-mean tests we include a simple real data example (#2, #3, #9).README
and paper.md
in there, but also explained the functions and tests in much more detail and in many cases in an easier way (#3, #9).paper.md
to explain the functionality and purpose of our package "for a diverse, non-specialist audience" as stated in the review criteria of JOSS. As already mentioned, we use many parts of the former paper.md
in our new vignettes. The new software paper is written in a much better way for a non-specialist audience. It further includes a comment on other packages for change in mean and change in persistence. We also elaborate on the statement of need at the end of the paper and shortened the paper. Since the link to our github repository is in the header of the software paper, we did not include a link once again at the bottom of the paper.We hope that both of you are satisfied with our modifications. We really put a lot of effort the last weeks into the improvement of our package according to your suggestions. We are looking forward hearing from you :)
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
@KaiWenger
Your response to our programming comments is great! I left two comments on the Github page that are non-critical.
I really like your new summary paper too. The only issue I can see is that you should cite the strucchange
package. Add that and I can check off the last box.
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
@KaiWenger Please use the Journal of Statistical Software entry from https://cran.r-project.org/web/packages/strucchange/citation.html
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
@JonasMoss Thanks for your comments! We included the citation for the strucchange package in our software paper and treated your last two comments on our Github page :)
@KaiWenger Great! Just add the DOI to the paper. =) The DOI is 10.18637/jss.v007.i02
👋 @KaiWenger could you tick off the boxes that you've completed in https://github.com/openjournals/joss-reviews/issues/1820#issuecomment-547599162? Thanks :)
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
Hi @KaiWenger, @Janis410, Wow! So much work in such a short time. Adding the tests, vignettes and annotating the examples makes this a very solid package.
I also like the re-write of the article, it's much more accessible and is very clear about it's purpose.
I have a couple of editorial (and occasionally nit-picky) comments for the article and one comment for the repository:
strucchange
package (ref), which also identifies changes in mean. However, the procedures implemented in the memochange package allow for valid inference in a more general setting."My final suggestion is up to you:
A really great solid package, thanks!
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
Submitting author: @KaiWenger (Kai Wenger) Repository: https://github.com/KaiWenger/memochange Version: 1.1.0 Editor: @mbobra Reviewer: @JonasMoss, @steffilazerte Archive: 10.5281/zenodo.3543826
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@JonasMoss & @steffilazerte, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @mbobra know.
✨ Please try and complete your review in the next two weeks ✨
Review checklist for @JonasMoss
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @steffilazerte
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper