openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
717 stars 38 forks source link

[REVIEW]: memochange: An R package for estimation procedures and tests for persistent time series #1820

Closed whedon closed 4 years ago

whedon commented 4 years ago

Submitting author: @KaiWenger (Kai Wenger) Repository: https://github.com/KaiWenger/memochange Version: 1.1.0 Editor: @mbobra Reviewer: @JonasMoss, @steffilazerte Archive: 10.5281/zenodo.3543826

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/2a38b620910689bbdb288e11dc810f06"><img src="https://joss.theoj.org/papers/2a38b620910689bbdb288e11dc810f06/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/2a38b620910689bbdb288e11dc810f06/status.svg)](https://joss.theoj.org/papers/2a38b620910689bbdb288e11dc810f06)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@JonasMoss & @steffilazerte, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @mbobra know.

Please try and complete your review in the next two weeks

Review checklist for @JonasMoss

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @steffilazerte

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

whedon commented 4 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @JonasMoss, @steffilazerte it looks like you're currently assigned to review this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  1. You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf
whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

mbobra commented 4 years ago

👋 @JonasMoss @steffilazerte Thank you for agreeing to review this submission! Whedon generated a checklist and linked a reviewer guide above -- let me know if you have any questions.

KaiWenger commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

steffilazerte commented 4 years ago

Hi @KaiWenger, Hi @Janis410!

I have completed my first round of reviews and have some comments/suggestions for you. To make things clear, I've included a check box where I need feedback or action from you. This R package is solid and and well documented. I found the review straightforward and the examples easy to follow. Thanks!

Authorship

Functionality I installed the CRAN version. I had to install some other libraries first, but this kind of thing is common for my OS (Linux) and was easily fixed. The installation went smoothly and I ran into no problems testing the package.

Documentation I was thoroughly impressed by the amount of documentation that has gone into the functions, especially with respect to referencing different techniques. This is an area that is often skipped or only given lip service and the level of detail here makes the memochange package so much more usable!

Software Paper I am a biologist and R programmer familiar with statistical analyses in general, but not with time series analyses nor economics/finance. I was pleasantly surprised to find that while I found the summary complex, it was still mostly understandable (although don't ask me to explain any of the details!). However, it is a complex write up and I'm curious what @JonasMoss thought.

Don't hesitate to let me know if anything is unclear or you would like more details.


DISCLAIMER: @mbobra I know nothing of the particular analyses being performed here. My review consists of reviewing whether the package runs into errors, has documentation, examples, and compiles correctly. I am able to reproduce the examples, but am unable to assess whether the results returned are, in fact, statistically sound/correct.

JonasMoss commented 4 years ago

Hi @KaiWenger, @Janis410, and @steffilazerte!

First of all congratulations on a well-designed package! There are a few issues but all in all, this looks like a very solid piece of software.

Note: I'm finished looking at the code but haven't read through the paper proper yet. I'll do that tomorrow and probably post a new issue on your Github then.

I second all of @steffilazerte's suggestions. The most important is the addition of automated tests https://github.com/KaiWenger/memochange/issues/6 (or, at the very least some kind of testing). I also believe your imports have to be fixed, see https://github.com/KaiWenger/memochange/issues/4, and its companion problem of confusing imports in examples https://github.com/KaiWenger/memochange/issues/5.

Most of the documentation issues in https://github.com/KaiWenger/memochange/issues/7 and https://github.com/KaiWenger/memochange/issues/8 are pretty minor and should be easy to fix. I do have some suggestions here I believe would make your package better, but it's fine if you don't follow through on, such as making S3 objects for the return values of e.g CUSUMfixed. These could be modeled after the objects in base R, such as stats::t.test. Such objects come in handy for many purposes, such as for extending your API, for adding generics such as plotting, for helping the user familiar with R, et cetera. Still, I will absolutely respect your choice if you decide not to implement them.

The issues with the high-level documentation of the package are more serious than the issues with the nitty-gritty, at least in terms of attracting users! I think it takes a little too much effort to get a good understanding of the why of your package. It's easy to understand the what of your package; it can be used to test for changes in persistence and means. Still, I have a hard time understanding exactly which function I should pick if I ever want to do a test for a change in persistence, even after reviewing your package. I understand that this is too difficult to address directly in the package docs (I imagine it's an active field of research in its own right), but that is why I suggest you give a short overview of all your functions and compare them in a vignette. It would also be nice to have more \seealso{} in the documentation. This would make it easier to browse the docs for the user. https://github.com/KaiWenger/memochange/issues/9.

As I alluded to, the lack of tests is a major issue. As the package now stands, I don't know if it does what it is supposed to. Ideally, the package would have two kinds of tests:

  1. Automated tests using e.g. testthat,
  2. Tests replicated trusted results from scientific publications.

The first one is most important as it highly increases the trust in the software.

I'll be following both my and @steffilazerte issues. If you have any questions or think I'm just horribly wrong about something, don't hesitate to talk there. Feel free to modify the issues to make e.g. checklists.

steffilazerte commented 4 years ago

Agreed :)

I particularly like the suggestion:

  1. Tests replicated trusted results from scientific publications.

If examples of this sort are included, it will help demonstrate to users that the package does what it is supposed to do. If tests of this sort are included, it will help ensure that future changes don't break any existing functionality.

JonasMoss commented 4 years ago

Hi again @KaiWenger, @Janis410!

Here is my paper review.

Paper Review

I think the paper is both clear and informative. It gives an understandable and short overview of the field of research and makes it easy to comprehend what the package does. It is written for a specialist audience though; it reads like a literature review at the beginning of a statistics article. (I would not able to review its quality as a literature review, as I am no specialist in time series analysis and know nothing about this specific area of time series analysis.)

Take a look at the review criteria, which states that the paper should include:

[A] summary describing the high-level functionality and purpose of the software for a diverse, non-specialist audience

Though I highly appreciate your paper (and I absolutely think you shouldn't delete it, you could include it with the package as a vignette for instance) I don't think the paper satisfies the point above.

Some suggestions for writing a paper more friendly to non-specialists is to:

  1. Try to live without some of the references; this is paper should be about "high-level functionality". Your high-level functionality is to check for changes in persistence change in mean under 0 < d < 0.5.
  2. Consider condensing your existing paragraphs considerably: Most of paragraph 2 and paragraph 3 are concerned with details about several testing procedures. Maybe you could say something along the lines of "the package implements the tests of A, B, C, and D." and then be done with it?
  3. Use much more time to explain the problem in the beginning. I think it would be very hard to understand what this package is about for non-specialists.
  4. Perhaps it would help to include a plot of a time series with a change in persistence, such as the one you already have in your readme?
  5. Your "statement of need" reads much less like a literature review and is more in line with what I expect from the "summary" part of a paper in this journal. Maybe you could expand on this to make a new summary?

Other Packages

As @steffilazerte mentions, you should write something about the existence or non-existence of other packages dealing with similar problems and how this package is different from them. I imagine this would include references to packages testing from breakpoints in stationary time series, for instance.

Specific Issues and Comments

These are the comments I have with your summary. Please keep in mind that the paper must be short, so when I say that a point is unclear it might be better to remove it from the paper entirely than trying to explain it.

For an I(d), or long-memory, process with 0 < d < 1, shocks neither die out quickly nor persist infinitely

Put the "0 < d < 1" in front of I(d) or rewrite, i.e if 0 < d < 1 then an I(d) process is a long-memory process. I suggest you drop the terms long-memory, short-memory and difference-stationary as they not used again in the paper and are unlikely to help beginners.

Busetti & Taylor (2004) and Leybourne & Taylor (2004) suggest approaches for testing the null of constant I(0)

You have forgotten to use Latex on I(0).

However, both approaches show serious distortions if neither the null nor the alternative is true, i.e. the series is constant I(1).

This is a surprising "i.e.", as the series could have been I(d) for some d in (0,1): There is no earlier indication in the text that I(0)/I(1) are the only possible alternatives. Consider a rewrite or the use of "e.g.".

Under the null the test assumes constant I(d) behavior with 0 ≤ d < 3/2. The approach suggested by Martins & Rodrigues (2014) is even able to identify changes from −1/2 < d1 < 2 to −1/2 < d2 < 2 with d1 ̸= d2. H

Why would you use or care about other tests then? Does Martins & Rodrigues (2014)'s test have horrible power?

Structural changes cannot only occur in the persistence, but also in the mean of a persistent time series.

I suggest you rewrite this, as it can be read as "structural changes can never only occur in the persistence [...]".

However, these tests for a mean shift are invalidated for d > 0. The problem is that the limiting distributions of the standard tests are different under long memory. Therefore, they reject the null hypothesis of a constant mean with probability of one asymptotically.

Please change "probability of one " to "probability one".

I suggest you remove "invalidated" and write inconsistent instead, as the notion of consistent tests is more well-known and unique than "valid". You could also go with "the tests do not work when d>0" as the summary is intended for a non-specialist audience.

The part with "Therefore, they reject the null hypothesis of a constant mean with probability of one asymptotically." sounds fishy. I understand that limiting distributions can change from context to context, but that does not seem to imply the "therefore" above. For instance, a change from N(0,1) to N(0,1-\epsilon) would not have your claimed property. It's also conceivable the null hypothesis would be accepted with probability 1 in the limit. (I am talking in generality here; i.e all testing problems).

Change-in-mean

A large body of literature handles this topic for weakly dependent I(0) time series.

Could you describe the R-packages made for this problem?

First, whether they are adapted on the CUSUM, sup-Wald, or Wilcoxon test. Second, which type of long-run variance estimator they apply in their test statistics. Early approaches utilize consistent estimates of the long-run variance (Horváth & Kokoszka (1997), Wang (2008), Dehling, Rooch, & Taqqu (2013)), while more recent contributions suggest self-normalized test statistics (Shao (2011), Iacone, Leybourne, & Taylor (2014), Betken (2016), Wenger & Leschinski (2019)). This may be the reason since it is found in Wenger et al. (2019) that self-normalized tests are robust against size distortions.

I believe this paragraph is far too technical for this paper:

First, whether they are adapted on the CUSUM, sup-Wald, or Wilcoxon test.

What are these?

Early approaches utilize consistent estimates of the long-run variance (Horváth & Kokoszka (1997), Wang (2008), Dehling, Rooch, & Taqqu (2013)), while more recent contributions suggest self-normalized test statistics (Shao (2011), Iacone, Leybourne, & Taylor (2014), Betken (2016), Wenger & Leschinski (2019)).

When you write "Early approaches utilize consistent estimates of the long-run variance " does that mean modern approaches use inconsistent estimates? What are self-normalized test statistics?

This may be the reason since it is found in Wenger et al. (2019) that self-normalized tests are robust against size distortions.

What is size distortions?

References

Change "Robinson, P.M." and others to "Robinson, P.M." In "Robinson, P. M., & others. (1995). Gaussian semiparametric estimation of long range dependence."

KaiWenger commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

KaiWenger commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

Janis410 commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

Janis410 commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

KaiWenger commented 4 years ago

Hi @steffilazerte and @JonasMoss,

first of all: thank you very much for your helpful and constructive thoughts and suggestions! We carefully considered all of your suggestions and think that our package improved significantly. We commented on all of your issues in #1, #2, #3, #4, #5, #6, #7, #8, and #9. In summary, we incorporated your remarks as follows:

We hope that both of you are satisfied with our modifications. We really put a lot of effort the last weeks into the improvement of our package according to your suggestions. We are looking forward hearing from you :)

Janis410 commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

JonasMoss commented 4 years ago

@KaiWenger

Your response to our programming comments is great! I left two comments on the Github page that are non-critical.

I really like your new summary paper too. The only issue I can see is that you should cite the strucchange package. Add that and I can check off the last box.

KaiWenger commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

KaiWenger commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

KaiWenger commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

JonasMoss commented 4 years ago

@KaiWenger Please use the Journal of Statistical Software entry from https://cran.r-project.org/web/packages/strucchange/citation.html

KaiWenger commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

KaiWenger commented 4 years ago

@JonasMoss Thanks for your comments! We included the citation for the strucchange package in our software paper and treated your last two comments on our Github page :)

JonasMoss commented 4 years ago

@KaiWenger Great! Just add the DOI to the paper. =) The DOI is 10.18637/jss.v007.i02

mbobra commented 4 years ago

👋 @KaiWenger could you tick off the boxes that you've completed in https://github.com/openjournals/joss-reviews/issues/1820#issuecomment-547599162? Thanks :)

Janis410 commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

steffilazerte commented 4 years ago

Hi @KaiWenger, @Janis410, Wow! So much work in such a short time. Adding the tests, vignettes and annotating the examples makes this a very solid package.

I also like the re-write of the article, it's much more accessible and is very clear about it's purpose.

I have a couple of editorial (and occasionally nit-picky) comments for the article and one comment for the repository:

My final suggestion is up to you:

A really great solid package, thanks!

Janis410 commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...