openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
697 stars 36 forks source link

[REVIEW]: MarSwitching.jl: A Julia package for Markov switching dynamic models #6441

Closed editorialbot closed 1 month ago

editorialbot commented 5 months ago

Submitting author: !--author-handle-->@m-dadej<!--end-author-handle-- (Mateusz Dadej) Repository: https://github.com/m-dadej/MarSwitching.jl Branch with paper.md (empty if default branch): joss Version: v0.2.3 Editor: !--editor-->@jbytecode<!--end-editor-- Reviewers: @y1my1, @mkitti Archive: 10.5281/zenodo.11546432

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/f0b33a8a4b30b3d9f0184dec014eb388"><img src="https://joss.theoj.org/papers/f0b33a8a4b30b3d9f0184dec014eb388/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/f0b33a8a4b30b3d9f0184dec014eb388/status.svg)](https://joss.theoj.org/papers/f0b33a8a4b30b3d9f0184dec014eb388)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@y1my1 & @mkitti, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @jbytecode know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Checklists

📝 Checklist for @y1my1

📝 Checklist for @mkitti

editorialbot commented 5 months ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 5 months ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1137/141000671 is OK
- 10.2307/1912559 is OK
- 10.1142/S0219024902001523 is OK
- 10.1016/j.ijforecast.2014.03.014 is OK
- 10.1016/j.ejrh.2023.101374 is OK
- 10.1016/j.csfx.2021.100059 is OK
- 10.1016/j.joi.2020.101081 is OK
- 10.32614/rj-2019-012 is OK
- 10.25080/majora-92bf1922-011 is OK
- 10.2139/ssrn.1714016 is OK
- 10.2307/1392086 is OK
- 10.1017/S0305004100030401 is OK
- 10.1016/0304-4076(94)90036-1 is OK

MISSING DOIs

- None

INVALID DOIs

- None
editorialbot commented 5 months ago

Software report:

github.com/AlDanial/cloc v 1.90  T=0.04 s (1236.4 files/s, 180186.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
SVG                              9              0              0           2659
Julia                           11            405            322           1111
Markdown                         8            243              0            902
CSV                              2              0              0            669
YAML                             8              5              8            189
TeX                              2             17              0            178
CSS                              1             12              0             76
R                                2             21              9             44
TOML                             2              4              0             29
Python                           1              5              1             26
MATLAB                           1              7              8             24
JSON                             1              4              0             17
-------------------------------------------------------------------------------
SUM:                            48            723            348           5924
-------------------------------------------------------------------------------

Commit count by author:

   185  SquintRook
    38  Mateusz Dadej
editorialbot commented 5 months ago

:warning: An error happened when generating the pdf. Problem with affiliations for Mateusz Dadej, perhaps the affiliations index need quoting?.

editorialbot commented 5 months ago

Paper file info:

📄 Wordcount for paper.md is 1300

✅ The paper includes a Statement of need section

editorialbot commented 5 months ago

License info:

✅ License found: MIT License (Valid open source OSI approved license)

jbytecode commented 5 months ago

@editorialbot generate pdf

editorialbot commented 5 months ago

:warning: An error happened when generating the pdf. Problem with affiliations for Mateusz Dadej, perhaps the affiliations index need quoting?.

jbytecode commented 5 months ago

@m-dadej - Could you please fix the paper metadata before reviewing process starts? The output of the previous message may help how to indicate the error. Thank you in advance.

m-dadej commented 5 months ago

@editorialbot generate pdf

editorialbot commented 5 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

jbytecode commented 5 months ago

@y1my1, @mkitti - Dear reviewers, you can start with creating your task lists. In that list, there are several tasks. Whenever you perform a task, you can check on the corresponding checkbox. Since the review process of JOSS is interactive, you can always interact with the author, the other reviewers, and the editor during the process. You can open issues and pull requests in the target repo. Please mention the url of this page in there so we can keep tracking what is going on out of our world.

Please create your tasklist by typing

@editorialbot generate my checklist

Thank you in advance.

jbytecode commented 4 months ago

@y1my1, @mkitti After two weeks, may I kindly request our reviewers to begin their review and update their status, please?

y1my1 commented 4 months ago

Review checklist for @y1my1

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

y1my1 commented 4 months ago

Overall, this is a good mature package for achieving what the author intends to do. I could tell the author has put a lot of effort into programming the software and writing the documentation. I have some minor issues with the following

Functionality

I was trying to verify whether the package can replicate the results of a basic regression model I found here (data could be found here). However, the results I got from this package seem to be quite different from what I can get from statsmodel (Python) and Stata (statsmodel and Stata gave almost the same results). See attached graph below

image

I tried to compare the performance of the corresponding methods in the package and statsmodel (Python). I generated a simulated data of 10000 rows, following this example. The results don't seem to be that different, at least on my MacBook (see below). However, I may not be doing it the right way; it would be good if the author could document how he did the performance test.

image

Documentation

There are no community guidelines, but this should be an easy fix.

mkitti commented 4 months ago

Review checklist for @mkitti

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

m-dadej commented 4 months ago

Thank you for your comments y1my1 and sorry for delayed response, I was busy with my Phd.

Regarding the functionality comment. The way you defined a model with MSModel function won't give you the same one as in statsmodels. Here is a correct command:

model = MSModel(df[2:end, 2], 2, exog_switching_vars = df[begin:end-1, 2], switching_var = false)

exog_vars is for non-switching exogenous variables and as the equation in the example suggests the variance of the model is not regime-dependent (thus I specified switching_var = false). A default value is a regime-dependent variance. I am very much open to suggestion on how to make the function more intuitive to use :)

That being said, a current version of the package still gives some of the estimates a bit off than in statsmodels:

image

The one intercept now is way different. I think some time ago I compared it but disregarded the issue as the rest of the examples matches my package. Plus, the Matlab package MS_Regress also had different estimates. But anyway, I checked my code and the issue was solved in the most recent commit (https://github.com/m-dadej/MarSwitching.jl/commit/6fa582df3fedd16ae64e5ffeaa5ba194f362226a). (tl;dr Before the commit, I was adding a small number to state probability to solve an underflow error that have been happening some time ago).

If you install the github version of the package you can now replicate the results from statsmodels.

Regarding the performance comparison. The code for benchmark is in benchmark folder. The readme example can give different results as benchmark (in terms of relative performance), because the model is different (less complex). In the benchmark code I estimate 3 regime, 1 non-switching and 1 switching exogenous variable model.

That being said, I compare the readme example and the relative performance between statsmodel is smaller. MarSwitching is ~2x faster. But I am unable to replicate your results, even though the code on your screen shoot seems to be fine.

About community guidelines, it's a good idea. I will write them soon.

jbytecode commented 4 months ago

@m-dadej - Do you think the first sentence in the article is a bit too assertive? I believe there are other packages in Julia that do similar or identical tasks. Perhaps if you soften the first sentence and refer to packages that perform the same or similar tasks, it would be better.

m-dadej commented 4 months ago

I have not seen any package that do identical task. There is a package to model hidden Markov models (https://github.com/gdalle/HiddenMarkovModels.jl) which may allow similar estimation. However as you can see in its example, it requires additional programming from user in order to estimate a rather simple Markov switching model. Do you have any other package in mind that I missed?

jbytecode commented 4 months ago

@m-dadej - Yes, you can refer to the package, emphasizing that it can address the same types of problems with additional effort, while highlighting that your package is capable of efficiently handling these tasks in a smoother manner rather than a sharp one like the current one. Thank you in advance.

jbytecode commented 3 months ago

@y1my1, @mkitti - May I kindly ask you to update your review status? Thank you in advance.

mkitti commented 3 months ago

@editorialbot generate pdf

mkitti commented 3 months ago

For the examples, an environment including a Project.toml and Manifest.toml should be included with the repository. This would allow user to reproduce the examples with the exact versions of the packages used.

Some of the examples use data included in the package as follows:

df = CSV.read("my_assets/philips.csv", DataFrame, missingstring = "NA")

df = CSV.read("my_assets/df_spx.csv", DataFrame)

I might make those relative to the path of the package that the user installed so that no additional work needs to be done for the user to try the examples. They would otherwise have to clone the repository, but this is not necessary since the files are already installed.

philips_csv_path = joinpath(
    dirname(pathof(MarSwitching)),
    "../docs/src/man/examples/my_assets/philips.csv"
)
df = CSV.read(philips_csv_path, DataFrame, missingstring = "NA")

df_spx_csv_path = joinpath(
    dirname(pathof(MarSwitching)),
    "../docs/src/man/examples/my_assets/df_spx.csv"
)
df = CSV.read(df_spx_csv_path, DataFrame)
mkitti commented 3 months ago

The reference and limitations of HiddenMarkovModels.jl would be greatly appreciated. Users may be trying to decide between the two packages and a comparison of this package that would be very useful.

mkitti commented 3 months ago

I see Community Guidelines in the README.md of the Github repository, but not in the documentation itself.

https://github.com/m-dadej/MarSwitching.jl/blob/main/README.md#contributing

Please include this information in https://m-dadej.github.io/MarSwitching.jl/

mkitti commented 3 months ago

Overall, MarSwitching.jl is an important contribution to the field, allowing Markov switching dynamic models to be simulated and analyzed in the Julia Programming Langauge. Comparisons are provided to packages in other langauges. Comprehensive examples are provided. The documentation and paper are well written and organized.

Additional comparisons and contrasts to existing packages in the Julia ecosystem would be appreciated as others have noted in this review. I also suggested a few minor changes to make it easier to run the examples by referencing data files relative to the installed package.

If the concerns raised above in my own review and those are the other reviewer are addressed, I would recommend the paper for publication.

I apologize for the delay in reviewing the submission.

mkitti commented 3 months ago

Minor correction

MarSwitching.jl is a package for estimating Markov switching dynamic models (also called regime switching) for Julia. This is a class of models with time-varying coefficients depending on an unobservable state/regime that follows [a] Markov process. The package provides tools for estimation, inference and simulation of the models.

The "a" should be inserted.

y1my1 commented 3 months ago

Thank you for your comments y1my1 and sorry for delayed response, I was busy with my Phd.

Regarding the functionality comment. The way you defined a model with MSModel function won't give you the same one as in statsmodels. Here is a correct command:

model = MSModel(df[2:end, 2], 2, exog_switching_vars = df[begin:end-1, 2], switching_var = false)

exog_vars is for non-switching exogenous variables and as the equation in the example suggests the variance of the model is not regime-dependent (thus I specified switching_var = false). A default value is a regime-dependent variance. I am very much open to suggestion on how to make the function more intuitive to use :)

That being said, a current version of the package still gives some of the estimates a bit off than in statsmodels:

image

The one intercept now is way different. I think some time ago I compared it but disregarded the issue as the rest of the examples matches my package. Plus, the Matlab package MS_Regress also had different estimates. But anyway, I checked my code and the issue was solved in the most recent commit (m-dadej/MarSwitching.jl@6fa582d). (tl;dr Before the commit, I was adding a small number to state probability to solve an underflow error that have been happening some time ago).

If you install the github version of the package you can now replicate the results from statsmodels.

Regarding the performance comparison. The code for benchmark is in benchmark folder. The readme example can give different results as benchmark (in terms of relative performance), because the model is different (less complex). In the benchmark code I estimate 3 regime, 1 non-switching and 1 switching exogenous variable model.

That being said, I compare the readme example and the relative performance between statsmodel is smaller. MarSwitching is ~2x faster. But I am unable to replicate your results, even though the code on your screen shoot seems to be fine.

About community guidelines, it's a good idea. I will write them soon.

The performance difference we saw may be because of the different environments we are using. I updated Julia to the newest version (1.10.3) and was able to see MarSwitching runs 3 times faster compared with statsmodel. I would recommend adding a note of what environment is used. In addition, I am not sure I see people use "," as equivalent to "." anywhere, e.g., "6,7", which is hard to read, at least to me.

image image
jbytecode commented 3 months ago

@m-dadej - Please consider the changes and additions suggested by our reviewers, and then ping us again. Thank you in advance.

m-dadej commented 2 months ago

Hey @jbytecode I added changes according to the reveiwer'ssuggestions.

@mkitti project.toml and manifest.toml are now in the examples folder and the scripts now have the path generating function you sugested. The contribution guidelines are also in docs now.

About the HiddenMarkovModels.jl comparison. The last paragraph in statement of need now reads:

Despite the popularity of the method, MarSwitching.jl is, at the moment, the only package that allows for effortless estimation of Markov switching models with Julia programming language. At the same time, it is implemented purely in this language. For more general modeling with hidden Markov models, Julia users may find HiddenMarkovModels.jl [Dalle2024] package useful as well. HiddenMarkovModels.jl offers more generic approach to programming hidden Markov models, albeit requiring user-side development of certain estimation algorithms for Markov switching models, as well as model inference functions.

I did not elaborate further as the paper is already pretty long. Of course I am open to any suggestions.

@y1my1 I added the information regarding the environment. The paragraph below the benchmark table now reads:

Software versions: MarSwitching.jl v0.2.2, Statsmodels v0.14.1, MSwM v1.5, MS_Regress v1.11. The programming languages versions were: Julia v1.10.1, Python v3.12.2, R v4.2.1 and MATLAB vR2024a. Calculations were run on: Windows 11 x64 Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz, 2592 Mhz, 6 Core(s), 12 Logical Processor(s).

The commas are removed as well.

jbytecode commented 2 months ago

@y1my1, @mkitti - Dear reviewers, how are things going on? Is there anything to update? Could you please update your status? Thank you in advance.

jbytecode commented 2 months ago

Hi all! Pinging to see if we have any progress on this submission. Thank you in advance!

mkitti commented 2 months ago

@editorialbot generate pdf

editorialbot commented 2 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

mkitti commented 2 months ago

The mention and comparison to HiddenMarkovModels.jl is very welcome, and I appreciate the author adding this paragraph. Ultimately, I think it will be very helpful for readers in deciding between packages

Despite the popularity of the method, MarSwitching.jl is, at the moment, the only package that allows for effortless estimation of Markov switching models with Julia programming language.

The "effortless" seems hyperbolic here. I would appreciate if the author could replace effortless by something more specific, for example:

MarSwitching.jl is, at the moment, the only package that allows for estimation of Markov switching models with Julia programming language by specifying a minimal set of regime switching parameters.

jbytecode commented 2 months ago

@mkitti - Thank you for the suggestion!

@m-dadej - Could you please take the required action? Thank you in advance!

m-dadej commented 2 months ago

Hey @mkitti, @jbytecode

I replaced the sentence accordingly. I think the one you suggested is ok. Thank you

jbytecode commented 2 months ago

@editorialbot generate pdf

editorialbot commented 2 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

mkitti commented 2 months ago

Thank you for the change. I recommend the manuscript for publication.

jbytecode commented 2 months ago

@mkitti - Thank you so much for your effort in reviewing!

jbytecode commented 2 months ago

@y1my1 - Could you please update your status? Thank you in advance.

y1my1 commented 2 months ago

Sorry for the delay. I have updated the status. I recommend the paper for publication as well. Thanks.

jbytecode commented 2 months ago

@editorialbot check references

editorialbot commented 2 months ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1137/141000671 is OK
- 10.2307/1912559 is OK
- 10.1142/S0219024902001523 is OK
- 10.1016/j.ijforecast.2014.03.014 is OK
- 10.1016/j.ejrh.2023.101374 is OK
- 10.1016/j.csfx.2021.100059 is OK
- 10.1016/j.joi.2020.101081 is OK
- 10.32614/rj-2019-012 is OK
- 10.25080/majora-92bf1922-011 is OK
- 10.2139/ssrn.1714016 is OK
- 10.21105/joss.06436 is OK
- 10.2307/1392086 is OK
- 10.1017/S0305004100030401 is OK
- 10.1016/0304-4076(94)90036-1 is OK

MISSING DOIs

- No DOI given, and none found for title: R: A Language and Environment for Statistical Comp...
- No DOI given, and none found for title: MSwM: Fitting Markov Switching Models

INVALID DOIs

- None
jbytecode commented 2 months ago

@editorialbot generate pdf

editorialbot commented 2 months ago

:warning: An error happened when generating the pdf.

jbytecode commented 2 months ago

@m-dadej - We have a compiling issue of the pdf. The error message indicates

Warning:  Could not convert image /tmp/tex2pdf.-74ed9d6bad7c1236/regime_probs.svg: check that rsvg-convert is in path.
  rsvg-convert: createProcess: posix_spawnp: does not exist (No such file or directory)

so I think it is about rendering the svg image. Is it possible to convert it to png or use the png version of the image?

jbytecode commented 2 months ago

Post-Review Checklist for Editor and Authors

Additional Author Tasks After Review is Complete

Editor Tasks Prior to Acceptance

m-dadej commented 2 months ago

I changed to .png. It should be ok now

@editorialbot generate pdf