Open editorialbot opened 3 months ago
Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.
For a list of things I can do to help you, just type:
@editorialbot commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@editorialbot generate pdf
Software report:
github.com/AlDanial/cloc v 1.90 T=0.15 s (764.3 files/s, 298530.4 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
Python 67 3037 4303 8814
Jupyter Notebook 30 0 24675 3907
Markdown 10 99 0 282
TeX 1 25 0 199
reStructuredText 2 152 208 153
YAML 5 20 7 147
DOS Batch 1 8 1 26
make 1 4 7 9
TOML 1 1 0 4
-------------------------------------------------------------------------------
SUM: 118 3346 29201 13541
-------------------------------------------------------------------------------
Commit count by author:
344 Scott Linderman
178 Peter G. Chang
135 xinglong
131 Kevin P Murphy
83 karalleyna
79 gileshd
69 Gerardo Duran-Martin
60 petergchang
26 Caleb Weinreb
18 libby
14 kostastsa
13 slinderman
10 Elizabeth DuPre
8 Kevin Murphy
6 andrewwarrington
6 davidzoltowski
6 patel-zeel
4 Ravin Kumar
3 Aleyna Kara
2 Yixiu Zhao
2 Zeel B Patel
2 partev
1 Collin Schlager
1 Jake VanderPlas
1 Jason Davies
1 RaulPL
1 Xinglong
1 Xinglong Li
1 xinglong-li
Paper file info:
📄 Wordcount for paper.md
is 960
✅ The paper includes a Statement of need
section
License info:
✅ License found: MIT License
(Valid open source OSI approved license)
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1146/annurev-neuro-092619-094115 is OK
- 10.1017/CBO9781139344203 is OK
- 10.48550/arXiv.2305.16543 is OK
- 10.48550/arXiv.2306.03291 is OK
- 10.48550/arXiv.2305.19535 is OK
- 10.1038/s41592-024-02318-2 is OK
- 10.25080/majora-92bf1922-011 is OK
- 10.1017/cbo9780511790492 is OK
- 10.1016/j.tree.2007.10.009 is OK
- 10.1198/073500102753410408 is OK
- 10.3402/tellusa.v56i5.14462 is OK
- 10.1145/355656.355657 is OK
- 10.1109/TAC.2020.2976316 is OK
- 10.1109/TSP.2021.3103338 is OK
MISSING DOIs
- No DOI given, and none found for title: Probabilistic Machine Learning: Advanced Topics
- No DOI given, and none found for title: JAX: composable transformations of Python+NumPy pr...
- No DOI given, and none found for title: PyHSMM: Bayesian inference in HSMMs and HMMs
- No DOI given, and none found for title: Code Companion for Bayesian Filtering and Smoothin...
- No DOI given, and none found for title: SSM: Bayesian Learning and Inference for State Spa...
- No DOI given, and none found for title: JSL: JAX State-Space models (SSM) Library
- No DOI given, and none found for title: hmmlearn
- No DOI given, and none found for title: Structural Time Series (STS) in JAX
INVALID DOIs
- None
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
👋 @gdalle, @thomaspinder, could you please update us on how it's going with your reviews?
@osorensen expecting to be done by September 1st.
Haven't started it yet so I will probably need two more weeks.
Firstly, I'd like to congratulate the authors and contributors of Dynamax for creating a nicely designed package in JAX. Dynamax enables practitioners to easily fit state-space models (SSM), whilst simultaneously allowing researchers to implement their own custom SSM approaches. The repo's corresponding paper is well written and provides a clear and concise summary of the paper. Finally, I enjoyed reading the documentation of Dynamax; the large number of examples and use-cases is great for new users of the package.
I have broken my review up into two sections. The first section lists the blocking issues I have that prevent me from being able to mark each item of the reviewer's checklist as complete. In the second section, I suggest some things that I would advise the authors to do in Dynamax; however, I do not consider these as blocking concerns.
For me, running
git clone git@github.com:probml/dynamax.git
cd dynamax
pip install -e '.[dev]'
pytest dynamax
as per the docs, threw errors on a Mac M1. I was using Python 3.10 and a fresh virtual environment.
On this topic though, I see that you're Github testing workflow uses a pinned Python version and machine. It could be good to run your tests on all supported Python versions and a Mac and Linux machine. I have opened a PR with a suggestion that reflects this comment.
The docstrings within Dynamax are inconsistent. Taking, for example, dynamax.generalized_gaussian_ssm.inference
as an example file, there are no docstrings for the methods or arguments of the public classes, yet there are comprehensive docstrings for the private functions. This is unconventional, and I would encourage the authors to more widely document the classes, methods, and functions of Dynamax so as to make the package easier for people to use. Additionally, typing such as the below from dynamax.generalized_gaussian_ssm.models
is not particularly descriptive.
:param initial_mean: $m$
:param initial_covariance: $S$
It would be good to see more comprehensive documentation. To rigorously test documentation thresholds in the future, you may consider using interrogate
. This will measure the number of objects for which a docstring is supplied and raise errors/warnings when the coverage decreases. I have opened a suggestive PR for how this can be achieved.
33.1% of classes/methods/functions are covered by tests. It would be good to see this increased. This was calculated with interrogate dynamax
.
The documentation is broken and many notebooks do not render - Example. My suggestion would be to use something like nbsphinx to execute the notebooks each time the documentation is built, and to fail when a notebook does not execute top-to-bottom.
Your contribution guidelines are incomplete. To push a change, one would need to add and commit the change before pushing - this is missing from the document.
The package has no typing. Additionally, many functions use variable names which are very hard to interpret e.g., def _predict(m, P, f, Q, u, g_ev, g_cov)
in dynamax.generalized_gaussian_ssm.inference
. The combination of no typing and terse variable naming makes the code challenging to read, particularly for new users and/or those new to SSMs. I would strongly encourage the authors to either use more descriptive variable names, or add typing to their code. JaxTyping is excellent for this.
Some of the typing used is incorrectly done. Take ParamsGGSSM.initial_mean
in dynamax.generalized_gaussian_ssm.models
as an example. In this function, the typing on initial_mean
is incorrect, as outlined here in JaxTyping. A change to the effect of
- initial_mean: Float[Array, "state_dim"]
+ initial_mean: Float[Array, " state_dim"]
should be made.
You have some very large git pack files:
❯ du -ah dynamax | sort -rh | head -n 10
224M dynamax
209M dynamax/.git/objects/pack
209M dynamax/.git/objects
209M dynamax/.git
208M dynamax/.git/objects/pack/pack-aaa194639683ed252462b75ee0ec2203edc5c09f.pack
You may consider running git gc
to clean these up unless you need them.
I would suggest tighter Python bounds. Currently your lower bound is 3.6; a version which is considered end-of-life by the Python Org.: https://devguide.python.org/versions/#unsupported-versions As an aside, I appreciate that it's specified in your setup.py, but explicitly stating the supported Python version(s) in your project's README is nice.
It is strongly advised to migrate setup.py
and setup.cfg
files to a pyproject.toml
. I would encourage the authors to do this in order to future-proof Dynamax.
Thanks a lot for your thorough review, @thomaspinder. @slinderman, you're welcome to start addressing the issues whenever you like.
Thank you, @osorensen and @thomaspinder! We will start working on these issues and suggestions ASAP, and I'll keep you posted.
Meanwhile I'm making my way through my own review. I will gather my remarks about code and documentation in two issues on the dynamax repo:
I might be limited by my very ancient Python skills for installation and testing of dynamax, so I'd appreciate a hand in figuring out the errors I observe.
As for the paper itself, it is very clear and does a good job of introducing dynamax. I have three minor suggestions:
Dynamax supports canonical SSMs and allows the user to construct bespoke models as needed.
Can you give more details on how this is implemented API-wise? For instance, how generic can observation distributions be? This question was a major motivation for my own take on HMMs, coded in Julia (https://joss.theoj.org/papers/10.21105/joss.06436). Hopefully I didn't misrepresent dynamax in the state of the art there.
Dynamax provides a unique combination of low-level inference algorithms and high-level modeling objects that can support a wide range of research applications in JAX.
Would it be possible to provide concrete examples of what is missing from the previous libraries? Obviously JAX support is a big aspect, since I know that some of them are coded in Numpy (hmmlearn) or PyTorch (pomegranate)
Parallel message passing routines that leverage GPU or TPU acceleration to perform message passing in sublinear time.
What do you mean by sublinear time? Isn't it just (roughly speaking) total sequential time divided by amount of parallelism?
I'm halfway through the documentation and a significant number of code examples are broken by the Numpy 2.0 release (about 1 per page). I think it would be a good idea to fix them before I make a second (complete) pass.
@slinderman, ref the post by @gdalle, could you please follow up and ping us here when done?
Hi @osorensen, thanks for the reminder. @gdalle, sorry for the rendering issue. I thought we had fixed this by pinning to Numpy<2.0, but somehow this slipped through the cracks. I will fix it and let you know when it's ready for another pass.
@slinderman any updates on this?
Hi @osorensen, sorry for the delay. We pinned Numpy < 2.0 and recreated the documentation, and now almost all of the rendering issues should be fixed. There is a known issue with the HMC notebook (see https://github.com/probml/dynamax/issues/384), but the rest look good.
Regarding the overall review progress, we have begun work on the feedback we received from @thomaspinder. It is taking more time than anticipated to fix the typing issues, but we aim to wrap that up in the coming weeks. We greatly appreciate the reviewers' feedback, and we appreciate their patience while we work to address their comments.
Thanks for the update, @slinderman
@slinderman thanks for the fixes. Can you tag a new release containing them, so that running the documentation examples does not require cloning the repo?
Hi @gdalle, of course. Please see https://github.com/probml/dynamax/releases/tag/0.1.5, which is now available on PyPI as well.
Submitting author: !--author-handle-->@slinderman<!--end-author-handle-- (Scott Linderman) Repository: https://github.com/probml/dynamax Branch with paper.md (empty if default branch): paper Version: v0.1.4 Editor: !--editor-->@osorensen<!--end-editor-- Reviewers: @thomaspinder, @gdalle Archive: Pending
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@thomaspinder & @gdalle, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @osorensen know.
✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨
Checklists
📝 Checklist for @gdalle
📝 Checklist for @thomaspinder