Closed virgesmith closed 5 years ago
Hey @virgesmith,
Thank you for this detailed review. I will try to address all your points in the following. Some point were already addressed by the other reviewer.
Installation
The package installed correctly via pip. Dependencies are dealt with by setup.py but
scipy
(which is required for some of the algorithms) is not installed automatically, and has to be installed manually. Perhaps this could be automatic, and/or see comment in Documentation section below.
I added an extra-requirement in the setup.py, which enables installing scipy on the fly if wanted by the user.
Done with: https://github.com/GeoStat-Framework/pentapy/commit/8a8beb73c6d15eaa519520c5500628539b6fb658
Software
For solver=1,2 there is no check for singular matrix so you get a
ZeroDivisionError: float division
. Algorithms 3,4,5 report this situation specifically (as either an error or a warning). If performance is not impacted, could you check for this? Or simply catch numerical errors and re-raise with a more suggestive error message(s).
I added a catch block for ZeroDivisionErrors and a warning will be raised, that the chosen algorithm is not suitable for the given matrix. An array of NaNs will be returned in that case.
Done with: https://github.com/GeoStat-Framework/pentapy/commit/f5faf19f248bd0298b1038b3d96349c9c84c3b8a
Tests
I found the solver to work correctly on some basic but inexhaustive tests of my own - discretising the heat equation both tri- and pentadiagonally and comparing the result to an analytic solution.
I think there should be more tests than are currently implemented in the test suite, and they should include some more realistic examples as well as the random matrix tests. There should also be some basic unit tests of the matrix utilities, e.g. check full->banded->full gives the original matrix.
I added tests for matrix conversation, since you are totally right, that this should be checked!
Done with: https://github.com/GeoStat-Framework/pentapy/commit/eed9436e0d482db3fa188d8e5c5e15a55a77159a (some bugfixes afterwards)
I don't really think, that the test suite should include "real" examples, since the test suite should just check for correct calculations, which are best ensured by a random input.
From a code-testing point of view, there is no gain in quality check by using hard coded data in my opinion. What do you think?
Documentation on how to run (and extend) the package tests should be provided in the community guidelines, if the author is willing to accept contributions.
I added a CONTRIBUTING.md to the base folder like this one: https://github.com/GeoStat-Framework/GSTools/blob/master/CONTRIBUTING.md
Done with: https://github.com/GeoStat-Framework/pentapy/commit/c9e6bf8f4b1b333ccb22eca9509870fca135906d
I will update this file in order to explain how to run the tests and how they are written.
Done with: https://github.com/GeoStat-Framework/pentapy/commit/7db6dd347a9d811259ba66d924bf0c3acacd46be
Performance
My tests agree with the claims in the paper. But the code only supports 5 of the 7 algorithms compared in the graphs in the README/readthedocs, could those graphs be made consistent with the one in the paper?
I updated the graph in the README to include the same plot as in the paper.
Done with: https://github.com/GeoStat-Framework/pentapy/commit/74de4c243a90e2dd3fec9fd6e238f5160f124354
Documentation
The README looks a little rushed, e.g. "pentapy is a toolbox to deal with pentadiagonal matrices in Python" is a bit vague and to a large extent the README just duplicates the readthedocs documentation. Personally, I would have a README containing only a statement of intent, authorship and community guidelines, basic installation instructions, and links to detailed documentation and examples (in readthedocs).
I updated the README in order to better explain what the package can do.
Done with: https://github.com/GeoStat-Framework/pentapy/commit/7f3695dd56c4ae81edea41ec7b4a1c51b35cf1f0
I think it is really subjective how the README.md should look like. Within the GeoStat-Framework we decided to have a comprehensive README that is in line with the frontpage of the documentation. I hope you are OK with leaving it that way to have a common look of the packages.
Re: community guidelines, I don't see "clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support".
As mentioned above, I added a CONTRIBUTING.md.
Whilst
help(pentapy)
provides API-level documentation (some typos/spelling issues here), I think this should also be in the main documentation, thesolve
andcreate_full
methods are only dealt with in the one example given. Other methods:diag_indices
,shift_banded
,create_banded
are not mentioned at all.
Did you see, that the documentation also inlcudes a full API-documentation? https://geostat-framework.readthedocs.io/projects/pentapy/en/latest/package.html
Or should there be an explicit tutorial on that? I think the doc-string is quite comprehensive and self-explaining.
The Installation and Requirements sections of the README could be merged and made clearer, esp. that
scipy
is an optional dependency that require.
I added an explanation for the extra requirements in the README.
Done with: https://github.com/GeoStat-Framework/pentapy/commit/73578dacc7843065262f990a83e67ca0d18f94b0
Paper
I think the "unique selling point" of this package is a very siginificant improvement in performance over current implementations/algorithms, and this should be emphasised more in the paper (and the README).
I added another sentence about the performance improvements to the paper.
Done with: https://github.com/GeoStat-Framework/pentapy/commit/868beada4402f94aea43a53699574a8e3ec91421
I hope my proposed changes are convincing.
Sebastian.
@MuellerSeb thanks for addressing all those points. I'm happy now. (My fault re: the API docs I missed them completely.)
Overview
This is a neat and well-defined package that provides a big performance improvement on existing
python
algorithms for solving linear systems of pentadiagonal matrices, and I found it to work as claimed. My main reservation is the documentation, which I think could be improved somewhat, but if the points I make below are addressed/defended satisfactorily, I would be happy to accept it for publication in JOSS.Installation
The package installed correctly via pip. Dependencies are dealt with by setup.py but
scipy
(which is required for some of the algorithms) is not installed automatically, and has to be installed manually. Perhaps this could be automatic, and/or see comment in Documentation section below.Software
For solver=1,2 there is no check for singular matrix so you get a
ZeroDivisionError: float division
. Algorithms 3,4,5 report this situation specifically (as either an error or a warning). If performance is not impacted, could you check for this? Or simply catch numerical errors and re-raise with a more suggestive error message(s).Tests
I found the solver to work correctly on some basic but inexhaustive tests of my own - discretising the heat equation both tri- and pentadiagonally and comparing the result to an analytic solution.
I think there should be more tests than are currently implemented in the test suite, and they should include some more realistic examples as well as the random matrix tests. There should also be some basic unit tests of the matrix utilities, e.g. check full->banded->full gives the original matrix.
Documentation on how to run (and extend) the package tests should be provided in the community guidelines, if the author is willing to accept contributions.
Performance
My tests agree with the claims in the paper. But the code only supports 5 of the 7 algorithms compared in the graphs in the README/readthedocs, could those graphs be made consistent with the one in the paper?
Documentation
The README looks a little rushed, e.g. "pentapy is a toolbox to deal with pentadiagonal matrices in Python" is a bit vague and to a large extent the README just duplicates the readthedocs documentation. Personally, I would have a README containing only a statement of intent, authorship and community guidelines, basic installation instructions, and links to detailed documentation and examples (in readthedocs).
Re: community guidelines, I don't see "clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support".
Whilst
help(pentapy)
provides API-level documentation (some typos/spelling issues here), I think this should also be in the main documentation, thesolve
andcreate_full
methods are only dealt with in the one example given. Other methods:diag_indices
,shift_banded
,create_banded
are not mentioned at all.The Installation and Requirements sections of the README could be merged and made clearer, esp. that
scipy
is an optional dependency that require.Paper
I think the "unique selling point" of this package is a very siginificant improvement in performance over current implementations/algorithms, and this should be emphasised more in the paper (and the README).