SLEPLET: Slepian Scale-Discretised Wavelets in Python (#148)

pyOpenSci / software-submission

Submit your package for review by pyOpenSci here! If you have questions please post them here: https://pyopensci.discourse.group/

89 stars 33 forks source link

SLEPLET: Slepian Scale-Discretised Wavelets in Python (#148) #149

Closed paddyroddy closed 1 week ago

paddyroddy commented 7 months ago

Submitting Author: Patrick J. Roddy (@paddyroddy) All current maintainers: (@paddyroddy) Package Name: SLEPLET One-Line Description of Package: Slepian Scale-Discretised Wavelets in Python Repository Link: https://github.com/astro-informatics/sleplet Version submitted: v1.4.4 Editor: Szymon Moliński (@SimonMolinsky ) Reviewer 1: Shannon Quinn (@magsol ) Reviewer 2: Jakub Tomasz Gnyp (@gnypit ) Archive: 10.5281/zenodo.7268074 JOSS DOI: 10.21105/joss.05221 Version accepted: 1.4.7 Date accepted (month/day/year): 2024/05/21

Code of Conduct & Commitment to Maintain Package

[X] I agree to abide by pyOpenSci's Code of Conduct during the review process and in maintaining my package after should it be accepted.
[X] I have read and will commit to package maintenance after the review as per the pyOpenSci Policies Guidelines.

Description

Include a brief paragraph describing what your package does:

SLEPLET is a Python package for the construction of Slepian wavelets in the spherical and manifold (via meshes) settings. SLEPLET handles any spherical region as well as the general manifold setting. The API is documented and easily extendible, designed in an object-orientated manner. Upon installation, SLEPLET comes with two command line interfaces - sphere and mesh - that allow one to easily generate plots on the sphere and a set of meshes using plotly. Whilst these scripts are the primary intended use, SLEPLET may be used directly to generate the Slepian coefficients in the spherical/manifold setting and use methods to convert these into real space for visualisation or other intended purposes. The construction of the sifting convolution was required to create Slepian wavelets. As a result, there are also many examples of functions on the sphere in harmonic space (rather than Slepian) that were used to demonstrate its effectiveness. SLEPLET has been used in the development of various papers.

Scope

Please indicate which category or categories. Check out our package scope page to learn more about our scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):
- [ ] Data retrieval
- [ ] Data extraction
- [X] Data processing/munging
- [ ] Data deposition
- [ ] Data validation and testing
- [X] Data visualization[^1]
- [ ] Workflow automation
- [ ] Citation management and bibliometrics
- [ ] Scientific software wrappers
- [ ] Database interoperability

Domain Specific & Community Partnerships

- [ ] Geospatial
- [ ] Education
- [ ] Pangeo

Community Partnerships

If your package is associated with an existing community please check below:

[ ] Pangeo
- [ ] My package adheres to the Pangeo standards listed in the pyOpenSci peer review guidebook

[^1]: Please fill out a pre-submission inquiry before submitting a data visualization package.

For all submissions, explain how the and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):
- Who is the target audience and what are scientific applications of this package?

Many fields in science and engineering measure data that inherently live on non-Euclidean geometries, such as the sphere. Techniques developed in the Euclidean setting must be extended to other geometries. Due to recent interest in geometric deep learning, analogues of Euclidean techniques must also handle general manifolds or graphs. Often, data are only observed over partial regions of manifolds, and thus standard whole-manifold techniques may not yield accurate predictions. Slepian wavelets are designed for datasets like these. Slepian wavelets are built upon the eigenfunctions of the Slepian concentration problem of the manifold: a set of bandlimited functions that are maximally concentrated within a given region. Wavelets are constructed through a tiling of the Slepian harmonic line by leveraging the existing scale-discretised framework. Whilst these wavelets were inspired by spherical datasets, like in cosmology, the wavelet construction may be utilised for manifold or graph data.

Are there other Python packages that accomplish the same thing? If so, how does yours differ?

To the author's knowledge, there is no public software that allows one to compute Slepian wavelets (or a similar approach) on the sphere or general manifolds/meshes. SHTools is a Python code used for spherical harmonic transforms, which allows one to compute the Slepian functions of the spherical polar cap. A series of MATLAB scripts exist in slepian_alpha, which permits the calculation of the Slepian functions on the sphere. However, these scripts are very specialised and hard to generalise.

If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted: #148

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

[X] does not violate the Terms of Service of any service it interacts with.
[X] uses an OSI approved license.
[X] contains a README with instructions for installing the development version.
[X] includes documentation with examples for all functions.
[X] contains a tutorial with examples of its essential functions and uses.
[X] has a test suite.
[X] has continuous integration setup, such as GitHub Actions CircleCI, and/or others.

Publication Options

[ ] Do you wish to automatically submit to the Journal of Open Source Software? If so:

JOSS Checks

- [ ] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS. - [ ] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria. - [ ] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`. - [ ] The package is deposited in a long-term repository with the DOI: *Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

[x] Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Confirm each of the following by checking the box.

[X] I have read the author guide.
[X] I expect to maintain this package for at least 2 years and can help find a replacement for the maintainer (team) if needed.

Please fill out our survey

[X] Last but not least please fill out our pre-review survey. This helps us track submission and improve our peer review process. We will also ask our reviewers and editors to fill this out.

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

The editor template can be found here.

The review template can be found here.

NickleDave commented 6 months ago

Hi @paddyroddy I'm adding the initial Editor-in-Chief checks.

The tl;dr is that we are about ready to start but I need you to do two things first:

[ ] Add a code of conduct
[ ] Add a "layperson" summary of the package to the docs page
- The first thing someone should read on the docs is a statement of what the package does--I wouldn't assume someone will bother to click again to go to PyPI. Provide enough background that someone outside your domain will have some understanding of how the package fits into a bigger picture, kind of like the first couple of sentences of the abstract of your thesis.

I will add one more comment below but it is not required that you address it before we start the review

Editor in Chief checks

Hi there! Thank you for submitting your package for pyOpenSci review. Below are the basic checks that your package needs to pass to begin our review. If some of these are missing, we will ask you to work on them before the review process begins.

Please check our Python packaging guide for more information on the elements below.

[x] Installation The package can be installed from a community repository such as PyPI (preferred), and/or a community channel on conda (e.g. conda-forge, bioconda).
- [x] The package imports properly into a standard Python environment import package.
[x] Fit The package meets criteria for fit and overlap.
[ ] Documentation The package has sufficient online documentation to allow us to evaluate package function and scope without installing the package. This includes:
- [x] User-facing documentation that overviews how to install and start using the package.
- [x] Short tutorials that help a user understand how to use the package and what it can do for them.
- [x] API documentation (documentation for your code's functions, classes, methods and attributes): this includes clearly written docstrings with variables defined using a standard docstring format.
[x] Core GitHub repository Files
- [x] README The package has a README.md file with clear explanation of what the package does, instructions on how to install it, and a link to development instructions.
- [x] Contributing File The package has a CONTRIBUTING.md file that details how to install and contribute to the package.
- [ ] Code of Conduct The package has a CODE_OF_CONDUCT.md file.
- [x] License The package has an OSI approved license. NOTE: We prefer that you have development instructions in your documentation too.
[x] Issue Submission Documentation All of the information is filled out in the YAML header of the issue (located at the top of the issue template).
[x] Automated tests Package has a testing suite and is tested via a Continuous Integration service.
[x] Repository The repository link resolves correctly.
[x] Package overlap The package doesn't entirely overlap with the functionality of other packages that have already been submitted to pyOpenSci.

[x] Initial onboarding survey was filled out We appreciate each maintainer of the package filling out this survey individually. :raised_hands: Thank you authors in advance for setting aside five to ten minutes to do this. It truly helps our organization. :raised_hands:

Editor comments

when I first tested that I could import the package, it downloaded a lot of data. Many people might find this surprising. I would suggest adding data directly to the package if possible--if the files are small enough--and/or deferring download of large files

NickleDave commented 6 months ago

@paddyroddy related to that to-do item for the docs: my understanding is that the goal for SLEPLET is to provide general access to the wavelet basis/transform you have developed for incomplete data on manifolds (feel free to correct me if I'm wrong)

Part of what we can do with this review is help you reach a broader audience. But that means we need to know where to look for editors and reviewers.

Can you please help me understand the main domains where you have applied and can apply your methods? I see you are an RSE in an astro group and that you mention cosmology in your thesis; in the papers you apply your methods to topographic data from Earth. Are these methods mainly used by people in cosmology, geophysics, computer imaging, all of the above? Could this work relate to geodesy in earth science?

As examples of places where we might look for editors/reviewers: we work with astropy and software underground. I can also imagine asking people that work on manifolds in general and geometric deep learning, and another place we might look would be users/developers of packages you rely on or mention, e.g. pyssht/py2let. But my sense is that we will want domain experts that can help you translate the methods you've developed for a broader audience.

Any more concrete detail you can give me about domains to look at for editors + reviewers would be helpful. We are looking now, but I want to get your input to make sure this review helps you.

paddyroddy commented 6 months ago

The tl;dr is that we are about ready to start but I need you to do two things first:

* [ ]  Add a code of conduct

* [ ]  Add a "layperson" summary of the package to the docs page

  * The first thing someone should read on the docs is a statement of what the package does--I wouldn't assume someone will bother to click again to go to PyPI. Provide enough background that someone outside your domain will have some understanding of how the package fits into a bigger picture, kind of like the first couple of sentences of the abstract of your thesis.

I've added these. Mine isn't particularly lay, but they are a fairly specialised application, so hopefully from the description it's obvious who the target audience are!

paddyroddy commented 6 months ago

Editor comments

* when I first tested that I could import the package, it downloaded a lot of data. Many people might find this surprising. I would suggest adding data directly to the package if possible--if the files are small enough--and/or deferring download of large files

Annoyingly, I hadn't spotted this, as I cache after the first import. I've had a look and the following are created:

-rw-------    1 paddy staff  74M Nov 24 18:37 EGM2008_Topography_flms_L2190.mat
-rw-------    1 paddy staff 5.6K Nov 24 18:37 meshes_laplacians_basis_functions_bird_b697_eigenvalues.npy
-rw-------    1 paddy staff  15M Nov 24 18:37 meshes_laplacians_basis_functions_bird_b697_eigenvectors.npy
-rw-------    1 paddy staff 5.6K Nov 24 18:37 meshes_laplacians_slepian_functions_bird_b697_N194_eigenvalues.npy
-rw-------    1 paddy staff 1.1M Nov 24 18:37 meshes_laplacians_slepian_functions_bird_b697_N194_eigenvectors.npy
-rw-------    1 paddy staff  132 Nov 24 18:37 slepian_masks_africa_L1.npy
-rw-------    1 paddy staff  132 Nov 24 18:37 slepian_masks_south_america_L1.npy

The problem (in particular) is PyPI doesn't allow over 50MB. I need to work out a way to delay the downloading of EGM2008_Topography_flms_L2190.mat.

paddyroddy commented 6 months ago

The problem (in particular) is PyPI doesn't allow over 50MB. I need to work out a way to delay the downloading of EGM2008_Topography_flms_L2190.mat.

Fixed this in astro-informatics/sleplet#336 (v1.4.5), turns out I had introduced the bug using pydantic.Field(default= values. The data will still be downloaded as and when - but not on initial import. And, as before, it is cached so will only happen once.

paddyroddy commented 6 months ago

@paddyroddy related to that to-do item for the docs: my understanding is that the goal for SLEPLET is to provide general access to the wavelet basis/transform you have developed for incomplete data on manifolds (feel free to correct me if I'm wrong)

Yep that's right!

Can you please help me understand the main domains where you have applied and can apply your methods? I see you are an RSE in an astro group and that you mention cosmology in your thesis; in the papers you apply your methods to topographic data from Earth. Are these methods mainly used by people in cosmology, geophysics, computer imaging, all of the above? Could this work relate to geodesy in earth science?

So I'm actually a general RSE (not strictly working on this), but was formerly a PhD student in an astrophysics group, specialising in cosmology. In reality, these methods could be used by any field in which data is measured only in certain parts of a manifold. Be that: oceanography, geophysics, solar physics, astrophysics, cosmology, computer graphics, machine learning etc. In reality, there are many fields which could use methods such as these.

As examples of places where we might look for editors/reviewers: we work with astropy and software underground. I can also imagine asking people that work on manifolds in general and geometric deep learning, and another place we might look would be users/developers of packages you rely on or mention, e.g. pyssht/py2let. But my sense is that we will want domain experts that can help you translate the methods you've developed for a broader audience.

Those all sound like reasonable places. Would be very interested to have others people's insights. Sorry, I'm not sure how else to help on that front!

NickleDave commented 6 months ago

The problem (in particular) is PyPI doesn't allow over 50MB. I need to work out a way to delay the downloading of EGM2008_Topography_flms_L2190.mat.

Sounds like you fixed the issue overall but jic you're still figuring this out: are you familiar with pooch? A pattern a lot of core scientific Python packages seem to be converging on is "put the data in the package and use importlib-resources if it fits on PyPI; for everything else use pooch". See for example scikit-image: https://github.com/scikit-image/scikit-image/issues/5146 or librosa: https://github.com/librosa/librosa/tree/main/librosa/util/example_data

NickleDave commented 6 months ago

Thank you @paddyroddy for your replies about the domain, very helpful.
I am very sympathetic to developing methods that are useful across domains (FFT anyone?) but just want to make sure we're helping you reach that broader audience.

We are looking for an editor + reviewers now.

paddyroddy commented 6 months ago

Sounds like you fixed the issue overall but jic you're still figuring this out: are you familiar with pooch?

Yes, I'm using pooch...

NickleDave commented 6 months ago

Whoops, guess I could've just looked 😝 😇 We can say I'm trying to be helpful

NickleDave commented 5 months ago

Hi @paddyroddy, happy new year's 🙂 I'm happy to report that @SimonMolinsky will be the editor for this review, I'll let him take it from here. We are now actively seeking reviewers.

SimonMolinsky commented 5 months ago

Hi @paddyroddy !

From now on, I'll be responsible for the review process. :) I've looked into your package, the JOSS publication, and related publications, and everything is outstanding. I agree with @NickleDave that your package needs more usability review from potential users than a technical review. I will look for reviewers in specific communities interested in applying your algorithms in their solution.

paddyroddy commented 5 months ago

Thank you, @SimonMolinsky! Looking forward to the next steps

SimonMolinsky commented 4 months ago

@paddyroddy I'm still looking for reviewers. Hopefully, we will start the review at the beginning of next week! (We have one good soul on board, waiting for the response from other people). Thank you for your patience!

SimonMolinsky commented 4 months ago

Hi @paddyroddy , we have our reviewers. We can start a review process.

SimonMolinsky commented 4 months ago

Editor response to review:

Editor comments

:wave: Hi @magsol and @gnypit ! Thank you for volunteering to review for pyOpenSci! I'm here to help you with the review, so if you need any support, then let me know immediately! But first, look at the steps below :point_down: @magsol you may start as soon as you want; @gnypit will join us after 22nd of February!

Please fill out our pre-review survey

Before beginning your review, please fill out our pre-review survey. This helps us improve all aspects of our review and better understand our community. No personal data will be shared from this survey - it will only be used in an aggregated format by our Executive Director to improve our processes and programs.

[ ] reviewer 1 survey completed.
[ ] reviewer 2 survey completed.

The following resources will help you complete your review:

Here is the reviewers guide. This guide contains all of the steps and information needed to complete your review.
Here is the review template that you will need to fill out and submit here as a comment, once your review is complete.

Please get in touch with any questions or concerns! Your review is due: 2024-03-17

Reviewers: Shannon Quinn, Jakub Tomasz Gnyp Due date: 2024-03-17

SimonMolinsky commented 3 months ago

Hi @magsol , @gnypit !

Please let us know how it's going :)

magsol commented 3 months ago

@SimonMolinsky Just finished the reviewer survey, and am starting to work my way through the Reviewer Guide. Hopefully will have some more detailed updates soon :) (sorry for the delay in starting-- tl;dr it's been a very chaotic past month)

magsol commented 3 months ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README.
[x] Installation instructions: for the development version of the package and any non-standard dependencies in README.
[x] Vignette(s) demonstrating major functionality that runs successfully locally.
[x] Function Documentation: for all user-facing functions.
[x] Examples for all user-facing functions.
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
[x] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements The package meets the readme requirements below:

[x] Package has a README.md file in the root directory.

The README should include, from top to bottom:

[x] The package name
[x] Badges for:
- [x] Continuous integration and test coverage,
- [x] Docs building (if you have a documentation website),
- [x] A repostatus.org badge,
- [x] Python versions supported,
- [x] Current package version (on PyPI / Conda).

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

[x] Short description of package goals.
[x] Package installation instructions
[x] Any additional setup required to use the package (authentication tokens, etc.)
[x] Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
- [x] Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
[x] Link to your documentation website.
[x] If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
[x] Citation information

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

[x] Package documentation is clear and easy to find and use.
[x] The need for the package is clear
[x] All functions have documentation and associated examples for use
[x] The package is easy to install

Functionality

[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software been confirmed.
[x] Performance: Any performance claims of the software been confirmed.
[x] Automated tests:
- [x] All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
- [x] Tests cover essential functions of the package and a reasonable range of inputs and conditions.
[x] Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
[x] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines. A few notable highlights to look at:
- [x] Package supports modern versions of Python and not End of life versions.
- [x] Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

For packages also submitting to JOSS

[x] The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

[x] A short summary describing the high-level functionality of the software
[x] Authors: A list of authors with their affiliations
[x] A statement of need clearly stating problems the software is designed to solve and its target audience.
[x] References: With DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

[x] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:

Somewhere in the 2-3 hours range.

Review Comments

magsol commented 2 months ago

Sorry, it was a very busy past week + weekend. I plan on finishing this review tomorrow. Apologies for the delay!

magsol commented 2 months ago

Ok, I've finished my initial review. I have to admit up front: this is largely outside my area of expertise. While I understand the high-level use case (my research involves a lot of manifold learning and walking, and may occasionally reside in non-Euclidean space), the specifics of Slepian wavelets are not familiar to me.

That said, the package itself seems extremely well put together and maintained. I was able to run the examples and tests without any* problems, and the documentation--while clearly provided for people who understand Slepian wavelets--is well organized.

The only checkbox I couldn't verify was the presence of a repostatus badge, though given that the package itself is already two minor versions more recent than this review version, it's pretty safe to say this package is actively maintained.

* The only caveat here was the CLI example in the README--one of the Zenodo downloads failed mid-stream and crashed the whole example. However, there is already an issue open about this, and it seems like a Zenodo problem, so other than investigating other hosting options for the needed example files, it's easy to say this isn't the fault of the package itself.

paddyroddy commented 2 months ago

Thanks, @magsol! I wasn't aware of repostatus.org, I have added one now in astro-informatics/sleplet#360.

magsol commented 2 months ago

I'm pretty happy with it! I recommend approval :) Thanks @paddyroddy! Great work!

SimonMolinsky commented 2 months ago

@magsol Thank you for your review! It's great to hear that SLEPLET is a well-written package :)

@paddyroddy I heard from @gnypit, and he will review the package soon.

gnypit commented 2 months ago

Dear All,

First and foremost I'd like to apologize for such a late answer. Though unprofessional and thus with no excuse, it has an explanation. I started the review in February with papers at @paddyroddy's Research Gate profile. I've been distracted from March onwards due to a combination of everything that could happen at the same time. I had to change my job, reorganise my wedding party and budget for my moving to London for the next academic year, as part of my family that was supposed to sponsor my postgraduate degree (and therefore paid a very symbolic alimony for multiple years, in spite of come court orders) started backing out on religious grounds, pressuring me to cancel my wedding altogether... It's been a very crazy couple of weeks for me, personally and professionally, taking a toll on my health as well.

Additionally, I wanted to go back to the sleplet package, as the topic itself got me interested greatly. Simon informed me that the functionalities have already been verified, which is great news! Therefore, for me as a mathematician and physicist from different niches, "user experience" and clarity of code were priorities. The overall impression is very good: the code is mainly OOP, with clear organisation of files, their names, and the names of classes, methods and functions. Though I'm no expert in this particular field, I was able to find myself in all the folders and have at least a general idea of what a given file is supposed to be used for. However, I do have some minor remarks when it comes to documentation and comments in the code. I might be wrong, as this repository is dedicated to specialists in a given field - however, from my personal experience many skilled mathematicians, physicists, and others, have very basic programming skills. Therefore guides, documentation https://github.com/astro-informatics/sleplet/blob/main/documentation/DOCUMENTATION.mdand step-by-step examples are in my view an absolute necessity and need expanding. I'm not talking about such elaborate ones, like in the case of Tensorflow, scikit-learn or Transformers, but rather a simple website, such as for PyGAD https://pygad.readthedocs.io/en/latest/ or scipy.stats. Examples could be, at this stage, in the form of Jupyter notebooks, instead of scripts with defined functions which I have found so far. It's my recommendation to create such demonstration notebooks as soon as possible. Moreover, multiple files require additional comments - even snippets in the current documentation practically don't have any. I find it a very common practice, that there are few to no comments in the majority of code and I always suggest adding more. It minimises time required to understand a given file, especially during debugging, when one wants to understand a given part of code without reading the whole file, its dependencies, and so on.

In general, global and local variables are rarely explained - "SNR" in SlepianNoiseAfrica in slepian_noise_africa.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/functions/slepian_noise_africa.py is one of few variables (and acronyms!) that was commented on and defined for a layman, such as myself 😅 Also, _vars.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/_vars.py in slepian folder is a very short code, but it definitely needs some explanation. Even if there is an accepted practice to use "phi" symbol for a given angle, not everybody has to follow it necessarily. Or, if someone is relatively new to the topic, it makes understanding the code very cumbersome, due to time and attention required to find some context in which this or that variable is used, how it's used and what precisely may it represent. Of course these variables were explained in the current documentation, but it would be more convenient to copy-paste these into the proper script or create a link/reference. Even a comment "Explained in the documentation: >link<" would suffice, come to think of that.

In "tests" codes usually have one short comment for a whole function. Same goes for files like flm.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/functions/flm.py, fp.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/functions/fp.py and so on - in these cases even the files' names are quite secretive. At the same time files like noise.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/noise.py have very comprehensive comments in some places - generally I've noticed that files with "methods", "coefficients" and regarding plotting are the most clearly written and I'd suggest using them as a standard for the rest of the library. Including LaTeX formulas (e.g. in squashed_gaussian.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/functions/squashed_gaussian.py, harmonic_gaussian.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/functions/harmonic_gaussian.py, etc.) was a great idea and it'd be great to expand on this while creating documentation - or at least giving some references, perhaps 🤔 Of course, some files really are very clear with just two comments, like _bool_methods.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/_bool_methods.py, but they constitute a minority of the whole package. Some files have messages in them, such as slepian.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/functions/slepian.py, which increases clarity, but specific comments explaining methods like _validate_rank in Slepian class or _compute_angles in SlepianDiracDelta class are in my view preferable.

Finally, some concepts in programming might be very new and difficult to get used to even for semi-advanced programmers in Python, like multiprocessing. I've noticed the _parallel_methods.py https://github.com/astro-informatics/sleplet/blob/main/src/sleplet/_parallel_methods.py using a fork of the multiprocessing library. As there are other libraries and methods to make parallel operations in Python, also depending on the hardware one uses (e.g. CUDA), it might be a good idea to significantly expand examples and explain this part of the library in depth so that if someone uses this part of the library, it is done quite purposefully.

Overall, the biggest challenge before contributors is to create a friendly documentation and embark on a boring and tiresome adventure to add comments. There already are examples of in-depth comments, so there is a clear and present standard to strive for. Apart from this, the library is well-organised, written beautifully in OOP in Python, with some functional parts. I find it intuitive and modern. Also, introducing multiprocessing shows the author's consideration for computing cost, which is a growing necessity in today's scientific world - especially for those who have the skills but cannot afford proper hardware. Should I ever meet someone looking for tools to work with wavelets

I'll make sure with Simon to put it in a proper place as my input to the review.

With kind regards, Jakub T. Gnyp

śr., 20 mar 2024 o 20:29 Simon @.***> napisał(a):

@magsol https://github.com/magsol Thank you for your review! It's great to hear that SLEPLET is a well-written package :)

@paddyroddy https://github.com/paddyroddy I heard from @gnypit https://github.com/gnypit, and he will review the package soon.

— Reply to this email directly, view it on GitHub https://github.com/pyOpenSci/software-submission/issues/149#issuecomment-2010457477, or unsubscribe https://github.com/notifications/unsubscribe-auth/AO4C3VRJ6FLI7UB4C5G2IDTYZHPQRAVCNFSM6AAAAAA7NMCAMWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMJQGQ2TONBXG4 . You are receiving this because you were mentioned.Message ID: @.***>

paddyroddy commented 2 months ago

@gnypit I'm sorry to hear about the recent events in your life. I hope things are getting back in control.

Thank you for constructive feedback regarding comments, documentation and demos.

I have tried to automate the documentation construction as much as possible, and hence it is a bit light in some places. However, just for your information, the parts in which it is documentation-light (as you say) are the sections that are private (i.e. not exposed to the users) indicated by the leading _. This does beg the question of who @pyOpenSci is aiming to, is it users, developers or both? To me, as a user of a package, it is not often I'd be digging around the internals. As a developer, it is something I might do if the situation requires it. I believe all functions (public or private) are named fairly sensibly, and have docstrings.

Regarding comments, I feel this is personal style - I tend to put them when the step is non-intuitive (for me), but I feel it can clutter the code somewhat if it is everywhere.

As for demos, I have various demos in https://github.com/astro-informatics/sleplet/tree/main/examples. Would surrounding these with explanatory prose suffice your request? I'm not a big fan of notebooks personally, but could convert the files in the examples folder to notebooks.

SimonMolinsky commented 2 months ago

Hi @gnypit , sorry to hear you have experienced such things; hopefully, your situation improves!

@gnypit Thank you for your review; just one more thing: could you copy & paste the review template into a comment in this thread and select boxes based on your review: https://www.pyopensci.org/software-peer-review/how-to/reviewer-guide.html#peer-review-template (You should be logged in to GitHub, it won't be possible from email).

@gnypit , @paddyroddy :

However, just for your information, the parts that are documentation-light (as you say) are the sections that are private (i.e., not exposed to the users) indicated by the leading _. This does beg the question of who https://github.com/pyOpenSci is aiming at: users, developers, or both?

Answering this question: docs in private functions are not required. @pyOpenSci focuses on both users and developers - as potential maintainers. Documenting private functions is nice to have but totally optional 😊

gnypit commented 1 month ago

Okay @SimonMolinsky & @paddyroddy - just in case, I'm still dabbling in the functionalities on a Linux VM, but:

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README.
[x] Installation instructions: for the development version of the package and any non-standard dependencies in README.
[x] Vignette(s) demonstrating major functionality that runs successfully locally.
[x] Function Documentation: for all user-facing functions.
[x] Examples for all user-facing functions.
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
[x] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements The package meets the readme requirements below:

[x] Package has a README.md file in the root directory.

The README should include, from top to bottom:

[x] The package name
[x] Badges for:
- [x] Continuous integration and test coverage,
- [x] Docs building (if you have a documentation website),
- [x] A repostatus.org badge,
- [x] Python versions supported,
- [x] Current package version (on PyPI / Conda).

[x] Short description of package goals.
[x] Package installation instructions
[x] Any additional setup required to use the package (authentication tokens, etc.)
[x] Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
- [x] Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
[x] Link to your documentation website.
[x] If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
[x] Citation information

Usability

[x] Package documentation is clear and easy to find and use.
[x] The need for the package is clear
[x] All functions have documentation and associated examples for use
[x] The package is easy to install

Functionality

[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software been confirmed.
[X] Performance: Any performance claims of the software been confirmed.
[X] Automated tests:
- [X] All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
- [X] Tests cover essential functions of the package and a reasonable range of inputs and conditions.
[x] Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
[X] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines. A few notable highlights to look at:
- [X] Package supports modern versions of Python and not End of life versions.
- [x] Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

For packages also submitting to JOSS

[x] The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

[x] A short summary describing the high-level functionality of the software
[x] Authors: A list of authors with their affiliations
[x] A statement of need clearly stating problems the software is designed to solve and its target audience.
[x] References: With DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

[x] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 30h by @gnypit + time spent by @magsol

Review Comments

The overall impression is very good: the code is mainly OOP, with clear organisation of files, their names, and the names of classes, methods and functions. The documentation and examples could use a little more prose around them or comments within them, as already discussed. However, all user-faced parts of code are very thoroughly written and the private functiones, methods, etc. are clearly indicated. Nevertheless, expanding on comments and guides/tutorials might make it easier for both users and contributors to faster and better orient themselves in the library. It doesn't change the fact, that for specialists in the field who are semi-advanced in Python the clarity of the code, intrinsic messages and equations put in LaTeX, together with the required documentation, seem sufficient.

The one concept in programming might be very new and difficult to get used to even for semi-advanced programmers in Python, like multiprocessing. It concernes the private parts of the code, as the _parallel_methods.py is using a fork of the multiprocessing library. As there are other libraries and methods to make parallel operations in Python, also depending on the hardware one uses (e.g. CUDA), it might be a good idea to significantly expand examples and explain this part of the library in depth so that if someone uses this part of the library, it is done quite purposefully. It is, of course, merely a suggestion on what could be done as a part of future development of the library.

Overall, the biggest challenge before contributors is to make the documentation a little more friendly and embark on a boring and tiresome adventure to add comments or alternatives, as already discussed with author during the review. There already are examples of in-depth comments, so there is a clear and present standard to strive for. Apart from this, the library is very well-organised, written beautifully in OOP in Python, with some functional parts. It is intuitive and modern. Also, introducing multiprocessing shows the author's consideration for computing cost, which is a growing necessity in today's scientific world - especially for those who have the skills but cannot afford proper hardware.

Finally, as a personal comment, though I'm still playing around with the functionalities, as I misunderstood the scope of JOSS review, I can already recommend approving the package and will simply let yopu know here in the comments, whenever and whatever else comes up.

paddyroddy commented 1 month ago

The one concept in programming might be very new and difficult to get used to even for semi-advanced programmers in Python, like multiprocessing. It concernes the private parts of the code, as the _parallel_methods.py is using a fork of the multiprocessing library. As there are other libraries and methods to make parallel operations in Python, also depending on the hardware one uses (e.g. CUDA), it might be a good idea to significantly expand examples and explain this part of the library in depth so that if someone uses this part of the library, it is done quite purposefully. It is, of course, merely a suggestion on what could be done as a part of future development of the library.

multiprocess is a fork of multiprocessing as you say. The reason for using it is that it uses dill rather than pickle, which makes it easier to serialise data - and much easier to work with IMO. As mentioned previously, this is in the non-public facing part of the code, so shouldn't be too important for users. However, there is definite scope for speeding up further with CUDA etc., depending on what hardware one has available.

SimonMolinsky commented 1 month ago

@gnypit thank you for your review! As I understand, we're waiting for the performance checks from your side. If you need any support with the point Packaging guidelines: The package conforms to the pyOpenSci [packaging guidelines](https://www.pyopensci.org/python-package-guide). then let me know!

gnypit commented 3 weeks ago

Hello all! As I have recommended package approval already in April, I thought that you're free to go on. I apologise for the confusion. I checked all the boxes in the review template. I triple-checked some minor performance issues that I had; they're caused by problems with my virtual machine and not by the code itself.

@paddyroddy coming back to the multiprocessing, my starting the subject was motivated only by giving some suggestions. Of course it's a private part of the code, so it's a secondary matter for now.

Is there anything else that's required of me @SimonMolinsky ?

SimonMolinsky commented 3 weeks ago

@gnypit Thank you!

@paddyroddy As I see, there are no other changes required. Thus, I'm going to approve the package!

SimonMolinsky commented 3 weeks ago

@paddyroddy Congratulations, SLEPLET has become officially a part of the pyOpenSci family! Thank you for submitting your package here. I will let you know when it will be updated on the pyOS website as soon as it happens :)

@gnypit & @magsol, thank you for your support and time!

paddyroddy commented 3 weeks ago

@paddyroddy coming back to the multiprocessing, my starting the subject was motivated only by giving some suggestions. Of course it's a private part of the code, so it's a secondary matter for now.

Thank you for your comments and review @gnypit!

paddyroddy commented 3 weeks ago

@paddyroddy Congratulations, SLEPLET has become officially a part of the pyOpenSci family! Thank you for submitting your package here. I will let you know when it will be updated on the pyOS website as soon as it happens :)

Great, thank you. Excited to have this recognition!

SimonMolinsky commented 1 week ago

@paddyroddy Thank you again. We are wrapping things up to close the review officially. I've seen on PyPI that you added the pyOS badge to your package. Please fill out the the post-review survey in your spare time :)

And again, you might be interested in writing a blog post about your package. Here are examples and the template:

pandera movingpandas

TEMPLATE: here is a markdown example that you could use as a guide when creating your post.

Last but not least, we are growing fast and looking for editors, reviewers, and experts in their fields. Your knowledge could be invaluable to the organization. So let me know if you would like to join the pyOpenSci community! :slightly_smiling_face:

And something that does not require your input: I've opened a thread about your package on the pyOpenSci Discourse. Your package will soon be visible on the pyOpenSci website, Pull Request with metadata is pending :hourglass:

🎉 SLEPLET has been approved by pyOpenSci! Thank you @paddyroddy for submitting SLEPLET, and many thanks to @magsol and @gnypit for reviewing this package! 😸

Author Wrap Up Tasks

There are a few things left to do to wrap up this submission:

[x] Activate Zenodo watching the repo if you haven't already done so.
[x] Tag and create a release to create a Zenodo version and DOI.
[x] Add the badge for pyOpenSci peer-review to the README.md of SLEPLET. The badge should be [![pyOpenSci](https://tinyurl.com/y22nb8up)](https://github.com/pyOpenSci/software-submission/issues/149).
[x] Please fill out the post-review survey. All maintainers and reviewers should fill this out.

Editor Final Checks

Please complete the final steps to wrap up this review. Editor, please do the following:

[x] Make sure that the maintainers filled out the post-review survey
[x] Invite the maintainers to submit a blog post highlighting their package. Feel free to use / adapt language found in this comment to help guide the author.
[x] Change the status tag of the issue to 6/pyOS-approved6 🚀🚀🚀.
[x] Invite the package maintainer(s) and both reviewers to slack if they wish to join.
[x] If the author submits to JOSS, please continue to update the labels for JOSS on this issue until the author is accepted (do not remove the 6/pyOS-approved label). Once accepted add the label 9/joss-approved to the issue. Skip this check if the package is not submitted to JOSS.
[x] If the package is JOSS-accepted please add the JOSS doi to the YAML at the top of the issue.

If you have any feedback for us about the review process please feel free to share it here. We are always looking to improve our process and documentation in the peer-review-guide.

paddyroddy commented 1 week ago

Thanks, @SimonMolinsky! I've filled in the survey, and will look into the blog post.

paddyroddy commented 1 week ago

@SimonMolinsky I've made a PR for the blog post https://github.com/pyOpenSci/pyopensci.github.io/pull/412

paddyroddy commented 1 week ago

@SimonMolinsky looks like it's on the website now so this issue can be closed?

SimonMolinsky commented 1 week ago

@paddyroddy Yes, you are right! Thanks! I'm closing the issue :)

pyOpenSci / software-submission