[REVIEW]: swyft: Truncated Marginal Neural Ratio Estimation in Python

editorialbot commented 2 years ago

Submitting author: !--author-handle-->@bkmi@pdebuyl<!--end-editor-- Reviewers: @mattpitkin, @olgadoronina Archive: 10.5281/zenodo.6412465

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/2dafe33336397a6d5dd8f3c692eaab22"><img src="https://joss.theoj.org/papers/2dafe33336397a6d5dd8f3c692eaab22/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/2dafe33336397a6d5dd8f3c692eaab22/status.svg)](https://joss.theoj.org/papers/2dafe33336397a6d5dd8f3c692eaab22)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@mattpitkin & @olgadoronina, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @pdebuyl know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

📝 Checklist for @mattpitkin

📝 Checklist for @olgadoronina

editorialbot commented 2 years ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot commented 2 years ago

Checking the BibTeX entries failed with the following error:

No paper file path

editorialbot commented 2 years ago

Software report:

github.com/AlDanial/cloc v 1.88  T=0.28 s (340.3 files/s, 53509.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          47           1185           1115           5292
Jupyter Notebook                14              0           4630            968
reStructuredText                22            250            310            683
YAML                             5             20             28            108
Markdown                         1             27              0             65
TeX                              2              1              0             32
DOS Batch                        1              8              1             26
make                             1              4              6             10
TOML                             1              2              0              8
-------------------------------------------------------------------------------
SUM:                            94           1497           6090           7192
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository

editorialbot commented 2 years ago

Failed to discover a Statement of need section in paper

editorialbot commented 2 years ago

:warning: An error happened when generating the pdf.

pdebuyl commented 2 years ago

@editorialbot set joss-submission-do-not-delete as branch

editorialbot commented 2 years ago

Done! branch is now joss-submission-do-not-delete

pdebuyl commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

pdebuyl commented 2 years ago

Hi @mattpitkin @olgadoronina please read full the instructions at https://github.com/openjournals/joss-reviews/issues/4205#issue-1150402557 , they contain all the necessary information. I remain available in any case.

mattpitkin commented 2 years ago

Review checklist for @mattpitkin

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/undark-lab/swyft?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@bkmi) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of Need' that clearly states what problems the software is designed to solve and who the target audience is?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

mattpitkin commented 2 years ago

@bkmi I've just installed version 0.2.1 of swyft from PyPI. I'm trying to run the Quickstart guide, but on line:

marginal_indices_1d, marginal_indices_2d = swyft.utils.get_corner_marginal_indices(n_parameters)

it gives the error:

AttributeError: module 'swyft.utils' has no attribute 'get_corner_marginal_indices'

I assume this is because v0.2.1 isn't up-to-date. Could you push the latest release to PyPI?

mattpitkin commented 2 years ago

The paper is very nice. I only have a minor suggestion, which is due to my gravitational-wave bias. Seeing as rapid inference for gravitational waves is mentioned, it might be worth citing some of the related work in, e.g., Gabbard et al, Dax et al, Chua & Vallisneri and (a paper including one the co-authors of swyft) Delaunay et al.

bkmi commented 2 years ago

Ah yes. The newest version, the one intended for review, is just a release candidate. I will update it asap on pypi and on github. Then the documentation will also correspond to the easily downloadable version.

Good idea on the citations.

bkmi commented 2 years ago

the pip install swyft should now be addressed and the user would be able to access the documentation and examples in notebooks.

I still need to make the changes w.r.t. the citations!

mattpitkin commented 2 years ago

Great, thanks very much. I'll retry running some of the examples tomorrow and expect to be able to sign-off shortly.

olgadoronina commented 2 years ago

Review checklist for @olgadoronina

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/undark-lab/swyft?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@bkmi) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of Need' that clearly states what problems the software is designed to solve and who the target audience is?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

bkmi commented 2 years ago

I added those citations!

p.s. due to some release issues the version that should be reviewed is actually v0.3.1.post0.

@editorialbot generate pdf

mattpitkin commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

mattpitkin commented 2 years ago

@bkmi I see that the version generation is now via setuptools_scm, which is great. I can get the version number by doing:

from swyft.__version__ import version
print(version)

However, it would be great if the version number could be obtained just by using:

import swyft
print(swyft.__version__)

You can achieve this by adding the following to the swyft/__init__.py file:

try:
    from .__version__ import version as __version__
except ModuleNotFoundError:
    __version__ = ""

bkmi commented 2 years ago

Nice tip! I didn't know about that and I will use it in all my future projects!

I have made the change (along with fixing a from swyft import * bug which I found) in this commit. Would you like me to make an new release for it or should I wait until I incorporate other changes from this review process?

mattpitkin commented 2 years ago

I have successfully installed and run several of the examples and confirmed that I achieve consistent results. I'm happy to sign-off swyft as reviewed.

I have a couple of questions unrelated to the review, one's a stylistic one and the other's more scientific:

would it be possible in the violin plots to pass in the parameter names as labels? From looking at the hist1d code this looks possible, but not necessarily for the violin plots.
(I should probably read the related papers, but...) given that swyft approximates the likelihood to evidence ratio, it is possible in any way to extract the evidence value too?

pdebuyl commented 2 years ago

Thanks @mattpitkin for the review!

bkmi commented 2 years ago

I have successfully installed and run several of the examples and confirmed that I achieve consistent results. I'm happy to sign-off swyft as reviewed.

Thanks for donating your time and expertise!

I have a couple of questions unrelated to the review, one's a stylistic one and the other's more scientific:

would it be possible in the violin plots to pass in the parameter names as labels? From looking at the hist1d code this looks possible, but not necessarily for the violin plots.

This is a good point! I will make an issue and include this in the next release.

(I should probably read the related papers, but...) given that swyft approximates the likelihood to evidence ratio, it is possible in any way to extract the evidence value too?

It isn't really obvious how to do this to me, except through some kind of hacks... If you have the likelihood for a problem (swyft aims to assist in problems in which the likelihood is unknown), in principle you can estimate the likelihood-to-evidence ratio and then compute likelihood * (1 / likelihood-to-evidence ratio). This is actually done as a way to show that the method works in https://arxiv.org/abs/1903.04057, but I wouldn't necessarily trust it for other problems. If you have the likelihood, you can also use nested sampling methods which are perhaps more trustworthy.

bkmi commented 2 years ago

I implemented the change w.r.t. labels!

bkmi commented 2 years ago

I thought I'd just check in and see how things were going with the review @olgadoronina! Any updates?

olgadoronina commented 2 years ago

Hi!, sorry for the delay, I will complete the review as soon as possible

pdebuyl commented 2 years ago

@olgadoronina any news?

olgadoronina commented 2 years ago

@bkmi, I'm really sorry for the delay, I installed and ran tests and examples. I also used my own simple model based on the provided examples. Everything looks good. The package was easy to use and provided examples were really easy to follow. I have completed my review.

bkmi commented 2 years ago

Thank you for the review @olgadoronina!

pdebuyl commented 2 years ago

Thank you @olgadoronina for the review!

@bkmi I will proceed to the editorial steps in the coming days, see https://joss.readthedocs.io/en/latest/editing.html#after-reviewers-recommend-acceptance

bkmi commented 2 years ago

@editorialbot set 10.5281/zenodo.6412465 as archive

editorialbot commented 2 years ago

I'm sorry @bkmi, I'm afraid I can't do that. That's something only editors are allowed to do.

bkmi commented 2 years ago

Oh, oops. Sorry I thought that I was supposed to do that!

The most recent archive is 10.5281/zenodo.6412465 The version is v0.3.2

That is the version / archive which corresponds to all changes requested in this thread. (swyft.__version and violin plot marginal names)

pdebuyl commented 2 years ago

@editorialbot set 10.5281/zenodo.6412465 as archive

editorialbot commented 2 years ago

Done! Archive is now 10.5281/zenodo.6412465

pdebuyl commented 2 years ago

@editorialbot set v0.3.2 as version

editorialbot commented 2 years ago

Done! version is now v0.3.2

pdebuyl commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

pdebuyl commented 2 years ago

Minor comments

In the notebook Examples - 2. Truncation I had to re-run the cell

store.add(n_training_samples + 0.01 * n_training_samples, prior, bound=bound)
store.simulate()
dataset = swyft.Dataset(n_training_samples, prior, store, bound = bound)

twice as else there would not be enough samples. I don't know if it is unexpected though as the sampling is not deterministic.

In the article, there is no section titled "Statement of Need". This is a mandatory part of the article according to JOSS' guidelines. The current section "Motivation" is a good fit for it though, you could simply rename it.
There is mention that the technique would not be convenient to include in existing ML packages. Would it be a possibility in the future, provided some changes are made to an hypothetical good package for inclusion? (PS: not pushing for it, but I think that it is interesting to let the readers -- possibly maintainers of such a package -- what prevents it from happening).
It is convenient that "swyft automates irksome matters like distributed computing and data storage". What does it mean in practice? Is it by using dask? In that case, it could be mentioned explicitly after the quoted sentence.
To download the example notebooks, I cloned the repo. It is probably not much of an issue for the developers of swyft but for prospective users a "download this notebook" link would be great (-> this is only a suggestion).

PS: in case of further updates to the article file, this does not require a new zenodo archive or update to the version.

pdebuyl commented 2 years ago

@bkmi Hi, any update on your side?

bkmi commented 2 years ago

Hey there, thank you for your patience.

I have been at a conference, vacation, then working on another paper deadline.

This is on my list but it may take me a little while to implement the changes. Is that okay?

pdebuyl commented 2 years ago

Hi @bkmi no worry. The reviewers are done, so it is only me and you waiting.

bkmi commented 2 years ago

Minor comments

In the notebook Examples - 2. Truncation I had to re-run the cell
store.add(n_training_samples + 0.01 * n_training_samples, prior, bound=bound)
store.simulate()
dataset = swyft.Dataset(n_training_samples, prior, store, bound = bound)
twice as else there would not be enough samples. I don't know if it is unexpected though as the sampling is not deterministic.

I added a line which noted the non-deterministic nature of the number of samples drawn (poisson distribution) and increased the factor of extra samples which are produced. This should solve the issue.

In the article, there is no section titled "Statement of Need". This is a mandatory part of the article according to JOSS' guidelines. The current section "Motivation" is a good fit for it though, you could simply rename it.

Done!

There is mention that the technique would not be convenient to include in existing ML packages. Would it be a possibility in the future, provided some changes are made to an hypothetical good package for inclusion? (PS: not pushing for it, but I think that it is interesting to let the readers -- possibly maintainers of such a package -- what prevents it from happening).

I added a sentence about how exactly we overcame the issues, thereby clarifying what stands in the way of including TMNRE in their packages in the future.

The sentence reads:

`swyft` does the parallel training of the ratio using another dimension in a PyTorch tensor and created a custom truncated prior data structure to overcome these challenges.

It is convenient that "swyft automates irksome matters like distributed computing and data storage". What does it mean in practice? Is it by using dask? In that case, it could be mentioned explicitly after the quoted sentence.

Indeed, I made the change.

To download the example notebooks, I cloned the repo. It is probably not much of an issue for the developers of swyft but for prospective users a "download this notebook" link would be great (-> this is only a suggestion).

I made the change! you can see it here https://swyft.readthedocs.io/en/latest/notebooks/Quickstart.html

PS: in case of further updates to the article file, this does not require a new zenodo archive or update to the version.

Okay I did change the code as requested so I can make a .post0 release if you think that is appropriate. Please let me know!

pdebuyl commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

pdebuyl commented 2 years ago

@editorialbot check references

editorialbot commented 2 years ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1038/s41586-020-2649-2 is OK
- 10.1038/s41592-019-0686-2 is OK
- 10.21105/joss.03021 is OK
- 10.1109/MCSE.2007.55 is OK
- 10.5281/zenodo.3509134 is OK
- 10.25080/Majora-92bf1922-00a is OK
- 10.5281/zenodo.5712786 is OK
- 10.21105/joss.02505 is OK
- 10.21105/joss.00011 is OK
- 10.1093/bioinformatics/bty361 is OK
- 10.1145/3093172.3093233 is OK
- 10.1093/mnras/stz1960 is OK
- 10.1093/mnras/sty819 is OK
- 10.1093/mnras/stz1900 is OK
- 10.1002/sta4.56 is OK
- 10.1214/20-ba1238 is OK
- 10.1007/s11222-017-9738-6 is OK
- 10.1214/06-ba127 is OK
- 10.1111/j.1365-2966.2007.12353.x is OK
- 10.1111/j.1365-2966.2009.14548.x is OK
- 10.1093/mnras/stv1911 is OK
- 10.1051/0004-6361/201322971 is OK
- 10.1007/s11222-018-9844-0 is OK
- 10.1214/18-ss120 is OK
- 10.1098/rsif.2008.0172 is OK
- 10.1111/j.2517-6161.1984.tb01290.x is OK
- 10.1214/aos/1176346785 is OK
- 10.1093/genetics/145.2.505 is OK
- 10.1007/s11222-009-9116-0 is OK
- 10.2172/4390578 is OK
- 10.1093/biomet/57.1.97 is OK
- 10.1038/s41567-021-01425-7 is OK
- 10.1103/PhysRevLett.127.241103 is OK
- 10.1103/PhysRevLett.124.041102 is OK

MISSING DOIs

- None

INVALID DOIs

- None

openjournals / joss-reviews