openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
700 stars 36 forks source link

[REVIEW]: swyft: Truncated Marginal Neural Ratio Estimation in Python #4205

Closed editorialbot closed 2 years ago

editorialbot commented 2 years ago

Submitting author: !--author-handle-->@bkmi<!--end-author-handle-- (Benjamin Miller) Repository: https://github.com/undark-lab/swyft Branch with paper.md (empty if default branch): joss-submission-do-not-delete Version: v0.3.2 Editor: !--editor-->@pdebuyl<!--end-editor-- Reviewers: @mattpitkin, @olgadoronina Archive: 10.5281/zenodo.6412465

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/2dafe33336397a6d5dd8f3c692eaab22"><img src="https://joss.theoj.org/papers/2dafe33336397a6d5dd8f3c692eaab22/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/2dafe33336397a6d5dd8f3c692eaab22/status.svg)](https://joss.theoj.org/papers/2dafe33336397a6d5dd8f3c692eaab22)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@mattpitkin & @olgadoronina, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @pdebuyl know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Checklists

📝 Checklist for @mattpitkin

📝 Checklist for @olgadoronina

editorialbot commented 2 years ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 2 years ago

Checking the BibTeX entries failed with the following error:

No paper file path
editorialbot commented 2 years ago
Software report:

github.com/AlDanial/cloc v 1.88  T=0.28 s (340.3 files/s, 53509.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          47           1185           1115           5292
Jupyter Notebook                14              0           4630            968
reStructuredText                22            250            310            683
YAML                             5             20             28            108
Markdown                         1             27              0             65
TeX                              2              1              0             32
DOS Batch                        1              8              1             26
make                             1              4              6             10
TOML                             1              2              0              8
-------------------------------------------------------------------------------
SUM:                            94           1497           6090           7192
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository
editorialbot commented 2 years ago

Failed to discover a Statement of need section in paper

editorialbot commented 2 years ago

:warning: An error happened when generating the pdf.

pdebuyl commented 2 years ago

@editorialbot set joss-submission-do-not-delete as branch

editorialbot commented 2 years ago

Done! branch is now joss-submission-do-not-delete

pdebuyl commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

pdebuyl commented 2 years ago

Hi @mattpitkin @olgadoronina please read full the instructions at https://github.com/openjournals/joss-reviews/issues/4205#issue-1150402557 , they contain all the necessary information. I remain available in any case.

mattpitkin commented 2 years ago

Review checklist for @mattpitkin

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

mattpitkin commented 2 years ago

@bkmi I've just installed version 0.2.1 of swyft from PyPI. I'm trying to run the Quickstart guide, but on line:

marginal_indices_1d, marginal_indices_2d = swyft.utils.get_corner_marginal_indices(n_parameters)

it gives the error:

AttributeError: module 'swyft.utils' has no attribute 'get_corner_marginal_indices'

I assume this is because v0.2.1 isn't up-to-date. Could you push the latest release to PyPI?

mattpitkin commented 2 years ago

The paper is very nice. I only have a minor suggestion, which is due to my gravitational-wave bias. Seeing as rapid inference for gravitational waves is mentioned, it might be worth citing some of the related work in, e.g., Gabbard et al, Dax et al, Chua & Vallisneri and (a paper including one the co-authors of swyft) Delaunay et al.

bkmi commented 2 years ago

Ah yes. The newest version, the one intended for review, is just a release candidate. I will update it asap on pypi and on github. Then the documentation will also correspond to the easily downloadable version.

Good idea on the citations.

bkmi commented 2 years ago

the pip install swyft should now be addressed and the user would be able to access the documentation and examples in notebooks.

I still need to make the changes w.r.t. the citations!

mattpitkin commented 2 years ago

Great, thanks very much. I'll retry running some of the examples tomorrow and expect to be able to sign-off shortly.

olgadoronina commented 2 years ago

Review checklist for @olgadoronina

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

bkmi commented 2 years ago

I added those citations!

p.s. due to some release issues the version that should be reviewed is actually v0.3.1.post0.

@editorialbot generate pdf

mattpitkin commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

mattpitkin commented 2 years ago

@bkmi I see that the version generation is now via setuptools_scm, which is great. I can get the version number by doing:

from swyft.__version__ import version
print(version)

However, it would be great if the version number could be obtained just by using:

import swyft
print(swyft.__version__)

You can achieve this by adding the following to the swyft/__init__.py file:

try:
    from .__version__ import version as __version__
except ModuleNotFoundError:
    __version__ = ""
bkmi commented 2 years ago

Nice tip! I didn't know about that and I will use it in all my future projects!

I have made the change (along with fixing a from swyft import * bug which I found) in this commit. Would you like me to make an new release for it or should I wait until I incorporate other changes from this review process?

mattpitkin commented 2 years ago

I have successfully installed and run several of the examples and confirmed that I achieve consistent results. I'm happy to sign-off swyft as reviewed.

I have a couple of questions unrelated to the review, one's a stylistic one and the other's more scientific:

pdebuyl commented 2 years ago

Thanks @mattpitkin for the review!

bkmi commented 2 years ago

I have successfully installed and run several of the examples and confirmed that I achieve consistent results. I'm happy to sign-off swyft as reviewed.

Thanks for donating your time and expertise!

I have a couple of questions unrelated to the review, one's a stylistic one and the other's more scientific:

  • would it be possible in the violin plots to pass in the parameter names as labels? From looking at the hist1d code this looks possible, but not necessarily for the violin plots.

This is a good point! I will make an issue and include this in the next release.

  • (I should probably read the related papers, but...) given that swyft approximates the likelihood to evidence ratio, it is possible in any way to extract the evidence value too?

It isn't really obvious how to do this to me, except through some kind of hacks... If you have the likelihood for a problem (swyft aims to assist in problems in which the likelihood is unknown), in principle you can estimate the likelihood-to-evidence ratio and then compute likelihood * (1 / likelihood-to-evidence ratio). This is actually done as a way to show that the method works in https://arxiv.org/abs/1903.04057, but I wouldn't necessarily trust it for other problems. If you have the likelihood, you can also use nested sampling methods which are perhaps more trustworthy.

bkmi commented 2 years ago

I implemented the change w.r.t. labels!

bkmi commented 2 years ago

I thought I'd just check in and see how things were going with the review @olgadoronina! Any updates?

olgadoronina commented 2 years ago

Hi!, sorry for the delay, I will complete the review as soon as possible

pdebuyl commented 2 years ago

@olgadoronina any news?

olgadoronina commented 2 years ago

@bkmi, I'm really sorry for the delay, I installed and ran tests and examples. I also used my own simple model based on the provided examples. Everything looks good. The package was easy to use and provided examples were really easy to follow. I have completed my review.

bkmi commented 2 years ago

Thank you for the review @olgadoronina!

pdebuyl commented 2 years ago

Thank you @olgadoronina for the review!

@bkmi I will proceed to the editorial steps in the coming days, see https://joss.readthedocs.io/en/latest/editing.html#after-reviewers-recommend-acceptance

bkmi commented 2 years ago

@editorialbot set 10.5281/zenodo.6412465 as archive

editorialbot commented 2 years ago

I'm sorry @bkmi, I'm afraid I can't do that. That's something only editors are allowed to do.

bkmi commented 2 years ago

Oh, oops. Sorry I thought that I was supposed to do that!

The most recent archive is 10.5281/zenodo.6412465 The version is v0.3.2

That is the version / archive which corresponds to all changes requested in this thread. (swyft.__version and violin plot marginal names)

pdebuyl commented 2 years ago

@editorialbot set 10.5281/zenodo.6412465 as archive

editorialbot commented 2 years ago

Done! Archive is now 10.5281/zenodo.6412465

pdebuyl commented 2 years ago

@editorialbot set v0.3.2 as version

editorialbot commented 2 years ago

Done! version is now v0.3.2

pdebuyl commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

pdebuyl commented 2 years ago

Minor comments

  1. In the notebook Examples - 2. Truncation I had to re-run the cell
store.add(n_training_samples + 0.01 * n_training_samples, prior, bound=bound)
store.simulate()
dataset = swyft.Dataset(n_training_samples, prior, store, bound = bound)

twice as else there would not be enough samples. I don't know if it is unexpected though as the sampling is not deterministic.

  1. In the article, there is no section titled "Statement of Need". This is a mandatory part of the article according to JOSS' guidelines. The current section "Motivation" is a good fit for it though, you could simply rename it.

  2. There is mention that the technique would not be convenient to include in existing ML packages. Would it be a possibility in the future, provided some changes are made to an hypothetical good package for inclusion? (PS: not pushing for it, but I think that it is interesting to let the readers -- possibly maintainers of such a package -- what prevents it from happening).

  3. It is convenient that "swyft automates irksome matters like distributed computing and data storage". What does it mean in practice? Is it by using dask? In that case, it could be mentioned explicitly after the quoted sentence.

  4. To download the example notebooks, I cloned the repo. It is probably not much of an issue for the developers of swyft but for prospective users a "download this notebook" link would be great (-> this is only a suggestion).

PS: in case of further updates to the article file, this does not require a new zenodo archive or update to the version.

pdebuyl commented 2 years ago

@bkmi Hi, any update on your side?

bkmi commented 2 years ago

Hey there, thank you for your patience.

I have been at a conference, vacation, then working on another paper deadline.

This is on my list but it may take me a little while to implement the changes. Is that okay?

pdebuyl commented 2 years ago

Hi @bkmi no worry. The reviewers are done, so it is only me and you waiting.

bkmi commented 2 years ago

Minor comments

  1. In the notebook Examples - 2. Truncation I had to re-run the cell
store.add(n_training_samples + 0.01 * n_training_samples, prior, bound=bound)
store.simulate()
dataset = swyft.Dataset(n_training_samples, prior, store, bound = bound)

twice as else there would not be enough samples. I don't know if it is unexpected though as the sampling is not deterministic.

I added a line which noted the non-deterministic nature of the number of samples drawn (poisson distribution) and increased the factor of extra samples which are produced. This should solve the issue.

  1. In the article, there is no section titled "Statement of Need". This is a mandatory part of the article according to JOSS' guidelines. The current section "Motivation" is a good fit for it though, you could simply rename it.

Done!

  1. There is mention that the technique would not be convenient to include in existing ML packages. Would it be a possibility in the future, provided some changes are made to an hypothetical good package for inclusion? (PS: not pushing for it, but I think that it is interesting to let the readers -- possibly maintainers of such a package -- what prevents it from happening).

I added a sentence about how exactly we overcame the issues, thereby clarifying what stands in the way of including TMNRE in their packages in the future.

The sentence reads:

`swyft` does the parallel training of the ratio using another dimension in a PyTorch tensor and created a custom truncated prior data structure to overcome these challenges.
  1. It is convenient that "swyft automates irksome matters like distributed computing and data storage". What does it mean in practice? Is it by using dask? In that case, it could be mentioned explicitly after the quoted sentence.

Indeed, I made the change.

  1. To download the example notebooks, I cloned the repo. It is probably not much of an issue for the developers of swyft but for prospective users a "download this notebook" link would be great (-> this is only a suggestion).

I made the change! you can see it here https://swyft.readthedocs.io/en/latest/notebooks/Quickstart.html

PS: in case of further updates to the article file, this does not require a new zenodo archive or update to the version.

Okay I did change the code as requested so I can make a .post0 release if you think that is appropriate. Please let me know!

pdebuyl commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

pdebuyl commented 2 years ago

@editorialbot check references

editorialbot commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1038/s41586-020-2649-2 is OK
- 10.1038/s41592-019-0686-2 is OK
- 10.21105/joss.03021 is OK
- 10.1109/MCSE.2007.55 is OK
- 10.5281/zenodo.3509134 is OK
- 10.25080/Majora-92bf1922-00a is OK
- 10.5281/zenodo.5712786 is OK
- 10.21105/joss.02505 is OK
- 10.21105/joss.00011 is OK
- 10.1093/bioinformatics/bty361 is OK
- 10.1145/3093172.3093233 is OK
- 10.1093/mnras/stz1960 is OK
- 10.1093/mnras/sty819 is OK
- 10.1093/mnras/stz1900 is OK
- 10.1002/sta4.56 is OK
- 10.1214/20-ba1238 is OK
- 10.1007/s11222-017-9738-6 is OK
- 10.1214/06-ba127 is OK
- 10.1111/j.1365-2966.2007.12353.x is OK
- 10.1111/j.1365-2966.2009.14548.x is OK
- 10.1093/mnras/stv1911 is OK
- 10.1051/0004-6361/201322971 is OK
- 10.1007/s11222-018-9844-0 is OK
- 10.1214/18-ss120 is OK
- 10.1098/rsif.2008.0172 is OK
- 10.1111/j.2517-6161.1984.tb01290.x is OK
- 10.1214/aos/1176346785 is OK
- 10.1093/genetics/145.2.505 is OK
- 10.1007/s11222-009-9116-0 is OK
- 10.2172/4390578 is OK
- 10.1093/biomet/57.1.97 is OK
- 10.1038/s41567-021-01425-7 is OK
- 10.1103/PhysRevLett.127.241103 is OK
- 10.1103/PhysRevLett.124.041102 is OK

MISSING DOIs

- None

INVALID DOIs

- None