openjournals / joss-reviews

Reviews for the Journal of Open Source Software

[REVIEW]: The sspm R package: spatial surplus production models for the management of northern shrimp fisheries #4724

Closed editorialbot closed 1 year ago

editorialbot commented 2 years ago

Submitting author: @VLucet (Valentin Lucet)
Repository: https://github.com/pedersen-fisheries-lab/sspm
Branch with paper.md (empty if default branch): main
Version: v1.0.0
Editor: @Bisaloo
Reviewers: @quang-huynh, @kellijohnson-NOAA
Archive: 10.5281/zenodo.8015102

Status


Status badge code:

HTML: <a href="https://joss.theoj.org/papers/d05fcbbc7ff3d1d2bc3c56466f2e21e5"><img src="https://joss.theoj.org/papers/d05fcbbc7ff3d1d2bc3c56466f2e21e5/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/d05fcbbc7ff3d1d2bc3c56466f2e21e5/status.svg)](https://joss.theoj.org/papers/d05fcbbc7ff3d1d2bc3c56466f2e21e5)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@quang-huynh & @kellijohnson-NOAA, your review will be checklist-based. Each of you will have a separate checklist that you should update when carrying out your review. First of all, you each need to run this command in a separate comment to create your checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. If you have any questions or concerns, please let @Bisaloo know.

Please start your review when you are able, and be sure to complete it within the next six weeks at the very latest.

Checklists

📝 Checklist for @quang-huynh

📝 Checklist for @kellijohnson-NOAA

editorialbot commented 2 years ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 2 years ago
Software report:

github.com/AlDanial/cloc v 1.88  T=0.05 s (1543.4 files/s, 218675.7 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
R                               55           1767           2033           5118
TeX                              1             24              0            257
Markdown                         5             53              0            155
YAML                             6             28              7            136
Rmd                              3             81            143            116
-------------------------------------------------------------------------------
SUM:                            70           1953           2183           5782
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository
editorialbot commented 2 years ago

Wordcount for paper.md is 1513

editorialbot commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1111/1365-2664.12664 is OK
- 10.1093/icesjms/fsz048 is OK
- 10.1093/icesjms/fsv229 is OK
- 10.1890/14-0739.1 is OK
- 10.1093/icesjms/fsw230 is OK
- 10.1111/j.1467-2979.2011.00452.x is OK
- 10.1146/annurev.ecolsys.39.110707.173406 is OK
- 10.1111/faf.12550 is OK
- 10.1139/F08-170 is OK
- 10.1139/f03-105 is OK
- 10.1890/14-0739.1 is OK
- 10.1111/j.1467-2979.2012.00488.x is OK

MISSING DOIs

- None

INVALID DOIs

- None
editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

Bisaloo commented 2 years ago

:wave: :wave: :wave: @VLucet, @quang-huynh, @kellijohnson-NOAA this is the review thread for the paper. All of our communications will happen here from now on.

As a reviewer, the first step is to create a checklist for your review by entering

@editorialbot generate my checklist

at the top of a new comment in this thread.

These checklists contain the JOSS requirements. As you go over the submission, please check any items that you feel have been satisfied. The first comment in this thread also contains links to the JOSS reviewer guidelines.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention openjournals/joss-reviews#4724 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

We aim for reviews to be completed within about 2-4 weeks. Please let me know if any of you require some more time. We can also use EditorialBot (our bot) to set automatic reminders if you know you'll be away for a known period of time.

Please feel free to ping me (@bisaloo) if you have any questions/concerns.

quang-huynh commented 2 years ago

@editorialbot generate my checklist

Bisaloo commented 2 years ago

@quang-huynh, could you try re-running this command please? Something went wrong on our end.

quang-huynh commented 2 years ago

Review checklist for @quang-huynh

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

quang-huynh commented 2 years ago

I reviewed the software by installing the package on my local desktop and going through the An_example_with_simulated_data vignette. I was able to replicate the steps in the vignette in my R console and reproduce the figures.
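For anyone reproducing this setup, the steps amount to something like the following (standard installation commands; the repository is the one in the submission header, and the vignette name is as referenced above):

```r
# install.packages("remotes")  # if not already installed
remotes::install_github("pedersen-fisheries-lab/sspm", build_vignettes = TRUE)
vignette("An_example_with_simulated_data", package = "sspm")
```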

Documentation suggestions:

I noticed that many figures in the vignette are too small, so the figure width should be increased in the R Markdown document.
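Something along these lines in the chunk headers would do it (chunk label, plotting call, and dimensions are illustrative, not taken from the actual vignette):

````rmd
```{r density-plot, fig.width = 8, fig.height = 5}
plot(sspm_model_fit)  # whichever plotting call the vignette makes at this step
```
````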

I also see that the smoothed density estimates (step 8) are identical in all four panels. Is this correct? Do P1–P4 represent SFAs?

The package is well organized for those familiar with the S4 class system. I would recommend reducing the number of methods per generic whenever possible, but this is not a major concern. For example, spm_as_dataset has four defined methods, but two (for data.frame and sf) should suffice.

Are model diagnostic tools available to evaluate goodness of fit? I see that sspm_model_fit@fit is a bam object with various methods already defined in the mgcv and stats packages. The vignette can be updated to indicate where the model object is located and what diagnostics are available.
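For example, something along these lines could be shown in the vignette (assuming a fitted object named sspm_model_fit, as above):

```r
# The underlying mgcv::bam fit is stored in the @fit slot
fit <- sspm_model_fit@fit

summary(fit)          # smooth-term estimates and deviance explained
mgcv::gam.check(fit)  # residual diagnostics and basis-dimension checks
AIC(fit)              # for comparing candidate models
```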

A summary method for the sspm_fit S4 class would be very useful for reporting estimates of biomass and productivity (both by time and area), and all other outputs that would be important for advising management.
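As a rough sketch of what I have in mind (class and slot names follow this thread; the body is purely illustrative):

```r
# Hypothetical summary() method delegating to mgcv, then adding
# management-relevant outputs
setMethod("summary", "sspm_fit", function(object, ...) {
  cat("Spatial surplus production model fit\n")
  print(summary(object@fit))  # mgcv summary of the underlying bam fit
  # ...then tables of biomass and productivity estimates by time and area
  invisible(object)
})
```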

Functionality documentation: The help documentation can be improved without much additional effort. For example, the help for spm_as_boundary says that the function creates an sspm_boundary object, which is not helpful; what the function actually does is divide polygons into smaller patches. An additional sentence describing what each function actually does should be added.
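For instance, the roxygen description could read something like this (wording illustrative):

```r
#' @description Creates an `sspm_boundary` object by dividing the study area
#'   polygons into smaller patches that act as the spatial units of the model.
```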

Community guidelines: A quick statement in the README.md recommending that users report bugs via GitHub issues will meet this requirement.

Paper suggestions:

State of the field: There's not much reference to similar software. There are various software packages for surplus production models, e.g., spict, aspic, and jabba, that estimate the intrinsic rate of increase and carrying capacity. These model a unit stock without spatial considerations. However, sspm does not appear to estimate the inherent productivity of a population. Rather, sspm appears to use GAMs to estimate surplus production and productivity, $(C_y + B_y)/B_{y-1}$, over space and time, as well as with any covariates provided to the model. (I am not currently able to fully review the methodology underlying the package, as I don't have access to the Pedersen et al. Res Doc.)

In a fisheries context, sspm reminds me more of the VAST and sdmTMB packages, which implement spatiotemporal GLMMs to predict response variables across a spatial field over time. It may be possible to somewhat replicate sspm in sdmTMB, but estimation in sspm will be much, much faster (compared with the Gaussian Markov random fields used in the other packages), and sspm conveniently addresses a specific use case for fisheries assessment. I recommend referencing and comparing these packages with sspm.
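For context, a spatiotemporal GLMM of the kind sdmTMB fits looks roughly like this (a generic sketch with assumed data and column names, not a replication of sspm):

```r
library(sdmTMB)

# Spatial mesh over the survey coordinates
mesh <- make_mesh(dat, xy_cols = c("longitude", "latitude"), cutoff = 10)

# Spatiotemporal model with a smooth covariate effect; the spatial field is
# a Gaussian Markov random field, which drives the longer run times
fit <- sdmTMB(
  biomass ~ s(depth),
  data = dat,
  mesh = mesh,
  time = "year",
  family = tweedie(link = "log"),
  spatiotemporal = "ar1"
)
```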

References: The Pedersen et al. and Prager references need to be edited to meet this requirement.

Other paper suggestions:

- Line 13: Fisheries managers typically do not use analytical methods but receive advice from analysts. Perhaps "fisheries analysts" should be used instead of "fisheries managers"?
- Line 50: Suggest "Advice to managers should account for varying productivity"
- Line 79: Should be "surplus production"

kellijohnson-NOAA commented 2 years ago

Review checklist for @kellijohnson-NOAA

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Bisaloo commented 2 years ago

Thanks a lot @kellijohnson-NOAA & @quang-huynh for your reviews!

@VLucet, do you have a timeline for when you might be able to address these comments?

VLucet commented 1 year ago

Thanks a lot to @kellijohnson-NOAA and @quang-huynh for your thoughtful reviews. Yes, my apologies for the delay; these comments will be addressed by the end of ~~the month~~ the year.

VLucet commented 1 year ago

Hi all, my apologies for underestimating the amount of time it is taking me to address all the comments. I wanted to thank @kellijohnson-NOAA and @quang-huynh again for their detailed and helpful comments (the lists of tasks they provided and the linked commits addressing them are here: https://github.com/pedersen-fisheries-lab/sspm/issues/54 and https://github.com/pedersen-fisheries-lab/sspm/issues/59). All of @kellijohnson-NOAA's comments have been addressed.

@quang-huynh, I also wanted to let you know that the research document is finally out, in case there is more you wanted to add to your comment, which mentioned being unable to access it yet: https://github.com/pedersen-fisheries-lab/sspm/issues/125

quang-huynh commented 1 year ago

@VLucet Thanks for the link to the research document. The closest alternative to sspm appears to be VAST for spatial surplus production modeling: https://james-thorson-noaa.github.io/docs/tutorials/surplus-production/. The associated paper is https://doi.org/10.1111/1365-2664.12664, which the authors could use for comparison.

The major benefit of sspm over VAST appears to be run time; the longer model runtime in VAST is associated with estimating the spatial random field. I've also found that VAST comes with a big learning curve. sspm addresses a much narrower use case, which makes it easier for a new user to learn.

Bisaloo commented 1 year ago

Hi @VLucet, happy new year! Do you have an updated timeline of when you could go over the last remaining review comments? :relaxed:

VLucet commented 1 year ago

Hi @Bisaloo! Happy new year as well. @eric-pedersen and I have been discussing how to address the last couple of comments relating to the text (https://github.com/pedersen-fisheries-lab/sspm/issues/125 and https://github.com/pedersen-fisheries-lab/sspm/issues/126). We need a little more time to finalize that. All the other comments and technical issues have been addressed in https://github.com/pedersen-fisheries-lab/sspm/pull/58.

Bisaloo commented 1 year ago

Hi @VLucet, what is the status of your submission, please? Do you have an estimated date by which you aim to finalize your updates? It would be great not to wait too long, to make sure your work is still fresh in the reviewers' memory.

eric-pedersen commented 1 year ago

Hi folks,

@VLucet is just waiting on me to finish up some text. Apologies for the delay on this; teaching has been much more hectic than I anticipated this semester. I'm aiming to have my section finished for Val's review this week, though.

Bisaloo commented 1 year ago

Hi @VLucet @eric-pedersen, any progress on this?

Bisaloo commented 1 year ago

Hi @VLucet @eric-pedersen, would it be possible for you to submit your last changes by the end of May? Otherwise, we should withdraw this paper for now and have you resubmit later once the comments have been addressed, so the reviewers and I can officially remove this from our radar and focus on other submissions.

VLucet commented 1 year ago

Hi @Bisaloo, sorry again for these delays. @eric-pedersen finished addressing the last comments today. I am away on vacation, but on my return on the 15th I will complete the submission. Thanks again for your patience.

Bisaloo commented 1 year ago

@editorialbot remind @VLucet in one week

editorialbot commented 1 year ago

@VLucet doesn't seem to be a reviewer or author for this submission.

Bisaloo commented 1 year ago

@editorialbot remind @VLucet in one week

editorialbot commented 1 year ago

Reminder set for @VLucet in one week

VLucet commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

VLucet commented 1 year ago

@Bisaloo @quang-huynh @kellijohnson-NOAA The new proof with all changes has been generated. @eric-pedersen and I have addressed all reviewer comments, and all changes have been integrated into the main branch. We thank you all again for your patience and thoughtful comments.

quang-huynh commented 1 year ago

The revision is quite thorough in the comparison of methods and software packages. It was helpful for me and the text is appreciated. The paper looks good to me, pending a DOI fix in Line 223.

The figures in the vignette have also been revised and look good to me.

The GitHub repository just needs a section with clear guidelines for third parties wishing to:

1. Contribute to the software
2. Report issues or problems with the software
3. Seek support

quang-huynh commented 1 year ago

One more thing: the package can be used for more than just northern shrimp fisheries. Do the authors want to update the title to broaden the application? For example: "The sspm R package: spatial surplus production models for fisheries management" or "The sspm R package for spatially-explicit surplus production population models"

editorialbot commented 1 year ago

:wave: @VLucet, please update us on how things are progressing here (this is an automated reminder).

Bisaloo commented 1 year ago

Thanks for the changes @VLucet!

@quang-huynh @kellijohnson-NOAA, could you review your respective checklists (1, 2) and verify if the unchecked items have been addressed by the latest changes please? Thanks a lot for your patience and your work on this :pray:

quang-huynh commented 1 year ago

@Bisaloo all the checklist items have been met for me

VLucet commented 1 year ago

Hello @kellijohnson-NOAA, I just wanted to ask whether there was more you wished to post as issues, or if I should go ahead and merge the fixes I already made in the PR.

kellijohnson-NOAA commented 1 year ago

Sorry @VLucet that you had to ask. I have just one remaining checklist item that I have not decided whether it is complete or not. You have many tests using testthat, which I think is great. But I am wondering if you have a test that checks the statistical properties of the answer. For example, if you were running a linear model, you could test the parameter estimates against the output from lm() to ensure that the statistical properties of the model are implemented correctly. I am uncertain whether you are doing that here, or even how you would go about doing it for a complicated model. Nevertheless, I think at a bare minimum you could test a super simple example without spatial structure to ensure that the underlying code is working as planned. I have yet to find anything like this in {sspm}. If the tests already exist, can you please point me to them? Thanks 🙏.
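To illustrate the pattern I mean, here is a minimal sketch of such a test for the no-smooth, no-spatial case (plain mgcv stands in for the simplest sspm configuration here; the real test would call the sspm wrappers instead):

```r
library(testthat)
library(mgcv)

test_that("a purely linear fit reproduces lm() estimates", {
  set.seed(42)
  dat <- data.frame(x = runif(100))
  dat$y <- 2 + 3 * dat$x + rnorm(100, sd = 0.1)
  fit_gam <- gam(y ~ x, data = dat)  # no smooths: should collapse to lm()
  fit_lm  <- lm(y ~ x, data = dat)
  expect_equal(coef(fit_gam), coef(fit_lm), tolerance = 1e-8)
})
```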

VLucet commented 1 year ago

Hi @kellijohnson-NOAA. No need for apologies!

The closest test we have is one that ensures that fitting the smooths on our fake dataset consistently outputs the "expected intercept", so that if there were a change in how mgcv fits things, we would notice the departure from that value. But it's just an "anchor point" for the reproducibility of results.

I understand your question as a general question about the correctness of the statistical answer. I would find that difficult to test programmatically for such a model. The model uses multiple building blocks from mgcv; the approach I tried to take for testing is to make sure those individual building blocks (for example, the penalty matrices) are accurately generated and put together. But ultimately, I am relying on mgcv's extensive testing (and our general understanding of how those building blocks should fit together, i.e., the statistical approach we defend in the paper) to trust that those building blocks will indeed work well together and deliver the promised approach. The objects the package creates are, ultimately, mgcv objects, which can be inspected and checked (formulas, penalty matrices, etc.). Given that this package mainly wraps mgcv, I would say the answer to your question is ultimately the same as the answer to "how do we make sure a GAM is statistically correct?", which I would think is out of scope for this package.

I'll call on @eric-pedersen to add to this, and maybe correct or clarify.

eric-pedersen commented 1 year ago

Hi @kellijohnson-NOAA. Thanks for the feedback. This is definitely an important consideration to think about.

As @VLucet mentioned, validating "statistical properties" is somewhat difficult given the breadth of possible models that can be fit in this framework, and the complexities of mgcv.

I could see adding a couple of tests here to ensure that spm_smooth and spm produce the same estimated GAM coefficients and penalties as the equivalent direct calls to gam with the same data; that way, we can show that the constructor functions (smooth_space, smooth_time, etc.) are producing the correct formula, and it will double as additional documentation, showing which GAM models spm_smooth and spm are estimating.
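A self-contained sketch of that idea, using plain mgcv as a stand-in (toy data; the real test would build the formula via sspm's constructors rather than as.formula):

```r
library(mgcv)

set.seed(1)
dat <- data.frame(x = runif(200), t = rep(1:20, 10))
dat$y <- sin(2 * pi * dat$x) + 0.1 * dat$t + rnorm(200, sd = 0.2)

# A programmatically assembled formula (as the constructors would produce)
# should give exactly the same fit as the handwritten call
built <- as.formula("y ~ s(x) + t")
fit_built  <- gam(built, data = dat)
fit_direct <- gam(y ~ s(x) + t, data = dat)

stopifnot(isTRUE(all.equal(coef(fit_built), coef(fit_direct))))
stopifnot(isTRUE(all.equal(fit_built$sp, fit_direct$sp)))  # smoothing penalties
```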

Would that work as a solution for this?

kellijohnson-NOAA commented 1 year ago

@eric-pedersen I think the tests you are proposing would be a good addition, but I was thinking of tests at a higher level of the surplus production model. Are there, at a minimum, tests of parameter estimates against simulated data? Ultimately, I think there should be tests against the results from another package that estimates surplus production, e.g., MQMF or JABBA, if there is the ability to drop the spatial component from {sspm}. I am trying to think of a way to convince users that, without the complication of space in their surplus production model, they will get the same answer as other packages doing the same thing. Knowing this can help increase user numbers, because users can test their previous results against what {sspm} gives, which is often needed before new software can be used for advice given to managers.

eric-pedersen commented 1 year ago

@kellijohnson-NOAA: I see your point about the need for comparisons down the line between sspm outputs and other surplus production models. I agree that this is an important part of developing sspm as a stock assessment tool.

However, I'm not convinced that fits clearly into the form of a test of our model per se; since we would be comparing outputs from two different models, with different assumptions about model dynamics, it wouldn't be possible to directly compare our model parameters with other model estimates. For instance, JABBA uses a modified Schaefer model for the assumed population dynamics (i.e. $B_{t+1} = B_t + rB_t(1-B_t/K)$ whereas we are using a form of the Ricker growth model: $B_{t+1} = e^{rB_t - bB_t^2}$ (in the simplest form of our model, ignoring harvest and assuming constant productivity). While both haver exponential growth ($r$) and density dependence ($K, b$) the parameters and their estimates will not be directly comparable even with identical data. The same issues hold for sspm and MQMF. This is just a general issue with population models: unless the two models make very similar assumptions about population dynamics, it won't be possible to directly compare estimated parameters.

As such, any comparison between these different approaches would have to focus on comparing predictions: does our method predict similar changes over time, or similar aggregate values (such as $E_{MSY}$ and $B_{MSY}$), to those from other surplus production methods? I think this kind of comparison is important to do, and it is on my to-do list for this package, but it would require enough effort and explanation to make up a separate publication or a whole vignette, rather than a single unit test in the package's test suite.

kellijohnson-NOAA commented 1 year ago

Agreed, @eric-pedersen, and thank you for the discussion. I checked off the last box given the large number of current tests and the high coverage.

eric-pedersen commented 1 year ago

That's great to hear! Thanks for all of your feedback and thoughts on the paper; we really appreciate it.

Bisaloo commented 1 year ago

Thanks all for your participation in this process! The quality of the discussion in this thread makes me really happy :tada:. Since all boxes are now checked for both reviewers, we can switch to the final steps.

Bisaloo commented 1 year ago

Post-Review Checklist for Editor and Authors

Additional Author Tasks After Review is Complete

Editor Tasks Prior to Acceptance

VLucet commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

VLucet commented 1 year ago

DOI of the release: https://doi.org/10.5281/zenodo.8015102
GitHub release tag: https://github.com/pedersen-fisheries-lab/sspm/releases/tag/v1.0.0

VLucet commented 1 year ago

@Bisaloo this is my first time using Zenodo like this, I am not sure I have done it as expected.

Bisaloo commented 1 year ago

Yes, it looks good :+1:! Could you please:

And please confirm that you've double checked authors and affiliations in the paper.

VLucet commented 1 year ago

@Bisaloo I think what I am confused about is that @quang-huynh has been made an author on Zenodo (as is to be expected, since they contributed to the software via PR). Does that mean the authors on the paper (not on the software package) have to be updated to reflect that, or is it okay to have separate author lists? I am not opposed to adding a co-author to the paper (@eric-pedersen can chime in if he thinks otherwise); I am just asking what the usual etiquette is in this situation.