JOSS Review and related comments

zsylvester commented 3 years ago

Hi @amoodie and coauthors,

I have enjoyed reading your paper and have enjoyed exploring pyDeltaRCM even more. Clearly a ton of good work went into this; I think this is a great delta model and Python package and I am hoping to use it in the future. I am convinced that many others will find it useful as well. I have not gone through all the documentation, tests, and examples in detail -- there is a lot to explore here and I think that going through every detail is beyond the scope of a review of a short JOSS paper (but correct me @kbarnhart if I am wrong). I have a few comments below but I don't think that addressing any of these are critical for publishing the paper in JOSS; however, hopefully some of them are useful for improving the package. Here is the list:

Maybe this just reflects my addiction to Jupyter notebooks, but I think it would be useful to provide some example notebooks (or at least one). For a lot of people, it is easier to get started with a package if there is a ready-to-go notebook in the repository.
The 10-minute tutorial is great, but it would be nice if you made it a bit more obvious how one can generate a nice-looking delta morphology with well-defined channels, e.g., by showing how running the default model for 1000 iterations (which might take a few minutes, but that's OK) gives a nice result.
The workflow of adding non-default parameters to a minimalistic yaml file works well (although it is not very clear from the documentation that this is a best practice).
However, I have tried to create an input 'yaml' file with all the parameters and ran into several problems. I have copied the parameter list from here and found that several of the default values listed there give an error message when I am trying to read the file. Also, those error messages look like "input for Np_water not of the right type in yaml configuration file {self.input_file}. Input type was {type(k).name}, but needs to be {expected_type}". I have got it to work in the end, but, long story short, it would be useful to provide an example input file with all the default parameters so that the user can have an overview of all the parameters and start changing them.
Not clear what should be used in the input parameter file if you don’t want to specify that value (it looks to me like ‘None’ and ‘null’ don’t work) - I guess you just comment out / delete that line?
In the description of the 'stepmax' parameter (here), the phrase “stepmax is the maximum number of jumps a parcel can make” is repeated twice.
In the same section, fix the typos in “oceaninc bounary”.
The 10-minute tutorial should point out that, if the model is not ‘finalized’ it is possible to update the model with more time steps.
I understand that dissecting and analyzing the output of pyDeltaRCM in detail is beyond the scope of this package and that you are working on another package to do that (looking forward to trying that package out as well!), but I think it would be helpful (and would make the package more appealing) if you provided a simple example of basic visualizations, e.g., how to save depth grids from the model, read back the netCDF4 file, and display the results. Something as simple as this would save a lot of time for a lot of people: from netCDF4 import Dataset; nc = Dataset('pyDeltaRCM_output.nc'); plt.imshow(nc['depth'][100,:,:], cmap='viridis_r')
The paper says that the model is compatible with the Basic Model Interface, but the latest version on Github states that it is not.
The installation instructions: when creating a new conda environment it is a good idea to specify the Python version (at least with my setup).
It would be good to get a rough idea about performance (e.g., how long does it take to create a 'nice' delta model?). Right now this is not clear at all after reading the paper.
It might be helpful to mention in the paper the advantages / disadvantages of using a reduced complexity model. Some readers might not realize how hard it is to install and run a more sophisticated morphodynamic model like Delft3D.

That's it for now -- thanks for building this and making it open source!

https://github.com/openjournals/joss-reviews/issues/3398

kbarnhart commented 3 years ago

Thanks for this great review @zsylvester !

amoodie commented 3 years ago

response to zsylvester review

I have enjoyed reading your paper and have enjoyed exploring pyDeltaRCM even more. Clearly a ton of good work went into this; I think this is a great delta model and Python package and I am hoping to use it in the future. I am convinced that many others will find it useful as well. I have not gone through all the documentation, tests, and examples in detail -- there is a lot to explore here and I think that going through every detail is beyond the scope of a review of a short JOSS paper (but correct me kbarnhart if I am wrong). I have a few comments below but I don't think that addressing any of these are critical for publishing the paper in JOSS; however, hopefully some of them are useful for improving the package. Here is the list: Thank you Zoltán for your kind words and for taking the time review our project. We appreciate the suggestions you made, and have responded to each of them below.

Maybe this just reflects my addiction to Jupyter notebooks, but I think it would be useful to provide some example notebooks (or at least one). For a lot of people, it is easier to get started with a package if there is a ready-to-go notebook in the repository. Thanks for this suggestion. We added a Jupyter Notebook (link) that includes creating a YAML file, running the model for 1000 timesteps, and loading the output NetCDF file and creating a timeseries plot.
The 10-minute tutorial is great, but it would be nice if you made it a bit more obvious how one can generate a nice-looking delta morphology with well-defined channels, e.g., by showing how running the default model for 1000 iterations (which might take a few minutes, but that's OK) gives a nice result. Thanks for this suggestion. We debated this idea internally a few months ago, and decided that we wanted to keep our documentation build time less than ~3 minutes. As a result, we can't run the model for more than a few timesteps anywhere in the documentation. We take advantage of the pyDeltaRCM checkpointing feature to have one pre-computed model result for visualization in the documentation (e.g., the delta centered on the docs homepage, and this precomputed delta took approximately two hours to run. We prefer not to add a longer run to the built documentation, but we did add a Jupyter Notebook (in response to your suggestion above), which includes a 1000-timestep run (link).
The workflow of adding non-default parameters to a minimalistic yaml file works well (although it is not very clear from the documentation that this is a best practice). Thanks for pointing this out. We agree we should emphasize best practices. We have added a callout "important" box to the documentation in the 10-minute Guide here and the User Guide here.
However, I have tried to create an input 'yaml' file with all the parameters and ran into several problems. I have copied the parameter list from here and found that several of the default values listed there give an error message when I am trying to read the file. Also, those error messages look like "input for Np_water not of the right type in yaml configuration file {self.input_file}. Input type was {type(k).__name__}, but needs to be {expected_type}". I have got it to work in the end, but, long story short, it would be useful to provide an example input file with all the default parameters so that the user can have an overview of all the parameters and start changing them. Thanks for pointing out this issue. We fixed the broken error message here in #220. We hope that with the added clarification on best practices (above and below comments) in the documentation, it will be clear how to set parameters in the model.
Not clear what should be used in the input parameter file if you don’t want to specify that value (it looks to me like ‘None’ and ‘null’ don’t work) - I guess you just comment out / delete that line? Thanks for raising this issue. Yes, you should comment out or remove the line completely; setting null does not work. We added a call-out box suggesting to comment out lines to the Advanced Configuration guide (here in #223) and to the User Guide (here in #223).
In the description of the 'stepmax' parameter (here), the phrase “stepmax is the maximum number of jumps a parcel can make” is repeated twice. Thanks for pointing this out, we revised the text here to make it less repetitive (here in #223)
In the same section, fix the typos in “oceaninc bounary”. Thanks for catching this typo, we fixed this in #220.
The 10-minute tutorial should point out that, if the model is not ‘finalized’ it is possible to update the model with more time steps. Thanks for this suggestion, we added a note to explain that more updates can be made, until finalize is called (here in #220).
I understand that dissecting and analyzing the output of pyDeltaRCM in detail is beyond the scope of this package and that you are working on another package to do that (looking forward to trying that package out as well!), but I think it would be helpful (and would make the package more appealing) if you provided a simple example of basic visualizations, e.g., how to save depth grids from the model, read back the netCDF4 file, and display the results. Something as simple as this would save a lot of time for a lot of people: from netCDF4 import Dataset; nc = Dataset('pyDeltaRCM_output.nc'); plt.imshow(nc['depth'][100,:,:], cmap='viridis_r') Thanks for this suggestion. We have added a brief explanation for how to connect to the NetCDF file and examine the data. Any analysis is outside the scope of what we want to include in the pyDeltaRCM documentation, though. The example is included in the User Guide (see it rendered here) and the informational section of the model (added here in #223).
The paper says that the model is compatible with the Basic Model Interface, but the latest version on Github states that it is not. Thanks for pointing out this confusing language on the Github README. In development, we moved the BMI wrapper outside of the pyDeltaRCM repository. So, pyDeltaRCM is BMI-compatible via the wrapper and it's incredibly easy to use. None of us are currently using the BMI wrapper, but we think this JOSS publication would be a good opportunity to advertise to the community that the model is BMI-ready. We updated the README to hopefully make this more clear (here in #223).
The installation instructions: when creating a new conda environment it is a good idea to specify the Python version (at least with my setup). Thanks for this suggestion. We have made this change (here in #220).
It would be good to get a rough idea about performance (e.g., how long does it take to create a 'nice' delta model?). Right now this is not clear at all after reading the paper. Thanks for this suggestion, we added the run time for the delta in Figure 1 to the caption of that figure (here in #224) . That simulation took 1 hr 57 minutes on my personal workstation. For the sake of completeness in this answer: computational time depends on parameter choices, and especially the number of grid cells in the model domain. Larger domains with smaller grid cells take exponentially longer to compute, for example more than 12 hours for some cases in a recent research study.
It might be helpful to mention in the paper the advantages / disadvantages of using a reduced complexity model. Some readers might not realize how hard it is to install and run a more sophisticated morphodynamic model like Delft3D. Thanks for this suggestion. In the paper "Statement of need" section, we say "The DeltaRCM delta model (Liang, Voller, et al., 2015) has gained popularity among geomorphologists due to an attractive balance of computational cost, realism, and interpretability (Larsen et al., 2016)." We believe this gives readers a sense of why the reduced complexity approach is popular, without making the explanation too verbose for the short JOSS paper (e.g., we include the reference to Larsen et al., 2016).

That's it for now -- thanks for building this and making it open source! Thanks again for your thoughtful suggestions, we appreciate the review!

zsylvester commented 3 years ago

Thanks @amoodie for the careful responses - all is good as far as I am concerned. I did bump into a question in the meantime (that I could probably find the answer to in the documentation / published papers, but this is probably faster...): if I run the simulation with a smaller grid size (e.g., 10 m instead of 50 m -- I wanted a similar-looking, but higher resolution model), and every other parameter stays the same, I get a very different delta (in terms of channel size, sinuosity, etc.). Is there a best practice to deal with this? Or is there a fundamental dependence of the model on grid cell size?

amoodie commented 3 years ago

Hi @zsylvester, thanks again for your review!

Yes, there is a fundamental grid size dependence in the model. @elbeejay has investigated the issue a bit, and he has some evidence that the avulsion dynamics of the system remains approximately the same regardless of the grid size. Still though, we think the cell-size dependence needs to be studied more carefully and don't have any best practices identified. In the original DeltaRCM Liang et al., 2015a,b, 2016a,b papers, the authors used grid sizes of 50 and 100 meters, but didn't comment on the dependence, as far as I can remember. So far, we have chosen according to whatever the coarsest size that resolves the dynamics needed to address the science question (usually 50--150 m).

DeltaRCM / pyDeltaRCM

JOSS Review and related comments #219

response to zsylvester review