openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
703 stars 36 forks source link

[REVIEW]: Pyinterpolate: Spatial Interpolation in Python for point measurements and aggregated datasets #2869

Closed whedon closed 2 years ago

whedon commented 3 years ago

Submitting author: !--author-handle-->@SimonMolinsky<!--end-author-handle-- (Szymon Moliński) Repository: https://github.com/szymon-datalions/pyinterpolate Branch with paper.md (empty if default branch): Version: v0.2.5.post1 Editor: !--editor-->@hugoledoux<!--end-editor-- Reviewers: @chrisbrunsdon, @kenohori , @sdesabbata Archive: 10.5281/zenodo.6206145

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/3f87f562264c4e5174d9e6ed6d8812aa"><img src="https://joss.theoj.org/papers/3f87f562264c4e5174d9e6ed6d8812aa/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/3f87f562264c4e5174d9e6ed6d8812aa/status.svg)](https://joss.theoj.org/papers/3f87f562264c4e5174d9e6ed6d8812aa)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@chrisbrunsdon, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @hugoledoux know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Review checklist for @chrisbrunsdon

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @kenohori

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @sdesabbata

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

whedon commented 3 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @chrisbrunsdon it looks like you're currently assigned to review this paper :tada:.

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  1. You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf
whedon commented 3 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1007/s11004-007-9129-1 is OK
- 10.1186/1476-072x-7-6 is OK
- 10.1080/13658816.2012.663917 is OK
- 10.2307/143141 is OK
- 10.1007/978-3-319-78999-6_29 is OK

MISSING DOIs

- 10.1007/978-3-642-58727-6 may be a valid DOI for title: Basic Linear Geostatistics

INVALID DOIs

- None
whedon commented 3 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

hugoledoux commented 3 years ago

@whedon add @kenohori as reviewer

whedon commented 3 years ago

OK, @kenohori is now a reviewer

hugoledoux commented 3 years ago

👋 @chrisbrunsdon & @kenohori

This is the review thread for the paper. All of our communications will happen here from now on.

Please read the "Reviewer instructions & questions" in the first comment above.

Both reviewers have checklists at the top of this thread (in that first comment) with the JOSS requirements. As you go over the submission, please check any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention #2869 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

We aim for the review process to be completed within about 4-6 weeks but please make a start well ahead of this as JOSS reviews are by their nature iterative and any early feedback you may be able to provide to the author will be very helpful in meeting this schedule.

whedon commented 3 years ago

:wave: @chrisbrunsdon, please update us on how your review is going.

kenohori commented 3 years ago

Hello!

I started looking at your submission a few days ago and decided that some initial feedback would be useful, especially regarding the paper. In short, I like the summary and statement of need, which are fine and clear. However, the rest of the paper is a general introduction to kriging, which doesn't say much about how it's performed in your software. I would recommend adding a short description of the different types of kriging you do and how they're implemented.

Also, your paper is missing references to the state of the field. What are the main (open or closed) alternatives out there? How does your software compare to them?

Regarding the software itself, I haven't finished testing it yet, but everything I've tested seems to work (with some very minor issues). However, in the meantime I've noticed a few things.

SimonMolinsky commented 3 years ago

Hi @kenohori!

Thank you for your feedback! Thanks for suggestions too, I'll describe implementation of Kriging techniques and I'll compare package to other programs available on Github or elsewhere. For the rest of your recommendations:

Installation/setup: it would be nice to separate the dependencies are handled by pip and those that need to be installed separately.

I don't understand what do you mean by that? Should I emphasize those installation requirements in two separate files, re-arrange actual documentation or maybe should I do something else (with installation itself?)

How can I test that the software is working correctly? Do you have some unit tests or reference output?

In the main module there is a test module for checking how functions behave (unit tests) => pyinterpolate/test. Usually I'm checking every change with available tutorials too, but it's more manual work instead of automated tests.

kenohori commented 3 years ago

I don't understand what do you mean by that? Should I emphasize those installation requirements in two separate files, re-arrange actual documentation or maybe should I do something else (with installation itself?)

My suggestion would be to summarise the instructions from the manual setup file in the main readme. It could be something like:

  1. install libspatialindex (eg with apt-get on Linux or brew on Mac)
  2. pip install pyinterpolate
  3. run tests

In the main module there is a test module for checking how functions behave (unit tests) => pyinterpolate/test. Usually I'm checking every change with available tutorials too, but it's more manual work instead of automated tests.

Thanks! I'll have a look at those. It would be nice to add this information to the Readme and paper as well.

hugoledoux commented 3 years ago

👋 @chrisbrunsdon, please update us on how your review is going

chrisbrunsdon commented 3 years ago

Apologies for the delays here - I'm currently involved in working on covid data analysis for the Irish govt - and things have recently got busy. Will start responding soon... Just to report that I have successfully downloaded and run the module without any major problems, and that suggestions mainly relate to examples given...

kenohori commented 3 years ago

Hello!

Just finished testing the software. As far as I can tell, every function in the software works. All unit tests are okay and the functionality in the tutorials seems to work fine.

However, there are some minor issues in the tutorials in v0.2.0, which I guess are due to name changes in code refactoring.

Looking forward to the updated paper!

SimonMolinsky commented 3 years ago

Hi @kenohori , Hi @chrisbrunsdon ,

I've updated paper and README.md in the paper branch of package repository. I'll check all namespaces in tutorials but I think that in current version (0.2.2) every tutorial should pass. Moreover, I'm changing areal dataset for tutorials because current dataset (road accidents) is not spatially correlated at this scale. It will be done in a day or two.

Thank you a lot for your feedback!

SimonMolinsky commented 3 years ago

Hi,

I hope that you're doing well in this COVID time. I'm writing to remind you about the paper. I'm going to update package during this week, along with it tutorials (examples) and documentation will be updated. I've changed example with road accidents data (not spatially correlated at the scale of interest) to Breast Cancer Cases in Pennsylvania State counties where spatial correlation occurs. Here are the most recent tutorials: https://github.com/szymon-datalions/pyinterpolate/tree/dev/tutorials

Thanks!

kenohori commented 3 years ago

Hello @szymon-datalions,

Thanks for the reminder. Sorry, I've been busy with teaching and haven't had the chance to go through it yet, but I promise to do it next week at the latest.

hugoledoux commented 3 years ago

@szymon-datalions sorry for the delay, it seems that @chrisbrunsdon doesn't have much time and I am looking for an alternative reviewer (hopefully today, but your review will be delayed a bit, since we require at least 2).

SimonMolinsky commented 3 years ago

Thanks @hugoledoux!

hugoledoux commented 3 years ago

@whedon add @sdesabbata as reviewer

whedon commented 3 years ago

OK, @sdesabbata is now a reviewer

SimonMolinsky commented 3 years ago

Hi,

I forgot to update paper here, but many changes were made! Additionally, new version of package was updated to PyPi.

SimonMolinsky commented 3 years ago

@whedon generate pdf from branch paper

whedon commented 3 years ago
Attempting PDF compilation from custom branch paper. Reticulating splines etc...
whedon commented 3 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

kenohori commented 3 years ago

Hello @szymon-datalions,

I've had a look at the revised paper. First of all, thank you for adding the sections on the state of the art, the types of kriging performed in your software and the new use case. I think the paper is much more complete and easier to follow.

That being said, I'm honestly a bit unsure about what to make of the new breast cancer use case. I'm (very much) not a medical doctor, but is there a reason to expect spatial correlation here? Maybe something environmental? Intuitively, I'd think that spatial correlation would be pretty weak, and the semivariograms indeed reach the sill very quickly. To me, it's maybe not the ideal use case to showcase your package.

Speaking about those semivariograms, I noticed that the distances appear to be in degrees. This means that they're not actual distances on the ground, so I think the dataset needs to be projected. Something like NAD 83 should be fine.

Finally, I think it would be very helpful to have a single section (with a diagram?) that clearly describes the types of kriging you can do, their requirements and the steps to perform them. Currently, that information is there, but it's dispersed throughout the text (in the statement of need, the spatial interpolation section and the structure of package).

SimonMolinsky commented 3 years ago

Hi @kenohori,

I'm extremely grateful for your comments!

You have right that spatial correlation in the case of cancer is weak. Usually Poisson Kriging in the case of public health studies serves rather as a smoother (outliers removal) and/or a kind of filter to avoid visual bias of choropleth maps related to the different population density patterns within administrative units. Anyway, you made mi think that example is not clear so I'm going with your suggestion and I'll change data to depict interpolation instead of filtering problem, maybe I'll show two examples instead of one and I'll pick different data source for it.

What to do with distances is clear. Diagram section is a great idea and I think that paper will be much more readable. Maybe I'll change few things in its structure to avoid unnecessary chaos.

SimonMolinsky commented 3 years ago

@whedon generate pdf from branch paper

whedon commented 3 years ago
Attempting PDF compilation from custom branch paper. Reticulating splines etc...
whedon commented 3 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

SimonMolinsky commented 3 years ago

@whedon generate pdf from branch paper

whedon commented 3 years ago
Attempting PDF compilation from custom branch paper. Reticulating splines etc...
whedon commented 3 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

SimonMolinsky commented 3 years ago

Hi @kenohori,

I've taken a look into a data and publications which were used by me as the reference for building this package. And I noticed that area of analysis for breast cancer was much larger than I've initially set it. I used single state but in the reference paper it is Northeastern part of United States created from multiple states. With larger area of interest spatial patterns are more visible so in this version of the paper given example shows stronger spatial correlation and reason why we perform an areal deconvolution is clear.

I've inserted table with each type of Kriging and I've rearranged the paper a little bit. The example data is projected in meters too. Thanks for your comments again, I learn a lot every time when something is improved in the paper :)

hugoledoux commented 3 years ago

👋 @sdesabbata could you please update us on your progress for this review?

sdesabbata commented 3 years ago

Hi @hugoledoux, as agreed, I should be able to complete this by mid-April. I hope to be able to get to it next week. 😄

sdesabbata commented 3 years ago

Hi @szymon-datalions,

I have started looking into your submission today. I was wondering whether you can clarify something about your example. Is this a bit of analysis that has been done in a published paper as suggested by your comment above?

I've taken a look into a data and publications which were used by me as the reference for building this package. And I noticed that area of analysis for breast cancer was much larger than I've initially set it. I used single state but in the reference paper it is Northeastern part of United States created from multiple states. With larger area of interest spatial patterns are more visible so in this version of the paper given example shows stronger spatial correlation and reason why we perform an areal deconvolution is clear.

If that's the case, the paper should be referenced in your submission and a comparison of the results should be presented.

SimonMolinsky commented 3 years ago

Hi @sdesabbata ,

You made a point, thanks. I'll update the paper accordingly.

SimonMolinsky commented 3 years ago

Hi @sdesabbata ,

One more thing. I've taken a look into mentioned paper but methodology and spatial extent is slightly different. Should I modify my example to fit the same path of analysis and then compare results? Now it seems to be pointless because my data is different and extent is larger. This paper was an example for me how big should be an extent.

sdesabbata commented 3 years ago

Hi @szymon-datalions,

I guess it depends what's the aim of the example. I assume that the aim is to demonstrate that your package does what's supposed to do. I haven't check the software and related tests yet, but I guess it would be good to include an example in the paper that does show that the software produces the expected results. That can be done in two ways: you take an example of analysis conducted with an industry-standard software and show that the output is the same; or you work with a simulated dataset with known characteristics and show that you can recreate a surface from a sample.

SimonMolinsky commented 3 years ago

Hi @sdesabbata ,

Ok, I'll do it and let you know once it's done! Thanks!

kenohori commented 3 years ago

Hi @szymon-datalions,

Thanks for the revised paper! I like the improvements very much.

Focussing on areal interpolation (of which there are fewer open implementations) is a great idea, and the explanation of why this is important helps a lot to clarify the strengths of your software. As before, I think this text is also a very nice introduction to kriging for those who are not familiar with it. Referring to kriging as a set of methods is a nice description.

I don't have much more to add at this point. Just some suggestions related to minor formatting things:

hugoledoux commented 3 years ago

Hi @sdesabbata ,

Ok, I'll do it and let you know once it's done! Thanks!

@szymon-datalions & @sdesabbata : I don't want to let this review drag too much, so could you update me on the status please? By reading it seems that @sdesabbata is waiting for @szymon-datalions to update the paper before continuing the review? I'd like to point out that many of the checkboxes above can be processed without that update (if I am not mistaken).

SimonMolinsky commented 3 years ago

Hi @szymon-datalions,

Thanks for the revised paper! I like the improvements very much.

Focussing on areal interpolation (of which there are fewer open implementations) is a great idea, and the explanation of why this is important helps a lot to clarify the strengths of your software. As before, I think this text is also a very nice introduction to kriging for those who are not familiar with it. Referring to kriging as a set of methods is a nice description.

I don't have much more to add at this point. Just some suggestions related to minor formatting things:

  • using \times for matrix sizes
  • broken reference at the end of Spatial Interpolation with Kriging
  • shrinking the table text would make it fit in the text block
  • broken link to the GitHub repository at the end of Dataset

Thanks @kenohori; and I'll correct those formatting issues in the paper :)

@hugoledoux I've performed requested tests and I'll update paper by the end of this week so it'll be (hopefully) all from my side.

Thanks!

SimonMolinsky commented 3 years ago

@whedon generate pdf from branch paper

whedon commented 3 years ago
Attempting PDF compilation from custom branch paper. Reticulating splines etc...
whedon commented 3 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

SimonMolinsky commented 3 years ago

@whedon generate pdf from branch paper

whedon commented 3 years ago
Attempting PDF compilation from custom branch paper. Reticulating splines etc...
whedon commented 3 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

SimonMolinsky commented 3 years ago

@whedon generate pdf from branch paper