Attempting PDF compilation from custom branch paper. Reticulating splines etc...
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
Hi,
I've made a major update to the paper and the package.
@sdesabbata I've included a comparison to `gstat` Kriging interpolation at the end of the paper (a new chapter before the Bibliography section).
@kenohori Thank you for your help. I've updated the paper accordingly, but there's a problem with the table width. Locally I can control it with HTML tags, but after compilation the table no longer fits on the page. @hugoledoux are there any guidelines on how to format tables in a paper?
I've made a few changes to the figures to save some space.
That is all from my side. I'm waiting for further comments or acceptance of the paper.
Thanks a lot, Szymon
@szymon-datalions while being precise in the paper is important, the paper is now rather long, much longer than typical papers. Could we make sure that it stays under 10 pages? You could for instance merge figures 2-3 and 4-5, and maybe the use cases could go in a wiki in the repository or in the docs? The point is to "Mention (if applicable) a representative set of past or ongoing research projects using the software and recent scholarly publications enabled by it.", but all the details should be moved somewhere else.
The point of JOSS is to have a short paper, which is just a window onto the code/docs. See what it should contain: https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain
* @hugoledoux are there any guidelines how to format tables in a paper?
hmmm, I am not sure how. You can resize figures like this, perhaps this works for tables?
![figure caption](extrusion.png){ width=90% }
If not we'll deal with this at the end.
Sure, I'll do it and I'll move some sections into external notebooks with the links in the appendix.
@whedon generate pdf from branch paper
Attempting PDF compilation from custom branch paper. Reticulating splines etc...
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@hugoledoux I've changed the paper: I moved two big blocks into separate files in the paper repository. One block is a use case and the second is a comparison of methods between two different packages, focused mostly on the specific solution. The main parts of the paper are still here and I've managed to shrink it to 9 pages.
The table is still wider than the text field, so this method doesn't work. I'll try to find a solution; maybe I'll insert the table as an image to avoid formatting problems. But maybe this part should be done later? I still have one more review to go.
/ooo May 22 until June 5
Brilliant, thank you very much for your work on the paper. I will resume my review and get back to you in a few days.
@whedon generate pdf from branch paper
Attempting PDF compilation from custom branch paper. Reticulating splines etc...
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@whedon generate pdf from branch paper
Attempting PDF compilation from custom branch paper. Reticulating splines etc...
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@whedon generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@hugoledoux I've removed the table from the paper and created a numbered list from it. I don't know why, but in the previous compilations `whedon` compiled the paper from the custom branch `paper`, and after a while it changed to the version from the `main` branch. You may check the timestamps: I updated the paper in the custom branch `paper` and three minutes later I triggered `whedon`. When I checked the paper everything was OK, but when I checked it again a few minutes later I saw the version from the `main` branch.
So I updated the `main` branch to work around this problem, but I don't know if this is something on my side or something with `whedon`.
@sdesabbata now paper is really short and some parts are moved to the supplementary materials (comparison of capabilities and output, use case). If you want to check comparison of the package capabilities to existing software it is described here: https://github.com/szymon-datalions/pyinterpolate/tree/main/paper/supplementary%20materials
@sdesabbata a kind reminder not to forget this review; the paper was updated last week by the author
@whedon generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@sdesabbata Hi, I've restructured the paper slightly to make it clearer and more accessible. It is not a big change; rather, I gave it the finishing touch.
As agreed with @hugoledoux I am waiting for a colleague of mine with regard to the discussion of Kriging. In the meantime, here is some detailed feedback on the current version of the paper.
I think the paper is generally clear, and I think the additional material and the examples work fine. The key point that requires further clarification in my view is the automation of the analysis, which is mainly discussed in lines 98 to 105. The outcomes of a Kriging process are based on a number of assumptions made by the spatial analyst about the underlying process. Changing the assumptions can significantly change the output. In particular, the semivariogram model (I assume that is what you refer to when you say "teoretical model") is crucial. The paper should provide further details about how this process is automated. That is mentioned briefly in the paragraph starting at line 130 and in the case studies. A clearer reference from the paragraph starting at line 98 to the one starting at line 130 should be made, and further details should be provided. I guess the automation is particularly concerning when drift is present in the data, and an automated estimation of the errors might not be as good as the analysis of an analyst. That might also relate to the process of outlier removal (line 171), where the paper would also benefit from some further details.
With regard to the general structure, would it not make sense to discuss Kriging as a method first (section "Method") and then the implementation (section "Interpolation methods within Pyinterpolate") before the details of the modules (section "Modules")?
Try to be consistent in the use of terminology. Also, try to separate the discussion of the data and their transformation from the use of the term "map", where the latter is a representation of the data, not the data themselves.
@sdesabbata Thanks for your feedback. I'll go point by point and I'll let you know about changes / additional explanations / corrections when they'll be done.
One point I'd like to make clear: the paper should in general be kept short, now it's 9p with the references and it shouldn't be made longer, ideally.
The typo fixes and clarifications recommended by @sdesabbata should be implemented, but new sections explaining how certain parts work (e.g. "The paper should provide further details about how such process is automated.") could go in the docs of the software package instead. The paper is supposed to introduce the package and explain why it is necessary; JOSS prefers that the details of the package and how to use it are in the docs.
@whedon generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@whedon generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@sdesabbata I've updated the paper in line with your comments and advice. It's slightly longer, but I've created two additional files within the package documentation, covering Outliers Detection and the Semivariance fitting process. Links to those files are given in the appendix. I've spent some time on the grammar too, and I hope that it's now much better than before.
Thank you
@whedon generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@sdesabbata I've spent additional time polishing the paper. I hope that it's now much better grammatically and clearer than in the previous iteration, and that this will help with the review.
Hi,
@hugoledoux This is just a reminder about the package. I plan to release a new version at the beginning of 2022, so if there is any chance to move this review forward I'll be very grateful.
@kenohori & @sdesabbata could you resume the review and update us? The author updated the paper and the package.
Please let us know
I was already happy with the previous version of the paper, so I only had a quick look at the updated version. It is fine by me: reads well and has all required elements. I assume this is sufficient? Or do you want me to test the updated software?
No @kenohori, no need to test all the updated software again.
Thank you, @SimonMolinsky, for the edits to the paper. I think the new version reads much better, and it provides an improved showcase of the package. However, in my second reading, I have found two issues, one of which took me some time to clarify.
The first issue relates to the Spatial Interpolation with Kriging section. There seems to be an error in equations 2 and 3, or rather in their description. Line 90 says that `mu` is "a process mean" (which is in itself unclear as a sentence), but I believe that it is a Lagrange multiplier. Armstrong (1998) uses the same notation as equation 3 in her equation 7.9, and she describes it as the Lagrange multiplier. Her equation 7.11 is a bit different from equation 2, but equation 2 seems correct based on other sources, assuming that `mu` is the Lagrange multiplier rather than "a process mean".
I have looked at the code for the related functions in the package. As far as I can tell, the code seems fine, but I think it is fundamental that you double-check, and I would be grateful if the other reviewers could do the same.
There are some other minor issues with notation. Most summations don't include an upper bound, and `gamma` should be included as the left part of equation 4 to clarify it.
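For readers following the notation discussion, here is a minimal sketch of an ordinary kriging system in which `mu` appears as the Lagrange multiplier that enforces the constraint that the weights sum to one. It is not taken from the package: the function names, the spherical model, and its parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def spherical_semivariance(h, nugget=0.0, sill=1.0, rng=10.0):
    """Spherical semivariogram model; the default parameters are illustrative."""
    h = np.asarray(h, dtype=float)
    inside = nugget + (sill - nugget) * (1.5 * h / rng - 0.5 * (h / rng) ** 3)
    gamma = np.where(h < rng, inside, sill)
    # Semivariance at zero distance is zero by definition.
    return np.where(h == 0, 0.0, gamma)


def ordinary_kriging_weights(known_xy, unknown_xy, model=spherical_semivariance):
    """Solve the ordinary kriging system and return (weights, mu).

    Hypothetical helper: mu is the Lagrange multiplier of the
    sum-of-weights constraint, not a process mean.
    """
    known_xy = np.asarray(known_xy, dtype=float)
    unknown_xy = np.asarray(unknown_xy, dtype=float)
    n = len(known_xy)

    # Distances between known points and from known points to the target.
    d_known = np.linalg.norm(known_xy[:, None, :] - known_xy[None, :, :], axis=-1)
    d_target = np.linalg.norm(known_xy - unknown_xy, axis=1)

    # Semivariance matrix bordered by ones; the last row encodes sum(weights) = 1.
    a = np.ones((n + 1, n + 1))
    a[:n, :n] = model(d_known)
    a[n, n] = 0.0

    b = np.ones(n + 1)
    b[:n] = model(d_target)

    solution = np.linalg.solve(a, b)
    return solution[:n], solution[n]


weights, mu = ordinary_kriging_weights(
    [[0, 0], [1, 0], [0, 1], [1, 1]], [0.5, 0.5]
)
```

The last entry of the solution vector is the multiplier; the first `n` entries are the kriging weights, which sum to one by construction.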
The second issue relates to the Ordinary Kriging of meuse dataset example in the additional materials. The example is a very good showcase of the package, and the R `gstat` package is a great choice. However, the scatterplot and final maps illustrate patterns (highlighted in red in the images below) in the `pyinterpolate` output that I believe would require further investigation. Furthermore, the two error maps seem to produce a similar spatial pattern but with different distributions. However, that is difficult to assess without a legend. I suggest adding a legend for each one of the four maps in the notebook and further investigating the difference and patterns in the errors.
@sdesabbata could you please tick the boxes above so that we can clearly identify what is left?
And the code has been checked already by the other reviewer, so I assume this is okay.
@SimonMolinsky can you address the notation issue raised? And I agree that adding legends to the maps is required.
@sdesabbata,
Thank you for your comments. You were right with the description of Lagrange Multiplier, it was a mistake that I've described this parameter as a "process mean". I've checked code to be sure that the implementation doesn't use any "mean" too :) and there is a correct Lagrange Multiplier in the final equation of Ordinary Kriging.
I'll change the descriptions and notation, add legends to the maps, and validate those patterns within a few days.
Thank you @SimonMolinsky. I have re-run the example and added the legends to better review the results. I have created a pull request for the paper repo.
@hugoledoux Based on the replies above, the presence of those patterns is the only concern I have at the moment.
@sdesabbata Thanks a lot! So now I'll investigate those patterns.
Hi @sdesabbata ,
I've found the root cause of the correlogram patterns. It is related to the theoretical variogram parameters (sill and range). If I use the parameters from the `gstat` package, I obtain a much closer correlogram. I've updated the notebook with it; you can compare both theoretical models: https://github.com/SimonMolinsky/pyinterpolate-paper/blob/main/paper-examples/example-gstat-comparison/Ordinary%20Kriging%20of%20Meuse%20dataset.ipynb
I think that the remaining differences come from the nearest-neighbor selection and a different lag calculation. In `gstat`, the point pairs for a lag are selected from the space between two distances (lags), and the distances between lags may differ in size. In `pyinterpolate`, the point pairs for a lag are taken from the interval that runs from halfway to the previous lag to halfway to the next lag, so the lag is the distance in the middle of this interval. Those intervals are all equal.
I don't know if I should force the package to work as `gstat` does. I will ensure that the user can provide their own list of lags (this seems more important at this stage).
I'm more concerned about the error output (patterns in the error maps). I must investigate it further, especially why the changes in the patterns are so abrupt. To do so, I must look into the `gstat` implementation of Kriging. Again, I think that this issue may be related to the neighbor selection in `pyinterpolate`.
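The two binning schemes described in this comment can be sketched as follows. This is only an illustration based on the description in the thread, not code from either `gstat` or `pyinterpolate`; all function names and the toy data are hypothetical.

```python
import numpy as np


def pair_distances_and_sq_diffs(points):
    """points: rows of [x, y, value]. Returns distances and squared value
    differences for every unique pair of points."""
    points = np.asarray(points, dtype=float)
    i, j = np.triu_indices(len(points), k=1)
    dists = np.linalg.norm(points[i, :2] - points[j, :2], axis=1)
    sq_diffs = (points[i, 2] - points[j, 2]) ** 2
    return dists, sq_diffs


def semivariance_centered_bins(points, step, max_range):
    """Equal-width bins; each lag is the midpoint of [lag - step/2, lag + step/2)."""
    dists, sq_diffs = pair_distances_and_sq_diffs(points)
    results = []
    for lag in np.arange(step, max_range + step / 2, step):
        mask = (dists >= lag - step / 2) & (dists < lag + step / 2)
        gamma = 0.5 * sq_diffs[mask].mean() if mask.any() else np.nan
        results.append((float(lag), float(gamma), int(mask.sum())))
    return results


def semivariance_boundary_bins(points, boundaries):
    """Bins between consecutive boundaries (lower, upper]; widths may differ."""
    dists, sq_diffs = pair_distances_and_sq_diffs(points)
    results = []
    for lower, upper in zip(boundaries[:-1], boundaries[1:]):
        mask = (dists > lower) & (dists <= upper)
        gamma = 0.5 * sq_diffs[mask].mean() if mask.any() else np.nan
        results.append((float(lower), float(upper), float(gamma), int(mask.sum())))
    return results


# Toy data on a line: [x, y, value]
toy_points = [[0, 0, 1], [1, 0, 2], [2, 0, 4], [3, 0, 7]]
centered = semivariance_centered_bins(toy_points, step=1.0, max_range=3.0)
bounded = semivariance_boundary_bins(toy_points, boundaries=[0.0, 1.5, 3.0])
```

With the same points, the two schemes group different point pairs into each bin, so the experimental semivariances and the fitted sill and range can differ even when the Kriging equations are identical.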
Hi @sdesabbata ,
I've found why the error map is wrong - and it affects much more than errors only. It is related to the function that picks the closest neighbors:
```python
def prepare_kriging_data(unknown_position, data_array, number_of_neighbours=10):
    """
    Function prepares data for kriging - array of point position, value and distance to an unknown point.

    INPUT:

    :param unknown_position: (numpy array) position of unknown value,
    :param data_array: (numpy array) known positions and their values,
    :param number_of_neighbours: (int) number of the closest locations to the unknown position.

    OUTPUT:

    :return: (numpy array) dataset with position, value and distance to the unknown point:
        [[x, y, value, distance to unknown position], [...]]
    """

    # Distances to unknown point
    r = np.array([unknown_position])
    known_pos = data_array[:, :-1]
    dists = calc_point_to_point_distance(r, known_pos)

    # Prepare data for kriging
    kriging_output_array = np.c_[data_array, dists.T]
    kriging_output_array = kriging_output_array[kriging_output_array[:, -1].argsort()]
    prepared_data = kriging_output_array[:number_of_neighbours]

    return prepared_data
```
This function takes the n closest neighbors, regardless of how far away they are, instead of selecting the closest neighbors based on the spatial distance. I've performed a few experiments and obtained results similar to `gstat`. I'll update the package and the paper example accordingly and let you know when it's done. Thanks!
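A distance-based selection of the kind mentioned here could look roughly like the sketch below. It is an assumption about the general idea, not the fix that was actually applied to the package: the function name `select_neighbours_in_range`, the `max_range` and `min_neighbours` parameters, and the fallback behaviour are all hypothetical.

```python
import numpy as np


def select_neighbours_in_range(unknown_position, data_array, max_range, min_neighbours=1):
    """Return rows [x, y, value, distance] for known points within max_range.

    Hypothetical helper: if fewer than min_neighbours points fall inside
    max_range, fall back to the min_neighbours closest points.
    """
    data_array = np.asarray(data_array, dtype=float)
    target = np.asarray(unknown_position, dtype=float)

    # Euclidean distance from every known point to the unknown position.
    dists = np.linalg.norm(data_array[:, :-1] - target, axis=1)

    # Append distances as the last column and sort rows by distance.
    with_dists = np.c_[data_array, dists]
    with_dists = with_dists[with_dists[:, -1].argsort()]

    in_range = with_dists[with_dists[:, -1] <= max_range]
    if len(in_range) < min_neighbours:
        return with_dists[:min_neighbours]
    return in_range


known = np.array([[0, 0, 1], [1, 0, 2], [5, 0, 3], [10, 0, 4]], dtype=float)
near = select_neighbours_in_range([0, 0], known, max_range=2.0)
fallback = select_neighbours_in_range([0, 0], known, max_range=0.5, min_neighbours=2)
```

Here `near` keeps only the two points within a distance of 2, while `fallback` shows the minimum-count guard when the search radius is too small.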
@whedon generate pdf
PDF failed to compile for issue #2869 with the following error:
/app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/lib/whedon.rb:147:in `check_fields': Paper YAML header is missing expected fields: affiliations (RuntimeError)
from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/lib/whedon.rb:89:in `initialize'
from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/lib/whedon/processor.rb:38:in `new'
from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/lib/whedon/processor.rb:38:in `set_paper'
from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/bin/whedon:58:in `prepare'
from /app/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
from /app/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
from /app/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
from /app/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor/base.rb:466:in `start'
from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/bin/whedon:131:in `<top (required)>'
from /app/vendor/bundle/ruby/2.6.0/bin/whedon:23:in `load'
from /app/vendor/bundle/ruby/2.6.0/bin/whedon:23:in `<main>'
@whedon generate pdf
Submitting author: @SimonMolinsky (Szymon Moliński)
Repository: https://github.com/szymon-datalions/pyinterpolate
Branch with paper.md (empty if default branch):
Version: v0.2.5.post1
Editor: @hugoledoux
Reviewers: @chrisbrunsdon, @kenohori, @sdesabbata
Archive: 10.5281/zenodo.6206145
:warning: JOSS reduced service mode :warning:
Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@chrisbrunsdon, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @hugoledoux know.
✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨
Review checklist for @chrisbrunsdon
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @kenohori
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @sdesabbata
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper