ropensci / software-review

rOpenSci Software Peer Review.
292 stars 104 forks source link

mapscanner #330

Closed mpadge closed 4 years ago

mpadge commented 5 years ago

Submitting Author: mark padgham (@mpadge)
Repository: mpadge/mapscanner Version submitted: 0.0.1 Editor: @sckott
Reviewer 1: @potterzot
Reviewer 2: @khondula Archive: TBD
Version accepted: TBD


Package: mapscanner
Title: Print Maps, Draw on Them, Scan Them Back In
Version: 0.0.1
Authors@R: c(
    person("Mark", "Padgham", email="mark.padgham@email.com", role=c("aut", "cre")),
    person("Michael D.", "Sumner", role = "aut", email = "mdsumner@gmail.com"),
    person("Stanislaw", "Adaszewski", role = "cph",
        comment = "author of include concaveman-cpp code"))
Description: Print maps, draw on them, scan them back in, and convert to spatial
    objects.
License: GPL-3 | BSD_2_clause + file LICENSE
Encoding: UTF-8
LazyData: true
NeedsCompilation: yes
Depends: R (>= 3.5.0)
Imports:
    cli,
    curl,
    dplyr,
    fs,
    glue,
    jpeg,
    magrittr,
    magick,
    memoise,
    pdftools,
    png,
    purrr,
    raster,
    Rcpp,
    reproj,
    RNiftyReg,
    sf,
    slippymath,
    tibble,
    RTriangle,
    gibble,
    polyclip,
    unjoin,
    rlang
Suggests: 
    ggplot2,
    knitr,
    lwgeom,
    mapview,
    osmdata,
    rmarkdown,
    testthat,
    spelling
LinkingTo: Rcpp
SystemRequirements:
    C++11
VignetteBuilder: knitr
URL: https://github.com/mpadge/mapscanner
BugReports: https://github.com/mpadge/mapscanner/issues
RoxygenNote: 6.1.1
Language: en-GB

Scope

mapscanner fits nicely into both data extraction and geospatial data. It is definitely geospatial because it concerns maps, and ultimately generates spatial output in standard (sf) format. It is nevertheless primarily a "data extraction" tool that extracts useful (spatially) structured data from unstructured input in either pdf or image formats.

The target audience comprises social researchers interested in spatial phenomena such as subjective perceptions of regions or areas, or perceptions of location. More specifically, these kinds of analyses are currently restricted to (highly) commercial service provision, and the package aims to aid and empower social researchers in financially disadvantageous circumstances by providing analytic abilities that would otherwise be prohibitively expensive.

No R packages whatsoever that we are aware of. Vaguely similar software exists in other languages, such as mapmatcher (in java; long abandoned) and Maplat (.js; mostly in Japanese, and comparable only in algorithmic terms; purpose is entirely different).

Pre-submission issue #322, approved for submission by @noamross :fireworks:

Technical checks

Confirm each of the following by checking the box. This package:

Publication options

JOSS Options - [x] The package has an **obvious research application** according to [JOSS's definition](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements). - [x] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`. - [x] The package is deposited in a long-term repository with the DOI: [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3647922.svg)](https://doi.org/10.5281/zenodo.3647922) - (*Do not submit your package separately to JOSS*)
MEE Options - [ ] The package is novel and will be of interest to the broad readership of the journal. - [ ] The manuscript describing the package is no longer than 3000 words. - [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

annakrystalli commented 5 years ago

Thanks for the submission @mpadge!

I'm just working on assigning an editor. Stay tuned!

sckott commented 5 years ago

Editor checks:


Editor comments

Thanks for your submission @mpadge !

Here's the output from goodpractice. You don't need to address these now, it's info for reviewers to use to get started.

── GP mapscanner ─────
It is good practice to

  ✖ avoid long code lines, it is bad for readability. Also, many people prefer editor windows that are about 80 characters wide. Try make your lines
    shorter than 80 characters

    R/RcppExports.R:19:1

  ✖ avoid 'T' and 'F', as they are just variables which are set to the logicals 'TRUE' and 'FALSE' by default, but are not reserved words and hence can
    be overwritten by the user.  Hence, one should always use 'TRUE' and 'FALSE' for the logicals.

    R/aggregate_polys.R:NA:NA
    R/aggregate_polys.R:NA:NA
 ───────────────

Seeking reviewers now 🕐


Reviewers:

sckott commented 5 years ago

reviewers are @potterzot && @khondula

mpadge commented 5 years ago

Thanks @sckott - I'm really excited about this package, and looking forward to improvements that will invariably arise in response to reviews. Thanks in advance @potterzot and @khondula for volunteering! The goodpractice T/F outputs are indeed false positives; the long line was not, but has since been fixed.

I'm not sure at the mo what might have caused your install issue, but guess some kind of inter-dependency between imported packages? I'll try a clean docker install asap and see if i can replicate.

sckott commented 5 years ago

@potterzot reminder that your review is due soon

potterzot commented 5 years ago

In just under the deadline! @mpadge thank you for your submission, this is a super cool package. We may end up using it in some work we're doing with water rights, so it's great to see! That said, I couldn't actually get it to download maps given a bbox, but looking forward to trying it out once I can.

I printed a USGS topo map and draw a trail on it to check it's application for hiking and was excited with the results. While the package is explicitly aimed at researchers, another use case would be hikers and runners who want to plan a route but don't want to click on an online map. This would let you draw the route, scan it in to the map, and have it uploaded as a shape. If it could be then converted to a GPX file that would be really neat.

I've used the below checklist from the rOpensci review template. Comments at the end are my additions. Please feel free to comment with any clarifying questions.

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [X] A short summary describing the high-level functionality of the software
  • [X] Authors: A list of authors with their affiliations
  • [X] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [X] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

Final approval (post-review)

Estimated hours spent reviewing: 8


Review Comments

General Comments

The package seems great. There are some problems I had with checking and running an example, and some code style differences. I left check boxes above unchecked where there were comments I made to suggest changes. The Statement of Need and Vignette sections above are unchecked beccause the README does not have a clear statement of use, although the vignette does. I would suggest just copying that to the README, or possibly making it a bit simpler as well. The vignette is fine but was not building properly for me. The Functionality and Automated Tasks checkboxes are unchecked because I received an error when trying to run the example from the README, detailed below and I think that automated tests of the API may help with identifying possible issues when attempting to download maps. Finally, Package Guidelines is unchecked because of style discrepancies that are detailed below.

Specific Comments

Errors
File mapshot-polys.png not found in resource path
bbox <- rbind (c (-117.25000, -117.37500),
               c (45.12500, 45.16667))

ms_generate_map (bbox, max_tiles = 16L, mapname = "eagle_cap")
#> Error in curl::curl_download(url = api_query, outfile) : HTTP error 401.

curl::curl_download() works with a simple html file, so it doesn't seem to be curl that is breaking.

Documentation

Is the documentation (installation instructions/vignettes/examples/demos) clear and sufficient? Does it use the principle of multiple points of entry i.e. takes into account the fact that any piece of documentation may be the first encounter the user has with the package and/or the tool/data it wraps?

    Error in curl::curl_download(url = api_query, outfile) : HTTP error 401.
Code Style
Code
sckott commented 5 years ago

thanks for your review @potterzot !

mpadge commented 5 years ago

Thank you very much @potterzot for your very considered and constructive review. Before going any further can I please ask you to:

  1. Forgive my programmatic stupidity for somehow embedding a token <- Sys.getenv() call in the code and not issue any information whatsoever about the necessity or assumed name of associated token. I've rectified that already, so the code now explicitly informs you of the necessity of such a token and expected naming conventions, and should no longer fail once you've set it. Please accept my humble apologies for that blundering oversight!
  2. Would you be so kind as to try again with current version to ensure that you can successfully generate a map, and maybe amend your review accordingly, or append any new insights you might have?

(I haven't yet updated the README and associated docs & vignettes to more explicitly inform of the necessity of a mapbox token, but will do so asap.) In all other ways, your review is very helpful, and I concur with all points you raise, including my ill-chosen use of "scan" throughout. I'll provide a full response as soon as I return from current travels in 2 weeks or so. Note also that your vision of additional use case is brilliant, and one which absolutely should and will be addressed. It's also one which I've been ignoring because of this issue which details how surprisingly difficult it is to extract lines -- and in particular line start and end points -- from raster objects. I really want to solve this, and your use case provides the perfect impetus. Please feel free to chip in any insights you might have on the associated issue as well, because we'd love any and all help in cracking that one. @sckott Any insights from you would also be hugely appreciated!

potterzot commented 5 years ago

@mpadge ah okay, no problem. I have added the code and README/vignette revisions to my review comment and am happy to take a second look after your review response.

Adding line extraction would be really neat, but I also don't think it needs to be done as part of the review process. Maybe a later feature down the road.

sckott commented 5 years ago

@khondula plz and thanks, your review is due in a few days

mpadge commented 5 years ago

Thanks @potterzot for the very productive and helpful review, for which I now offer the following responses, quoting your review comments throughout as appropriate. But first:

need for mapbox API token

The previous inability to use one of the main functions was caused by my hard-coding an API token as environmental variable, as described above. This has been rectified so that the package will automagically work if it detects any kind of environmental variable with "mapbox" or "mapscan" in the name, otherwise it will prompt the user with the following:

"Map generation requires a mapbox API key to be set with Sys.setenv 
with a name that includes either the strings 'mapbox' or 'mapscanner'. 
Tokens can be obtained from 
https://docs.mapbox.com/api/#access-tokens-and-token-scopes"

review responses

There are ... some code style differences.

These should have now been rectified, particularly by rendering the code style of the distinct aggregate_polys.R function to be consistent with the remainder of the package.

The Statement of Need and Vignette sections above are unchecked beccause the > README does not have a clear statement of use, although the vignette does. I > would suggest just copying that to the README, or possibly making it a bit > simpler as well.

and further:

README.md: Some expansion of the text would make the package more friendly. The usage section lacks an initial description of the goal of process so it is a bit confusing what it happening (The opening paragraph of the vignette does this well for eample). It's probably worth explaining what "rectification" is for the casual user. Some sentences are incomplete, e.g. "Package comes with ...". The text in the vignette and some of the function descriptions would also be nice in the README, in accordance with the [principle of multiple points of entry:

The initial vignette sections describing package motivation and principles of usage have been copied to the README as suggested.

The vignette is fine but was not building properly for me... The > Functionality and Automated Tasks checkboxes are unchecked because I received > an error when trying to run the example from the README, detailed below

I presume these issues were because of my previous error in handling mapbox API keys, and hope that it should now build once a key has been set via appropriately named environmental variable. (Sys.setenv ("...MAPBOX...") = <my_key>).

I think that automated tests of the API may help with identifying possible > issues when attempting to download maps.

While the package has 100% coverage, this is admittedly because the bit of code directly responsible for downloading the maptiles is currently not tested. These lines could be mock tested, but would nevertheless require an API key that would have to be hard-coded in one particular way, and would thus still potentially not capture all possible ways by which such keys might be set and used. The new code for dealing with tokens via environmental variables should be failsafe for most use cases, and should issue sufficiently clear guidelines to ensure that tokens are appropriately set, as described above.

Errors

devtools::check() fails with this error when trying to build the vignette: File mapshot-polys.png not found in resource path

When I try to run the code from the README, I get the following error ... > curl::curl_download() works with a simple html file, so it doesn't seem to be > curl that is breaking.

The latter, and as far as I can tell, the former as well, were both caused by previous lack of appropriate, and appropriately described need for, mapbox token.

Documentation

I would also add more information about where the maps are being downloaded from.

Documentation for ms_generate_map() now states that the function:

Generate(s) a map image for a specified area or bounding box, by downloading tiles from \url{https://mapbox.com}.

(See this commit).

Since I got an error (detailed below) when attempting to generate a map, instead I printed a pdf of a map that I had, drew on it, and then scanned it in. The maps are 2.2MB and 9.9MB. The ms_rectify_maps() function ran for three hours and then it was getting close to the time I had to leave so I killed it. It did not exit cleanly. Providing a check interrupt as in this stackexchange would make it easier to recover if it's taking too long. As it is my R session froze and I had to kill Rstudio and start over.

These issues extend from the RNiftyReg package, for which I have opened an issue requesting an interrupt function. There have also been previous observations that functions in this package do not exit cleanly, but causes and solutions unfortunately lie beyond my control in the current package.

What i have nevertheless done is to implement an automatic reduction in size of large images, as detailed in issue #15. I have fixed the size to ensure files submitted to RNiftyReg are <1MB in size, although there could be more flexible ways to do this? Nevertheless, as noted there, RNiftyReg routines are generally designed to only work with images up to a maximum of 2,048 pixels in any one dimension, so this seems a safe approach. I'd welcome any further thoughts or suggestsions @potterzot ...

This may be a GB/US difference. Throughout the package the use of the word "scan" to mean "reading an image file into R" is confusing. In the U.S., most users will take this to mean actually using a scanner. For example, the documentation for ms_rectify_maps() reads "Scan in two pdf or png maps, rectify them ...". But the function isn't scanning anything. This happens in other places such as the README. I would suggest working on package and individual function descriptions to clarify that the package deals with images that have been scanned in but doesn't do any scanning. I.e. the package allows users to converted scanned images into sf objects, but doesn't operate the scanner. In the US "read" would probably best replace "scan".

As far as I could tell, there was only that one mentioned reference to the package itself doing any "scanning". I have now modified that to say that the function can be used to, "Rectify two previously scanned-in pdf or png maps with RNiftyReg" (see this issue). The word "scan" is indeed used to imply actually using a scanner (or camera, or other equipment).

Unexported functions would benefit from comments and/or documentation to explain their purpose. This helps contributors or yourself a year from now. E.g. p_paste and sf_df.

Ping co-author @mdsumner - could you please PR a fix there?

It looks like roxygen documentation is being done in a mix of markdown and the default, e.g. type = "polygons" doesn't render nicely in the help file. Instead of backticks, \code{} will render correctly. Alternatively, you can set roxygen to use markdown formatting by default. If you choose that route, then I think default roxygen styling like \pkg{sf} would have to be changed.

Sorry, I neglected to include the essential Roxygen: list(markdown = TRUE) in the DESCRIPTION. Now fixed in this commit. (And note that \pkg is still necessary to auto-insert the appropriate link to docs of nominated package.)

The note referenced in "see Note" in the ms_rectify_maps() documentation should be under Details. Right now it's included under the Value header.

...

"Automatically save" should be "saved".

Both of those fixed in this commit - thanks for noting!

I was initially confused where the map was stored until I read the second line of the mapname parameter documentation. It would be helpful to add to the function description that it saves the map files in the working directory or optionally in the full path name.

Done in this commit, so that the initial description of ms_generate_maps() now states that a map is, "automatically saved in both .pdf and .png formats, by default in current working directory, or alternative location when mapname includes the full path."

Code Style

The use of a space after the function name, e.g. library (mapscanner) doesn't conform to the rOpensci code style, which is also Hadley's style guide.

Yeah, but it also does not not conform. Like i always say, "white space let's code breathe". I also explicitly mention my white space preferences in CONTRIBUTING.md.

Moreover, in some places there is a space and in others there isn't.

I think I have found and rectified all previous non-spaced instances.

Similarly, if() functions should include {} if they extend to the next line

That's just another personal preference of mine to exclude {} for single-line clauses, but I do ensure that these have free lines above and below to ensure ease of reading. This reduces total code length by 1 line for each of my single-line if clauses.

This is more of a matter of preference, but for functions from packages that are used repeatedly throughout a function, I prefer @importFrom rather than ::. The code is cleaner and easier to read. I would do this with sf and dplyr and possibly purrr.

Sorry @potterzot, but I have to disagree on that one, for reasons that have been very well established in C++. In short, using @importFrom generates potential namespace confusion, while :: remains the only way in R to definitively avoid that.

Code

this "tell Mike" statement should be expanded into an explicit message or removed entirely.

Done, thanks.

Reflecting this comment, handle setting of API key in an informative way.

Done as described at length above.

Line extraction

Quotes from @potterzot:

While the package is explicitly aieed at researchers, another use case would be hikers and runners who want to plan a route but don't want to click on an online map. This would let you draw the route, scan it in to the map, and have it uploaded as a shape. If it could be then converted to a GPX file that would be really neat.

followed by comment:

Adding line extraction would be really neat, but I also don't think it needs to be done as part of the review process. Maybe a later feature down the road.

I will almost certainly defer this until the end of the review process, as extracting continuous lines from raster data is, as I indicated in my previous comment, exceedingly difficult.

khondula commented 5 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [ ] A short summary describing the high-level functionality of the software
  • [ ] Authors: A list of authors with their affiliations
  • [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

Final approval (post-review)

Estimated hours spent reviewing: 10


Review Comments

General

This is a great package with many potential uses and I'm excited to see it being developed! Unfortunately I was not yet able to get the main functionality working using a scanned in document, as every attempt seems to stall out at the "converting to spatial format" step (details below). I look forward to guidance from the authors to help diagnose my issues and get it working. Assuming these can be resolved, my only other concerns are some areas where the documentation could be improved, and a question about whether there is a need for a mapbox account.

Specific comments

R version 3.5.3 (2019-03-11)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
library(rosm)
library(sp)
library(sf)
library(dplyr)

northcaro = system.file("shape/nc.shp", package="sf")
nc_sf <- st_read(northcaro)
myextent <- sp::bbox(as(nc_sf, "Spatial"))
osm1 <- osm.raster(myextent)
raster::plotRGB(osm1)
Error in get_mapbox_token() : 
  Map generation requires a mapbox API key to be set with Sys.setenv with a name that includes either the strings 'mapbox' or 'mapscanner'
In addition: Warning message:
In if (length(tok) != 1 | tok == "") stop("Map generation requires a mapbox API key to be set with ",  :
  the condition has length > 1 and only the first element will be used

Documentation

From readme:

From vignette:

Other notes

In file included from concaveman.cpp:1:
./concaveman.h:587:38: warning: unused typedef 'circ_elem_type' [-Wunused-local-typedef]
    typedef CircularElement<Node<T>> circ_elem_type;
                                     ^
1 warning generated.
Warning: invalid uid value replaced by that for user 'nobody'
Warning: invalid gid value replaced by that for user 'nobody'
In file included from concaveman.cpp:1:
./concaveman.h:587:38: warning: unused typedef 'circ_elem_type' [-Wunused-local-typedef]
    typedef CircularElement<Node<T>> circ_elem_type;
                                     ^
1 warning generated.
Loading mapscanner
Error in getDLLRegisteredRoutines.DLLInfo(dll, addNames = FALSE) : 
  must specify DLL via a “DLLInfo” object. See getLoadedDLLs()
sckott commented 5 years ago

thanks so much @khondula for your review! 🙏

@mpadge all reviews are in now ...

mpadge commented 5 years ago

@khondula Before I provide a full response, I wonder if you would be so kind as to send me copies of your failing maps? I am quite concerned that both reviewers have encountered fundamental failures in package functionality, and would myself be loathe to accept a package regardless of all other positive sentiments if I can't get the basic functionality actually working! :smirk: If they are not too big, you could maybe just drag them into this issue, and they could be deleted again once I've got them? Or maybe email would be easier - check my gh profile for details. Thanks!

mpadge commented 5 years ago

So @khondula very kindly sent me all of the maps mentioned above. All of them were complete breaking cases, and addressing them has helped enormously to make the package much more robust. Lots of the internal mechanisms have been completely reconceived. The package now makes an initial guess at number of items to be extracted, then estimates a signal-to-noise threshold to optimally identify those features. There is also an additional optional parameter nitems which, if specified, overrides the automatic estimation, and can improve results for difficult cases. The following table summarises what now happens for each of the previously failing files:

file result?
choptank-annotated.pdf ok
choptank-annotated.png ok
choptank-annotate2.png ok
choptank-annotate3.pdf ok
choptank-annotate4.pdf ok
choptank-annotate-5.pdf auto-detects only 5 items and kinda fails on one, but works perfectly with nitems = 6
choptank-annotate-line.png works perfectly as polygon (but actual line-detection not yet implemented; see #5)
choptank-annotate-points.png works perfectly as points, hulls, and polygons
omaha-pink2.pdf auto-detects only 1 item but then errors appropriately; specifying nitems = 2 works
omaha-pink.pdf works, but inaccurately because fails to trim border

@khondula could you please confirm that all is okay? And if so, it'd be nice if you could close the corresponding issue on the package repo itself. Thank you so much for helping to greatly improve the package already!


Edit: Once you've confirmed that you are now able to get the package working, I'll respond to the rest of your review straight away :smile:

sckott commented 5 years ago

@khondula can you respond to @mpadge ^^

khondula commented 5 years ago

Hi @mpadge, your updates seem to have worked! All cases were able to convert to spatial format, and the messages regarding nitems and ms_rotate_map were very helpful. I will close issue on package repo.

One new thing I noticed however is that with type = "hulls" (not what I intended but discovered because it's the default!), the output objects seem to be rotated 90 degrees - I think this might be happening here?

Using the images from the example as f_orig and f_mod:

res_hulls <- ms_rectify_maps(f_orig, f_mod, type = "hulls", nitems = 2)
res_polygons <- ms_rectify_maps(f_orig, f_mod, type = "polygons", nitems = 2)

res_hulls is rotated:

res_hulls %>%
  leaflet() %>%
  addTiles() %>%
  addPolygons()
Screen Shot 2019-10-09 at 11 08 48 PM

res_polygons is fine though

res_polygons %>%
  leaflet() %>%
  addTiles() %>%
  addPolygons()
Screen Shot 2019-10-09 at 11 09 02 PM
mpadge commented 5 years ago

Thanks so much for checking @khondula - that's great news that it all works. And sorry about the rotation - it's something I only just discovered and am working on right now. I'd spent so much time on type = "polygons" that I had neglected to ensure "hulls" were also appropriately rotated. Should be fixed very soon ... I'll give a full response to your review asap.


Edit: The rotation issues fixed in the commit referenced in this issue

mpadge commented 5 years ago

Response to review by @khondula

Mostly in the form of quotations of review comments, followed by my own comments and descriptions of responses and associated improvements to the package.

General

This is a great package with many potential uses and I'm excited to see it being developed!

Thanks for the encouraging start - I'm excited about having developed it, and the ongoing improvements brought about by this review process.

Unfortunately I was not yet able to get the main functionality working using a scanned in document, as every attempt seems to stall out at the "converting to spatial format" step (details below). I look forward to guidance from the authors to help diagnose my issues and get it working. Assuming these can be resolved ...

Hopefully all resolved via the above comments in this review, and in issue#17 in the package repo.

... my only other concerns are some areas where the documentation could be improved, and a question about whether there is a need for a mapbox account.

Responded to in detail below.

Specific comments

Would it be possible to return a message if 'set mapbox token' is completed successfully?

Messages on success or fail of set_mapbox_token() added in this commit.

I also prefer not having a space between function names and opening parentheses, but defer to author preference.

Respectfully noted, but i like white space. The style is noted in CONTRIBUTING.md, and exists simply because it is easier to read

this <- function1 (function2 (x))

than

this <- function1(function2(x))

It was unclear to me what the MapBox token is used for. If it is only necessary for acquiring tiles as a rasterbrick, would it be possible to use the rosm package as below, which I don't believe requires signing up for any external account?

This would be a good idea but for two issues which I believe prevent it being able to be implemented:

  1. The rosm package 'will probably be retired in the next year' (comment added on Jan 16, 2019).
  2. That package avoids the use of an API key through using the third-party service a.basemaps.cartocdn.com, which offers only a restricted set of three map styles, none of which are particularly suitable for creating black-and-white maps needed by this package.

Of course it would be a great convenience boost not to require an API key, but at present I see no alternative alas.

I had trouble set the mapbox key as an environment variable in my ~/.Renviron file, error copied below. I did not have any problems using the set_mapbox_token method.

Error in get_mapbox_token() : 
  Map generation requires a mapbox API key to be set with Sys.setenv with a name that includes either the strings 'mapbox' or 'mapscanner'
In addition: Warning message:
In if (length(tok) != 1 | tok == "") stop("Map generation requires a mapbox API key to be set with ",  :
  the condition has length > 1 and only the first element will be used

This was shoddy programming on my part, and has been fixed as detailed in issue #23.

When the modified/annotated map includes polygons but the type argument is "points", would it be possible to include more helpful error message, such as a suggestion to "try type = hull"?

> res <- ms_rectify_maps(f_orig, f_mod, type = "points")
══ mapscanner ══════════════════════════════════════════════════════════════════════════
✔ rectifying the two maps 
✔ extracting drawn objects 
❯ converting to spatial format Error in CPL_geos_op("centroid", x, numeric(0), integer(0), numeric(0),  : 
  Evaluation error: IllegalArgumentException: point array must contain 0 or >1 elements.

I suspect this would have reflected the generally processing workflow that previously failed on most of @khondula's tests. The work undertaken to ensure that these all work should also have ensured that errors like this do not arise.

Documentation

From readme:

"generally be the most convenient for printing, while the rectification itself requires .png-format images" - My understanding is that the conversion to png happens anyway if needed, so this could be taken out of the general intro and only left in the more detailed info.

That comment removed in this commit

For the initial example, is it potentially confusing to have the main function call wrapped in system.time, since that is not necessary for running the function?

Thanks for finding that - the system.time call has been removed in this commit.

"Finally, we can plot the result as an interactive map using packages like mapdeck, or mapview:" Would it be possible to show an example of this code for users unfamiliar with those packages?

Good idea - code added in this commit.

I agree with the statements about the use of paper maps, however wouldn't there also be use cases where digital annotations (e.g. with a stylus, finger, pdf editor) on a phone/tablet/other device could be saved and used with this package? If so, perhaps some language could be included to clarify that is possible as well.

That is of course very important. I had been so busy conceiving of the package as a tool to aid drawing on actual paper that i hadn't even sufficiently considered how direct the transfer is to screen devices. The text of the README has accordingly been updated in this commit, with corresponding changes in the vignette made in this commit.

It is unclear what is meant by a "practical workflow". The current language under 'practical tips' seems to imply that other software packages are meant to not be practical. Would it be possible to either clarify, or leave out the phrase about 'unlike most software packages...'? I appreciate that there is a specific workflow/use case in mind, but it is also clear that there are many more potential applications!

Yes, a very good point. I have simply removed that phrase, so that section now starts, "The mapscanner package is intended to aid a practical workflow, and so a few practical tips may be recommended here ..."

From vignette:

"enables maps to be printed out" - The wording could be more precise here. I believe it would it be more accurate to say something along the lines of "enables printed out maps to be drawn on and..." since the functions aren't actually doing the printing.

Changed to, "enables lines drawn by hand on maps to be converted to spatial objects," with references to printing only hypothetical for paper-based use.

Is it possible to add more details to clarify the distinction between type = hulls vs type = polygons?

Description extended within the vignette in this commit.

Other notes

One warning during local installation:

In file included from concaveman.cpp:1:
./concaveman.h:587:38: warning: unused typedef 'circ_elem_type' [-Wunused-local-typedef]
    typedef CircularElement<Node<T>> circ_elem_type;
                                     ^
1 warning generated.

That warning is because of C++14 interpretation of a C++11-standard usage, and depends on your local compiler, in particular any flags you may have set -- or indeed not set -- in ~.R/Makevars. It's a strictly local issue, not one associated with the package itself.

Warning from github install:

Warning: invalid uid value replaced by that for user 'nobody'
Warning: invalid gid value replaced by that for user 'nobody'

This issue is described in this R-devel mailing list post, and this StackOverflow Q. It can safely be ignored.

Error in devtools::check()

Loading mapscanner
Error in getDLLRegisteredRoutines.DLLInfo(dll, addNames = FALSE) : 
  must specify DLL via a “DLLInfo” object. See getLoadedDLLs()

I guess you're checking on windows? If that problem persists, the usual approach would be to confirm that everything is okay with

R CMD build mapscanner
R CMD check mapscanner_0.X.X.tar.gz

I don't have access to a windows machine to be able to replicate this. @sckott do you perchance have access to such? And if so, would you perhaps be able to run a quick devtools::check() on the repo to confirm or otherwise whether this persists?


concluding comments

Thank you very much @khondula for a very constructive review, and in particular thank you for privately sending me all of your test maps which previously failed, and which led to me finding quite a number of holes and glitches in former code. Fixing all of those has been the single greatest improvement to package functionality since its inception. I remain hugely indebted, and extremely grateful!

sckott commented 5 years ago

thx for your detailed responses @mpadge

@potterzot and @khondula are you happy with the responses to your reviews? any other comments?

potterzot commented 5 years ago

I am happy to confirm being able to use mapscanner based on the documentation provided in the updated README and vignette. Unfortunately one of the tests isn't passing:

test-rectify.R:46: error: rectify
'from' must be a finite number
1: expect_silent(res_p <- ms_rectify_maps(f_orig2, f_modified2, type = "polygons", quiet = TRUE)) at /home/potterzot/reason/projects/ropensci/mapscanner/tests/testthat/test-rectify.R:46
2: quasi_capture(enquo(object), NULL, evaluate_promise)
3: .capture(act$val <- eval_bare(get_expr(.quo), get_env(.quo)), ...)
4: withr::with_output_sink(temp, withCallingHandlers(withVisible(code), warning = handle_warning, 
       message = handle_message))
5: force(code)
6: withCallingHandlers(withVisible(code), warning = handle_warning, message = handle_message)
7: withVisible(code)
8: eval_bare(get_expr(.quo), get_env(.quo))
9: ms_rectify_maps(f_orig2, f_modified2, type = "polygons", quiet = TRUE)
10: rectify_channel(img, f_orig, type = type, n = downsample, concavity = concavity, 
       length_threshold = length_threshold, quiet = quiet) at /home/potterzot/reason/projects/ropensci/mapscanner/R/rectify-maps.R:121
11: polygon_boundaries(channel) at /home/potterzot/reason/projects/ropensci/mapscanner/R/rectify-maps.R:358
12: get_shape_index(cmat_i, x = TRUE) at /home/potterzot/reason/projects/ropensci/mapscanner/R/rectify-maps.R:409
13: seq(min(index) - 1, max(index) + 1) at /home/potterzot/reason/projects/ropensci/mapscanner/R/rectify-maps.R:428
14: seq.default(min(index) - 1, max(index) + 1)
15: stop("'from' must be a finite number")
mpadge commented 5 years ago

Thanks @potterzot. I am unfortunately unable to reproduce that failing test - could you maybe try re-installing and ensure that you get

devtools::load_all (".", export_all = FALSE)
#> Loading mapscanner
packageVersion("mapscanner")
#> [1] '0.0.1.5'

Created on 2019-10-23 by the reprex package (v0.3.0)

All tests current pass on my local machine, on travis, and on appveyor.

khondula commented 5 years ago

All looks good to me - changes work successfully and the devtools::check() error has gone away. Thanks for this package @mpadge !

sckott commented 5 years ago

Thanks reviewers!

I'll do one last check ...

potterzot commented 5 years ago

I've updated all of my R packages, started with a fresh git clone, and am still getting that error and am unable to build the package. I'm not sure why, but given that I'm the only one, this seems like an error on my part. Documenting those issues here in case there is some build difference that is relevant. Or if there's something I'm doing wrong please let me know!

I've updated my review to suggest accepting, great improvements and good package @mpadge!

One suggestion that I have after trying to debug:

Running `R -e 'devtools::test()' produces these errors:

...
test-file-manip.R:14: failure: file manipulation
`ms_rotate_map(f_orig, f_modified)` produced messages.
...
test-rectify.R:50: error: rectify
'from' must be a finite number
1: expect_silent(res_p <- ms_rectify_maps(f_orig2, f_modified2, type = "polygons", quiet = TRUE)) at /home/potterzot/reason/projects/ropensci/mapscanner/tests/testthat/test-rectify.R:50
2: quasi_capture(enquo(object), NULL, evaluate_promise)
3: .capture(act$val <- eval_bare(get_expr(.quo), get_env(.quo)), ...)
4: withr::with_output_sink(temp, withCallingHandlers(withVisible(code), warning = handle_warning, 
       message = handle_message))
5: force(code)
6: withCallingHandlers(withVisible(code), warning = handle_warning, message = handle_message)
7: withVisible(code)
8: eval_bare(get_expr(.quo), get_env(.quo))
9: ms_rectify_maps(f_orig2, f_modified2, type = "polygons", quiet = TRUE)
10: rectify_channel(img, f_orig, type = type, n = downsample, concavity = concavity, 
       length_threshold = length_threshold, quiet = quiet) at /home/potterzot/reason/projects/ropensci/mapscanner/R/rectify-maps.R:121
11: polygon_boundaries(channel) at /home/potterzot/reason/projects/ropensci/mapscanner/R/rectify-maps.R:358
12: get_shape_index(cmat_i, x = TRUE) at /home/potterzot/reason/projects/ropensci/mapscanner/R/rectify-maps.R:409
13: seq(min(index) - 1, max(index) + 1) at /home/potterzot/reason/projects/ropensci/mapscanner/R/rectify-maps.R:428
14: seq.default(min(index) - 1, max(index) + 1)
15: stop("'from' must be a finite number")

Running R CMD build . or R -e 'devtools::build()' produces the following error:

--- re-building ‘mapscanner.Rmd’ using rmarkdown
   Loading required namespace: sf
   File mapshot-polys.png not found in resource path
   Error: processing vignette 'mapscanner.Rmd' failed with diagnostics:
   pandoc document conversion failed with error 99
   --- failed re-building ‘mapscanner.Rmd’
mpadge commented 5 years ago

@potterzot I'm so sorry about your continued troubles there. The commit referenced immediately above solves your latter problem of not being able to install. About your other issue from devtools::test(), i can not unfortunately do much. I have managed to generate (very) occasional problematic behaviour using this command, which gives in my case the very uninformative message,

#> Error in if (length(xx) > 0) { : argument is of length zero

This error is not issued by mapscanner (grep -r "xx" . returns nothing - there is no "xx" in the code), so presumably arises from the depths of devtools::test() itself. Do all tests run okay for you when you run devtools::check()? Or R CMD check?

potterzot commented 5 years ago

I spent some time debugging. The tests in test-rectify.R fail because they lack a "nitems = 2" parameter call and the internal code finds the number of components to be zero. I think that this is due to lines 284-287 of R/rectify-maps.R not allowing for the case in which all values of dn are less than or equal to 0, and as a result thr0 ends up as NA if dn has some 0 values and some negative values. Anyway, if I change line 284 to be <=, the tests pass. I've submitted a PR.

As to why this works fine on travis but was failing on my laptop I can't say. The only thing I could think of was perhaps cpp compilers are different but that seems like an obscure error. My Makevars is completely empty for building this, so I shouldn't have any weird tweaks.

A second issue

In the process I realized the vignette contains a lot of hardcoded outputs. I would suggest making these outputs the actual output from the code block. This bit me because when trying to build the vignette I thought it was reporting the actual output of the code block but it wasn't.

mpadge commented 5 years ago

Thanks @potterzot for the PR fix. That's a great catch that took me a while to understand, but yeah, you're absolutely right that the code without that would have errored whenever all(dn == 0), and that's exactly the fix required. Nice.

As for the second issue, most of the vignette runs except for the bits relating to maps of Chennai. These unfortunately have to have fake output because it just takes too long to compile otherwise, and would almost certainly take too long for CRAN checks. The only other bits that use echo=FALSE are to generate the mapshot images of mapview maps, because these generate local image files on the first run, and then just use these from that point on, to avoid re-generating each time. I think that's okay, but please feel free to suggest any specific alternatives.

potterzot commented 5 years ago

Im happy to recommend this package at this point. Thanks for your work @mpadge!

On Sat, Oct 26, 2019, 08:08 mark padgham notifications@github.com wrote:

Thanks @potterzot https://github.com/potterzot for the PR fix. That's a great catch that took me a while to understand, but yeah, you're absolutely right that the code without that would have errored whenever all(dn == 0), and that's exactly the fix required. Nice.

As for the second issue, most of the vignette runs except for the bits relating to maps of Chennai. These unfortunately have to have fake output because it just takes too long to compile otherwise, and would almost certainly take too long for CRAN checks. The only other bits that use echo=FALSE are to generate the mapshot images of mapview maps, because these generate local image files on the first run, and then just use these from that point on, to avoid re-generating each time. I think that's okay, but please feel free to suggest any specific alternatives.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/330?email_source=notifications&email_token=AADUQ3WFYUGAFRHYAYWWHCLQQRMNBA5CNFSM4IG2KZH2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECKJ6YA#issuecomment-546611040, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADUQ3XXT4XMJ4UUIMMNFBDQQRMNBANCNFSM4IG2KZHQ .

sckott commented 5 years ago

Approved! Thanks @mpadge for submitting and @khondula and @potterzot for your reviews!

A few comments:

To-dos:

For JOSS:

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.

We've put together an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.

mpadge commented 5 years ago

Thanks @sckott, that is indeed fabulous news. Before proceeding any further, however, I've opened an issue in the repo on something that's been bugging me, and which i want to solve before finishing up here - and which you yourself raised:

on check, there's a package size warning on my macos that it's 5.2 mb, maybe compressing some images may bring that below 5?

It's not the images, it's ggplot2. In particular, the three figures in the vignette which illustrate the peripheral yet (in our opinions) rather cool feature of polygon aggregation. ggplot2 simply results in huge figure sizes here, and it seems nothing can be done about that, other than to maybe switch to base graphics. But a solution is needed, because there is already another vignette demonstrating the different mapstyles available, and the most important one in the pipeline to arise from a large sociological study currently being conducted with the package ... but neither of those vignettes can actually be included with the package until we can reduce the package size, which it seems can't be done with ggplot2 being used. Any and all assistance greatly appreciated, even if only advice to ditch gg and revert to good old base.

potterzot commented 5 years ago

@mpadge as far as the ggplot size issue goes, one option may be to use the dpi, height, and width parameters of ggsave() to save the images and then read them in, rather than displaying them directly. I think this would give you control over the file size, but still allow dynamic generation of the figures.

sckott commented 5 years ago

sounds good, thanks @mpadge

mpadge commented 4 years ago

@potterzot @khondula @sckott Finally back on to this :rocket: I've done my best to reduce the figure sizes, tried every and all permutation possible, and have now got it down to what I consider a sufficiently small size (always < 3.5MB installed size on all machines I've checked on). Happy to ... finally and hopefully ... progress to final stages now. Thanks for bearing with me there :smile:

sckott commented 4 years ago

thanks @mpadge

did another check, all looks good to me.

any other comments @khondula or @potterzot ?

potterzot commented 4 years ago

All good for me, thanks @mpadge for your work, and congrats!

sckott commented 4 years ago

@mpadge i think we're all good, i don't think we'll hear from kelly

khondula commented 4 years ago

all good from my end!

sckott commented 4 years ago

ugh, 😢 - foot in mouth - thanks @khondula !

sckott commented 4 years ago

@mpadge are we all set? anything left to do?

mpadge commented 4 years ago

Yeah, I still have to do the paper.md for JOSS. I've been waiting for some empirical results from a currently-running study, but that's not forthcoming, so I'll write an alternative one instead. I'll let you know as soon as that's ready, and then I think we'll be done. Hopefully in just a few days time.

mpadge commented 4 years ago

Okay @sckott, thanks for your patience, and now all done I think, including just updating the last check item in the initial list to enable DOI. Good to go?

sckott commented 4 years ago

yes, good to go!

mpadge commented 4 years ago

@khondula @potterzot Thank you both so much for your very constructive input into the package, which has finally been officially onboarded and transferred to rOpenSci. I'd appreciate it if you'd both be so kind as to consent to being added as reviewers in the package DESCRIPTION file - just a :+1: will do the job, or just let me know if you'd rather not. @sckott Other than that, most of the above item attended to. I'll paste you full list of item and detail everything as soon as that last step is done. Thanks all!

mpadge commented 4 years ago

@sckott Thanks for all of the much appreciated editorial assistance with this package. I'm very excited that it's finally crossed the threshold over to rOpenSci. Here's you checklist of items again to explicitly confirm that I've attended to all points:

That function no longer works, but I gave due credit to both reviewers in the DESCRIPTION.

For JOSS:

Last step not currently possible "Due to Walking Dead World's End" (as sign on a bar in my town helpfully declares).


Other than that, it seems to me that that should be sufficient to consider it successfully onboarded, right? :tada: :man_dancing: Dunno what the status of blog entries and suchlike is in the midst of global restructuring, but whatever can be worked out whenever it might be able to be, i'll be part of it. Thanks!

potterzot commented 4 years ago

Sorry I didn't respond, very happy to have my name included, and congratulations!

On Thu, Mar 19, 2020, 07:15 mark padgham notifications@github.com wrote:

@khondula https://github.com/khondula @potterzot https://github.com/potterzot Thank you both so much for your very constructive input into the package, which has finally been officially onboarded and transferred to rOpenSci. I'd appreciate it if you'd both be so kind as to consent to being added as reviewers in the package DESCRIPTION file - just a 👍 will do the job, or just let me know if you'd rather not. @sckott https://github.com/sckott Other than that, most of the above item attended to. I'll paste you full list of item and detail everything as soon as that last step is done. Thanks all!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/330#issuecomment-601203184, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADUQ3XTLSTPM5SQBUQSK2TRIISJTANCNFSM4IG2KZHQ .

mpadge commented 4 years ago

Yeah, I slightly jumped the gun there and added your name to the DESC before you'd consented - thanks!!

sckott commented 4 years ago

it seems to me that that should be sufficient to consider it successfully onboarded, right?

Yes!

Get the JOSS submission done when you can (after they open back up)

@stefaniebutland i think this package would make a nice blog post. how should mark proceed?

stefaniebutland commented 4 years ago

Sounds good. @mpadge please see our new rOpenSci Blog Guidelines for Authors and Editors for updated instructions for a blog post or tech note.

annakrystalli commented 4 years ago

⚠️⚠️⚠️⚠️⚠️ In the interest of reducing load on reviewers and editors as we manage the
COVID-19 crisis, rOpenSci new submissions for software peer review are paused.

In this period new submissions will not be handled, nor new reviewers assigned.
Reviews and responses to reviews will be handled on a 'best effort' basis, but
no follow-up reminders will be sent. Other rOpenSci community activities continue.

Please check back here again after 25 May when we will be announcing plans to slowly start back up.

We express our continued great
appreciation for the work of our authors and reviewers. Stay healthy and take
care of one other.

The rOpenSci Editorial Board ⚠️⚠️⚠️⚠️⚠️