ropensci / software-review

rOpenSci Software Peer Review.
291 stars 104 forks source link

osmplotr onboarding request #27

Closed mpadge closed 8 years ago

mpadge commented 8 years ago
Produces customisable images of OpenStreetMap data.  Extracts OpenStreetMap data for specified key-value pairs (e.g.  key="building") using the overpass API. Different OSM objects can be plotted in different colours using the function add_osm_objects().  The function group_osm_objects() enables customised highlighting of selected regions using different graphical schemes designed to contrast with surrounding backgrounds.

[added by @richfitz:

Package: osmplotr
Title: Customisable Images of OpenStreetMap Data
Version: 0.1-2
Date: 2016-03-01
Authors@R: person("Mark", "Padgham", email = "mark.padgham@email.com", role = c("aut", "cre"))
Description: Produces customisable images of OpenStreetMap data.  Extracts OpenStreetMap data for specified key-value pairs (e.g.  key="building") using the overpass API. Different OSM objects can be plotted in different colours using the function add_osm_objects().  The function group_osm_objects() enables customised highlighting of selected regions using different graphical schemes designed to contrast with surrounding backgrounds.
Depends: R (>= 3.2.3)
Imports: 
    ggm,
    igraph,
    httr,
    methods,
    osmar, 
    rgeos,
    sp, 
    spatstat, 
    XML,
Suggests: knitr,
    roxygen2,
    rmarkdown,
    devtools,
    maptools,
    RColorBrewer
License: GPL-3
URL: https://github.com/mpadge/osmplotr
LazyData: true
VignetteBuilder: knitr

]

In short: This is the only package that allows OpenStreetMap data to be presented in a visually customisable way.

sckott commented 8 years ago

Reviewers: @jhollist

sckott commented 8 years ago

Thanks for your submission @mpadge ! We've assigned a reviewer and we'll get back to you soon with reviews and continue the discussion from there

sckott commented 8 years ago

@jhollist - hey there, it's been 17 days, please get your review in by Mar 26, thanks :smiley_cat:

mpadge commented 8 years ago

@jhollist Sorry if the constant changes throughout the review period have made your job more difficult. Should pretty much remain as is from here on. Thanks @sckott

jhollist commented 8 years ago

@mpadge I have not done too much yet with the review. Getting ready to start on it right now so the changes haven't been a problem. I hope to have review posted in a day or two.

jhollist commented 8 years ago

@mpadge and @sckott Working through the review, but other stuff has gotten in the way this week. I am about 75% done and will have it posted early next week.

sckott commented 8 years ago

@jhollist - hey there, it's been 21 days, please get your review in soon, thanks :smiley_cat: (ropensci-bot)

sckott commented 8 years ago

(apologies jeff for the 2nd ping, working on making the bot more humane, e.g., we probably shouldn't ping right after you had discussion about almost being done)

jhollist commented 8 years ago

No worries!

On Fri, Mar 25, 2016 at 9:36 PM, Scott Chamberlain <notifications@github.com

wrote:

(apologies jeff for the 2nd ping, working on making the bot more humane, e.g., we probably shouldn't ping right after you had discussion about almost being done)

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/ropensci/onboarding/issues/27#issuecomment-201652399

Jeff W. Hollister email: jeff.w.hollister@gmail.com google voice: 401 326 2531 cell: 401 556 4087

jhollist commented 8 years ago

At long last, here is my review @mpadge!

General comments:

The osmplotr package provides a suite of tools for accessing OpenStreetMap data via the Overpass API and visualizing the resultant data. This is accomplished with some wrappers to the osmar package as well as functions for plotting the resultant spobjects. Kudos on simplifying the interface to osmar (at least it simplified it for the uninitiated (aka. Me!)). This is a really nice addition. Also nice work on providing some useful tools for visualizing the data. I do think this package represents a lot of nice work and will be a useful addition, but it is not yet ready for onboarding for the following reasons:

In addition, I had some other general comments to think about.

Installation:

rOpenSci Package Guidelines:

The comments below refer directly to the rOpenSci package guidelines. I've made suggested changes where appropriate.

Functions:

I've gone through each of the functions below and provided comments on documentation, functionality, ease of use, etc. Where possible I have made comments about the code itself, but given the size of the code base in osmplotr I was unfortunately not able to spend too much time digging into the specifics.

Vignettes:

Estimated Hours

~10

sckott commented 8 years ago

Thanks for the review @jhollist !

jhollist commented 8 years ago

No problem! And if you have any questions @mpadge, just let me know.

mpadge commented 8 years ago

Thanks a lot for the impressively detailed and extremely helpful review @jhollist . I agree with pretty much every issue you raised, and will attend to them all in a revised version asap. Before doing that, I'd appreciate it if you could let me know what you think about one general question regarding the abiding purpose of the package. I think many of the issues you picked up on reflect the fact that I initially designed the package to provide a simple way to plot OSM data, but modified this purpose along the way to provide data visualisation tools (primarily with add_osm_groups and add_osm_surface). In my mind, the package provides two (admittedly potentially confusing) functions: (i) drawing custom maps, and (ii) data visualisation.

You claim that the ability

to highlight selected areas in a different color than the background ... could [be recreated] fairly easily by subsetting areas and plotting those separately with base or ggplot2,

and that you

wouldn't tout this as much.

This is maybe your only claim with which I disagree. I think the add_osm_groups function really does provide a much simpler way to achieve this end result that would manual sub-setting and separate plotting. Moreover, I think the additional add_osm_surface function provides additional ability that would be much more difficult to do any other way.

This latter function was added after writing the vignettes. Please accept my apology for not including it there, but the reason was that I envision re-structuring the vignettes, and thought it better to wait for initial comments before doing that. I propose to ditch the downloading_data vignette as you suggest, and to provide two for the respective purposes of (i) generating custom maps, and (ii) data visualisation, with the second focussing entirely on add_osm_groups and add_osm_surface. Before i head off in this direction, any general comments you might have are likely to be very useful and much appreciated in advance...


Your comments also helped me realise how I might overcome what I see as one big issue in that the package relies on the admittedly unorthodox requirement of manually closing devices at the end of map construction. This could easily be avoided by constructing maps as ggplot objects, however i intentionally avoided this because the package primarily offers the ability to construct maps in discrete sequences, and this kind of discrete sequential construction of graphical objects is in my opinion handled in a far more intuitive and elegant way with graphics. (Consider, for example, that using ggplot to add a layer of data points bearing no relationship with previous data used to generate a map is not exactly trivial, and certainly not intuitive for those not familiar with ggplot.)

Nevertheless, constructing ggplot objects would offer the dual advantage of (i) avoiding having to manually close the graphics objects, and (ii) being able to wrap potentially confusing ggplot code within simple add_-style functions, as is currently done anyway. So I reckon I'll try re-coding all graphics objects as ggplot and see where that takes me.

Note also that I avoided using classes in the code as it stands, because the only direct benefit would be in plot calls, but these only create 'empty' graphics objects anyway, so I don't really see much advantage. Having said that, this might nevertheless remove some of the confusion you highlighted regarding the nature of obj in add_osm_object. I'll have a think about that too...


I'll address all of the other issues you raised in the meantime, including tests, internalising functions, ensuring projections are kept throughout, reducing dependencies. I'll let you know when the code is ready for further perusal.

... and one final comment: add_osm_surface can be extremely slow, and this slowness is the direct result of the way osmar constructs sp objects. I'm working on a separate package to speed this up by writing a direct C++ interface to the overpass API, but this will not be part of this package. Until this is available, add_osm_surface will unfortunately just remain potentially slow. Alas.

/cc @sckott

sckott commented 8 years ago

@mpadge what's up? - regarding the cc

mpadge commented 8 years ago

@sckott just to let you know that i'm onto the revisions

jhollist commented 8 years ago

@mpadge Couple of quick thoughts for you.

  1. add_osm_groups and add_osm_surface: My biggest hang up with these was that I didn't immediately grasp the use case for them and that likely stemmed from lack of examples/vignette. Thus, your plan to focus on these in a new vignette is a good one. As long as you provide some solid use cases and show the value added for using existing tools, then my issues should be taken care of.
  2. ggplot2 and current set-up: I think a ggplot2 implementation would be nice, but if it were me I'd focus that effort on a future release. The dev.off() issue could be handled simply by removing the save to file from plot_osm_basemap() and use dev.copy() to save the plot. An example with iris:
plot(iris$Sepal.Length,iris$Petal.Width)
abline(lm(Petal.Width~Sepal.Length, data=iris))
dev.copy(png,"test.png",width=800)
dev.off()

A function that just provides an easy interface to dev.copy and shuts down that device should do the trick.

Look forward to seeing the revisions!

mpadge commented 8 years ago

Thanks again @jhollist for the extremely helpful review. Below you'll find my detailed responses to all of your concerns. But first a couple of general points for you to note regarding the changes to osmplotr:

  1. I went ahead with your suggestion to ggplot-ify everything, which is why it's taken me quite some time to get back to you. This has, in my opinion, greatly improved the entire functionality of the package (and had the additional, unexpected benefit of making it a lot faster too).
  2. There are now two vignettes named 'making-maps' and 'making-maps-with-data'. The first of these incorporates the former two, and the latter is entirely new, and hopefully provides a lot more clarity and motivation for the package. Aspects of your previous concerns seemed to reflect the fact that I initially built the package with the functionality of the first vignette in mind, and only subsequently extended it to accommodate the data-visualisation aspects covered in the second vignette. It was thus not previously clear exactly what the package was intended to do. Hopefully the new vignettes (along with detailed examples in every function) do the job!

General comments

  • tests: Prior to onboarding a test suite needs to be added.

Test suite added. It's not yet complete, but well on the way, and remains (in my mind) the only major task.

  • examples: Not all functions have examples. These will need to be added. Also, instead of entirely relying on example data installed with package for the examples, it would be nice to have examples that show downloading the data as well. This would make them self contained and show examples of a typical workflow. You could wrap in a dontrun and still use the data included with the package. A lot of this in the vignette, but would be nice to not have to go back and forth to the vignette.

All functions now have examples, and many of them include code for downloading data. Examples were a bit thin in the previous version because Uwe Ligges explicitly told me to remove the \dontrun wrappers, which meant the CRAN check took too long, so I simply removed the examples. Hopefully it'll be CRAN-acceptable now.

  • usability: I have two comments regarding usability. I am not sure these are concerns per se, but something to think about as you continue to develop the package. First, it feels like some of the functions you have exported might be better used as internal functions (see below for suggestions).

Done. Important improvement: thanks!

  • usability (continued) Second, your interface is a departure from the norms (i.e. base or ggplot2). Did you consider creating S3 objects from the OSM data and implementing the plot using a S3 plot method (e.g. something like plot.osmplotr()). With that a user could extract the data they want, then plot it directly and other layers could still be added with add_osm_object()... Another option would have been to implement the viz with ggplot2... Using a more standard interface and/or returning ggplot2 objects would increase the user base for the package.

The relatively long time taken for my response has been because I have rejigged everything to ggplot2. This has hugely improved all aspects of the package, notably including its speed when plotting large maps with lots of data. The previous 'departure from convention' was always my biggest concern, and I think the new ggplot2 implementation is a great solution with lots of additional bonuses that I hadn't anticipated in advance.

  • Make sure projection is included. This should be done for all functions that return sp objects. I re-tested this on some examples and the CRS was included. I forgot to document how I got an object without the CRS. I'll keep trying to reproduce and if I can't, then kudos for keeping this information!

Thanks for the tip! You were right: I had neglected to attach projections to objects returned from highways2polygon (now renamed connect_highways), which was likely what you discovered. Projections now specified and maintained for every single kind of object in all functions.

  • In the description of the package in the README you indicate that osmplotr is for use in Urban areas? Seems like it is useful wherever Open Street Map data is available. Increasingly, that is most places, correct? No need to sell yourself short!

Reference removed. Thanks.

  • License: GPL-3 - MIT preferred for rOpenSci, I believe.

I've not yet changed this for one reason in particular. Future development will require re-writing several ggplot2 functions because this package currently does not permit multiple colour gradients to be applied to a single plot. I will therefore need to rewrite several ggplot2 functions to make osmplotr even more flexible. The crunch: ggplot2 is GPL-2 and cannot simply be copied into an MIT license, so it would have to be at most a combined GPL-2/MIT. I'll await more detailed discussion before changing the license.

rOpenSci Package Guidelines:

  • Code of Conduct: Did not see a Code of Conduct. Be good to add one in. See packaging guide.

Done.

  • Testing with testthat: Currently no test suite. Prior to acceptance at least an initial set of tests should be added. Can be expanded in future versions.

Done exactly as suggested: It's not yet complete, but a good start.

  • Semantic versioning: Package is using semantic versioning. To be nitpicky, think about switching to major.minor.patch instead of major.minor-patch to be consistent with the rest of the rOpenSci packages.

Changed accordingly.

  • Examples: There are many examples, although I noticed a few functions that did not include them.

Done.

  • Package dependencies: As suggested, listed as imports or suggests. One thought I had while installing for the first time was that it seemed a bit heavy (i.e. lots of dependencies). If possible, might be nice to streamline the number of dependencies in future versions. For instance, using wesanderson in the examples is nice, but not necessary as base colors could be used instead.

I've done my best to remove as many dependencies as possible, and believe that it is now down to a minimal necessary number

Functions:

  • add_axes(): Examples work fine and use of function is straightforward. Could possibly shorten argument names by dropping "axis_", although not necessary.

Changed accordingly.

  • add_colourbar(): Example works fine, but given that you state in the description that the function is intended to be used with add_osm_surface() I would suggest a different example that shows the typical usage. Also, since this is intended to be used with add_osm_surface() perhaps it could be made an internal function and then called via arguments from add_osm_surface()? Lastly, function description and "side" param description should be updated since you have implemented plotting the colorbars along the different sides.

Example updated and greatly extended to hopefully make it far clearer and more relevant. It's still called add_colourbar because I think this is more consistent with the other add_ functions, but am open to suggestions for re-naming. It can't be made an internal function because to be useful it has to be overlaid only after all map items have been plotted. The new version also allows a lot of control over appearance, and adding all these parameters to add_osm_surface() would make that function a nightmare. The former side parameter has been changed to a binary vertical parameter.

  • add_osm_groups(): Needs examples. Couple of thoughts based on examples in vignette. First, why have the groups specified by points? The way this is getting used suggests polygons would be a better fit. I'd leave it to the user to get their group polygons in line first. This could let you drop some of the other arguments (i.e. the convex hull stuff) and simplify the function a bit. Second, the primary utility of this function is to add parts of the same OSM data. Given that this is a SpatialPolygonsDataFrame, I would think letting the user subset how they prefer and then adding the resultant subset via add_osm_object() would be better. One less function to maintain and learn! Or am I missing some of the primary use case?

Extensive example now given that hopefully clarifies the primary usage, but perhaps more importantly, a huge portion of new vignette ('making-maps-with-data') devoted to this function.

  • add_osm_object(): Examples work. I found the description of the "obj" parameter a bit confusing. It appears that obj can be SpatialPolygonsDataFrame, SpatialPointsDataFrame or SpatialLinesDataFrame but doesn't necessarily need to be a list (which is how I interpreted it) nor returned from extract_osm_objects() although that may be the typical use. Be more explicit in the description of the "obj" param.

I must admit I have not greatly changed the description of obj, because it does have to be an sp data frame object, and I thought, and still suspect, that's a fairly clear description. Any further advice welcome!

  • add_osm_surface(): Example worked, but as noted was slow. See above for suggested changes to the "obj" descriptions".

ggplot2 has made this function enormously faster! It'll still be a bit slow in interpolating complicated data surfaces, but that can't really be avoided. Nevertheless, I think the slowness of the former version would mostly have arisen in the plotting, and that's now orders of magnitude faster.

  • adjust_colours(): Example is perhaps a bit too minimal. Include a full working example (i.e. as in add_osm_surface()). Also provide a link (i.e. with \code{\link[packagename]{functioname}}) to other functions. For this function link to help on col2rgb().

All suggestions implemented. Previous versions did not have \code because Uwe Ligges told me to remove all \code usage for CRAN submission. I've now reinstated many of these, and hope he won't object this time.

  • click_map(): Add example and given that it is interactive wrap it in a \dontrun{}. I'd rethink function name. True you are clicking on the map, but what you are actually doing is getting a convex hull boundary. Maybe get_convex()? Perhaps this could also be an internal function that gets called where it is needed (i.e. add_osm_groups())?

The entire function has simply been ditched, because it was unlikely to have been particularly useful anyway.

  • colour_mat(): Need examples. With these examples, focus on a typical use case.

Extensive examples now provide use case and, hopefully, broader context.

  • connect_highways(), extract_highway(), extract_highways() ...

All purely internal functions now.

  • extract_osm_objects(): Needs examples. Lot of notes included in the comments of the code. Some of these looked like they should be added into the documentation as well (e.g., lines 102-109). Also, the examples in the vignettes show you using this function and extracting $obj from the resultant list. Requiring this further subsetting confuses to usability a bit. I'd prefer to see the warnings handled in a different way or have the other functions cull the $obj out. You could do this by utilizing S3 objects. Lastly, I am not getting any data in the data slot of the returned SPDF and I get the following error: "Error in as.matrix.data.frame(X) : dims [product 13068] do not match the length of object [13079]" when using raster::print (my preferred print method for sp objects) on the vignette example.

Extensive examples now included. Warnings now handled at the point of httr calls, and so function now returns the object directly. The error you got was a briefly-lived error on my part that has long since been resolved. Sorry about that!

  • get_bbox(): No real issues here, although you could easily get away with dropping this function entirely as sp::bbox() or osmar::corner_bbox() do very similar things. I'd just require output of one (or either) of those in the functions that require the bbox instead of rolling your own version.

This is a good suggestion, but not yet implemented because I was focussing on other things (ggplot-ing everyting). The point of get_bbox() is merely the ability to submit a simple vector, which can't be done with sp::bbox. (And I ultimately aim for osmplotr to be independent of osmar.) get_bbox is quite a bit more flexible--and therefore in my mind easier to use--than sp::bbox, but I'll think about dropping it in future versions.

  • highways2polygon(): Description indicates it will return SpatialLines but @return indicates a data.frame of lat-lon and the function name suggests it will return a polygon. Clean up description and/or @return. Also think about renaming the function as it doesn't currently return polys. Needs examples.

Renamed connect_highways, because that's really what it does, and examples provided including typical map examples to provide broader context.

  • make_osm_map(): Needs examples. Tried my own and failed. Line 86 (ns <- ...) should be set prior to the if() on line 63. As ns is called within that if statement.

Examples provided. Error fixed - again, please accept my apologies for that one!

  • order_lines()

Internalised.

  • osm_structures(): Example errors ...

Not any more.

  • osmplotr(): Ok, not a function, but I REALLY like this use of the package name to facilitate help. Would be a nice addition to link the function names to the help pages with \code{\link{}}.

See my response above to adjust_colours()

  • plot_osm_basemap(): Example works. Leaving the graphics device open and informing users in the function documentation to close the device doesn't seem like the ideal solution. I'd separate this functionality from plot_osm_basemap() and create another function that saves the current plot to file. Look at dev.copy() or think about using ggplot2 for the viz (see general comments for more on this).

... that last comment was the best of all your suggestions. Thank you for the enormous improvements this has made to almost all functionality of the package!

Vignettes:

  • Downloading data ... I think you could do without this vignette.

It's been incorporated within 'making-maps'.

  • Making maps: No problems with this vignette. I think it could possibly be re-named to "Introduction to osmplotr" or something along those lines and have it as the only vignette. Could also wrap in the info currently in the "Downloading Data" vignette at the end of this one if you want the download process for the example data to be captured. You also make a big deal out of being able to highlight selected areas in a different color than the background and that this is not available via other packages. That is not entirely true as you could recreate the plots fairly easily by subsetting areas and plotting those separately with base or ggplot2. I wouldn't tout this as much.

There are now two vignettes, 'making-maps' and 'making-maps-with-data'. The second of these focusses entirely on add_osm_groups() and add_osm_surface(). I hope that this second vignette far more convincingly explains the use of, and advance offered by, these functions. I do not believe that the plots illustrated in this second vignette could be recreated 'fairly easily ... with base or ggplot2'. As evidence I can only say that add_osm_groups() extends over nearly 450 lines of code! I think the ability to easily specify the locations, colours, and other properties of arbitrary numbers of groups is indeed not readily achievable with any other current package. Hopefully the vignettes have greatly clarified typical use and advance offered by osmplotr.

Looking forward to your response to this enormously improved version of osmplotr. Thanks again for the very helpful comments!

jhollist commented 8 years ago

Look forward to digging back into this. Swamped this week, but may have some time later next to take a look. Anxious to see the ggplot2 stuff!

On Wed, Apr 27, 2016 at 6:08 AM, mark padgham notifications@github.com wrote:

Thanks again @jhollist https://github.com/jhollist for the extremely helpful review. Below you'll find my detailed responses to all of your concerns. But first a couple of general points for you to note regarding the changes to osmplotr:

  1. I went ahead with your suggestion to ggplot-ify everything, which is why it's taken me quite some time to get back to you. This has, in my opinion, greatly improved the entire functionality of the package (and had the additional, unexpected benefit of making it a lot faster too).
  2. There are now two vignettes named 'making-maps' and 'making-maps-with-data'. The first of these incorporates the former two, and the latter is entirely new, and hopefully provides a lot more clarity and motivation for the package. Aspects of your previous concerns seemed to reflect the fact that I initially built the package with the functionality of the first vignette in mind, and only subsequently extended it to accommodate the data-visualisation aspects covered in the second vignette. It was thus not previously clear exactly what the package was intended to do. Hopefully the new vignettes (along with detailed examples in every function) do the job!

General comments

  • tests: Prior to onboarding a test suite needs to be added.

Test suite added. It's not yet complete, but well on the way, and remains (in my mind) the only major task.

  • examples: Not all functions have examples. These will need to be added. Also, instead of entirely relying on example data installed with package for the examples, it would be nice to have examples that show downloading the data as well. This would make them self contained and show examples of a typical workflow. You could wrap in a dontrun and still use the data included with the package. A lot of this in the vignette, but would be nice to not have to go back and forth to the vignette.

All functions now have examples, and many of them include code for downloading data. Examples were a bit thin in the previous version because Uwe Ligges explicitly told me to remove the \dontrun wrappers, which meant the CRAN check took too long, so I simply removed the examples. Hopefully it'll be CRAN-acceptable now.

  • usability: I have two comments regarding usability. I am not sure these are concerns per se, but something to think about as you continue to develop the package. First, it feels like some of the functions you have exported might be better used as internal functions (see below for suggestions).

Done. Important improvement: thanks!

  • usability (continued) Second, your interface is a departure from the norms (i.e. base or ggplot2). Did you consider creating S3 objects from the OSM data and implementing the plot using a S3 plot method (e.g. something like plot.osmplotr()). With that a user could extract the data they want, then plot it directly and other layers could still be added with add_osm_object()... Another option would have been to implement the viz with ggplot2... Using a more standard interface and/or returning ggplot2 objects would increase the user base for the package.

The relatively long time taken for my response has been because I have rejigged everything to ggplot2. This has hugely improved all aspects of the package, notably including its speed when plotting large maps with lots of data. The previous 'departure from convention' was always my biggest concern, and I think the new ggplot2 implementation is a great solution with lots of additional bonuses that I hadn't anticipated in advance.

  • Make sure projection is included. This should be done for all functions that return sp objects. I re-tested this on some examples and the CRS was included. I forgot to document how I got an object without the CRS. I'll keep trying to reproduce and if I can't, then kudos for keeping this information!

Thanks for the tip! You were right: I had neglected to attach projections to objects returned from highways2polygon (now renamed connect_highways), which was likely what you discovered. Projections now specified and maintained for every single kind of object in all functions.

  • In the description of the package in the README you indicate that osmplotr is for use in Urban areas? Seems like it is useful wherever Open Street Map data is available. Increasingly, that is most places, correct? No need to sell yourself short!

Reference removed. Thanks.

  • License: GPL-3 - MIT preferred for rOpenSci, I believe.

I've not yet changed this for one reason in particular. Future development will require re-writing several ggplot2 functions because this package currently does not permit multiple colour gradients to be applied to a single plot. I will therefore need to rewrite several ggplot2 functions to make osmplotr even more flexible. The crunch: ggplot2 is GPL-2 and cannot simply be copied into an MIT license, so it would have to be at most a combined GPL-2/MIT. I'll await more detailed discussion before changing the license. rOpenSci Package Guidelines:

  • Code of Conduct: Did not see a Code of Conduct. Be good to add one in. See packaging guide.

Done.

  • Testing with testthat: Currently no test suite. Prior to acceptance at least an initial set of tests should be added. Can be expanded in future versions.

Done exactly as suggested: It's not yet complete, but a good start.

  • Semantic versioning: Package is using semantic versioning. To be nitpicky, think about switching to major.minor.patch instead of major.minor-patch to be consistent with the rest of the rOpenSci packages.

Changed accordingly.

  • Examples: There are many examples, although I noticed a few functions that did not include them.

Done.

  • Package dependencies: As suggested, listed as imports or suggests. One thought I had while installing for the first time was that it seemed a bit heavy (i.e. lots of dependencies). If possible, might be nice to streamline the number of dependencies in future versions. For instance, using wesanderson in the examples is nice, but not necessary as base colors could be used instead.

I've done my best to remove as many dependencies as possible, and believe that it is now down to a minimal necessary number Functions:

  • addaxes(): Examples work fine and use of function is straightforward. Could possibly shorten argument names by dropping "axis", although not necessary.

Changed accordingly.

  • add_colourbar(): Example works fine, but given that you state in the description that the function is intended to be used with add_osm_surface() I would suggest a different example that shows the typical usage. Also, since this is intended to be used with add_osm_surface() perhaps it could be made an internal function and then called via arguments from add_osm_surface()? Lastly, function description and "side" param description should be updated since you have implemented plotting the colorbars along the different sides.

Example updated and greatly extended to hopefully make it far clearer and more relevant. It's still called addcolourbar because I think this is more consistent with the other add functions, but am open to suggestions for re-naming. It can't be made an internal function because to be useful it has to be overlaid only after all map items have been plotted. The new version also allows a lot of control over appearance, and adding all these parameters to add_osm_surface() would make that function a nightmare. The former side parameter has been changed to a binary vertical parameter.

  • add_osm_groups(): Needs examples. Couple of thoughts based on examples in vignette. First, why have the groups specified by points? The way this is getting used suggests polygons would be a better fit. I'd leave it to the user to get their group polygons in line first. This could let you drop some of the other arguments (i.e. the convex hull stuff) and simplify the function a bit. Second, the primary utility of this function is to add parts of the same OSM data. Given that this is a SpatialPolygonsDataFrame, I would think letting the user subset how they prefer and then adding the resultant subset via add_osm_object() would be better. One less function to maintain and learn! Or am I missing some of the primary use case?

Extensive example now given that hopefully clarifies the primary usage, but perhaps more importantly, a huge portion of new vignette ('making-maps-with-data') devoted to this function.

  • add_osm_object(): Examples work. I found the description of the "obj" parameter a bit confusing. It appears that obj can be SpatialPolygonsDataFrame, SpatialPointsDataFrame or SpatialLinesDataFrame but doesn't necessarily need to be a list (which is how I interpreted it) nor returned from extract_osm_objects() although that may be the typical use. Be more explicit in the description of the "obj" param.

I must admit I have not greatly changed the description of obj, because it does have to be an sp data frame object, and I thought, and still suspect, that's a fairly clear description. Any further advice welcome!

  • add_osm_surface(): Example worked, but as noted was slow. See above for suggested changes to the "obj" descriptions".

ggplot2 has made this function enormously faster! It'll still be a bit slow in interpolating complicated data surfaces, but that can't really be avoided. Nevertheless, I think the slowness of the former version would mostly have arisen in the plotting, and that's now orders of magnitude faster.

  • adjust_colours(): Example is perhaps a bit too minimal. Include a full working example (i.e. as in add_osm_surface()). Also provide a link (i.e. with \code{\link[packagename]{functioname}}) to other functions. For this function link to help on col2rgb().

All suggestions implemented. Previous versions did not have \code because Uwe Ligges told me to remove all \code usage for CRAN submission. I've now reinstated many of these, and hope he won't object this time.

  • click_map(): Add example and given that it is interactive wrap it in a \dontrun{}. I'd rethink function name. True you are clicking on the map, but what you are actually doing is getting a convex hull boundary. Maybe get_convex()? Perhaps this could also be an internal function that gets called where it is needed (i.e. add_osm_groups())?

The entire function has simply been ditched, because it was unlikely to have been particularly useful anyway.

  • colour_mat(): Need examples. With these examples, focus on a typical use case.

Extensive examples now provide use case and, hopefully, broader context.

  • connect_highways(), extract_highway(), extract_highways() ...

All purely internal functions now.

  • extract_osm_objects(): Needs examples. Lot of notes included in the comments of the code. Some of these looked like they should be added into the documentation as well (e.g., lines 102-109). Also, the examples in the vignettes show you using this function and extracting $obj from the resultant list. Requiring this further subsetting confuses to usability a bit. I'd prefer to see the warnings handled in a different way or have the other functions cull the $obj out. You could do this by utilizing S3 objects. Lastly, I am not getting any data in the data slot of the returned SPDF and I get the following error: "Error in as.matrix.data.frame(X) : dims [product 13068] do not match the length of object [13079]" when using raster::print (my preferred print method for sp objects) on the vignette example.

Extensive examples now included. Warnings now handled at the point of httr calls, and so function now returns the object directly. The error you got was a briefly-lived error on my part that has long since been resolved. Sorry about that!

  • get_bbox(): No real issues here, although you could easily get away with dropping this function entirely as sp::bbox() or osmar::corner_bbox() do very similar things. I'd just require output of one (or either) of those in the functions that require the bbox instead of rolling your own version.

This is a good suggestion, but not yet implemented because I was focussing on other things (ggplot-ing everyting). The point of get_bbox() is merely the ability to submit a simple vector, which can't be done with sp::bbox. (And I ultimately aim for osmplotr to be independent of osmar.) get_bbox is quite a bit more flexible--and therefore in my mind easier to use--than sp::bbox, but I'll think about dropping it in future versions.

  • highways2polygon(): Description indicates it will return SpatialLines but @return https://github.com/return indicates a data.frame of lat-lon and the function name suggests it will return a polygon. Clean up description and/or @return https://github.com/return. Also think about renaming the function as it doesn't currently return polys. Needs examples.

Renamed connect_highways, because that's really what it does, and examples provided including typical map examples to provide broader context.

  • make_osm_map(): Needs examples. Tried my own and failed. Line 86 (ns <- ...) should be set prior to the if() on line 63. As ns is called within that if statement.

Examples provided. Error fixed - again, please accept my apologies for that one!

  • order_lines()

Internalised.

  • osm_structures(): Example errors ...

Not any more.

  • osmplotr(): Ok, not a function, but I REALLY like this use of the package name to facilitate help. Would be a nice addition to link the function names to the help pages with \code{\link{}}.

See my response above to adjust_colours()

  • plot_osm_basemap(): Example works. Leaving the graphics device open and informing users in the function documentation to close the device doesn't seem like the ideal solution. I'd separate this functionality from plot_osm_basemap() and create another function that saves the current plot to file. Look at dev.copy() or think about using ggplot2 for the viz (see general comments for more on this).

... that last comment was the best of all your suggestions. Thank you for the enormous improvements this has made to almost all functionality of the package! Vignettes:

  • Downloading data ... I think you could do without this vignette.

It's been incorporated within 'making-maps'.

  • Making maps: No problems with this vignette. I think it could possibly be re-named to "Introduction to osmplotr" or something along those lines and have it as the only vignette. Could also wrap in the info currently in the "Downloading Data" vignette at the end of this one if you want the download process for the example data to be captured. You also make a big deal out of being able to highlight selected areas in a different color than the background and that this is not available via other packages. That is not entirely true as you could recreate the plots fairly easily by subsetting areas and plotting those separately with base or ggplot2. I wouldn't tout this as much.

There are now two vignettes, 'making-maps' and 'making-maps-with-data'. The second of these focusses entirely on add_osm_groups() and add_osm_surface(). I hope that this second vignette far more convincingly explains the use of, and advance offered by, these functions. I do not believe that the plots illustrated in this second vignette could be recreated 'fairly easily ... with base or ggplot2'. As evidence I can only say that add_osm_groups() extends over nearly 450 lines of code! I think the ability to easily specify the locations, colours, and other properties of arbitrary numbers of groups is indeed not readily achievable with any other current package. Hopefully the vignettes have greatly clarified typical use and advance offered by osmplotr.

Looking forward to your response to this enormously improved version of osmplotr. Thanks again for the very helpful comments!

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/ropensci/onboarding/issues/27#issuecomment-215038880

Jeff W. Hollister email: jeff.w.hollister@gmail.com google voice: 401 326 2531 cell: 401 556 4087

mpadge commented 8 years ago

fyi @jhollist : I uploaded to CRAN and found out that vignette names are cut to some maximal length, so that 'making-maps' and 'making-maps-with-data' ended up with same names. README still links to those CRAN vignettes, but names are now changed in /vignettes.

jhollist commented 8 years ago

General

Thanks for taking the time to address my comments in the first review. I think it is improved over the first draft. That being said, I still see two distinct use cases for your package: the wrapper to the Overpass API and the viz tools for the resultant sp objects. In my opnion, the API part seems closer to being ready. The viz part is not completely intuitive for me. I can see myself utilizing the tools for accessing the OSM data, but I doubt I would find, again for me, the viz tools to be an improvement over what I would do directly with ggplot2.

Fit

After re-reading the fit criteria, I feel osmplotr is a marginal fit for rOpenSci. This does wrap an API, but the Open Street Map components don't necessarily have an obvious scientific application. If I am missing the obvious (quite possible!) what are some research/science applications of the OSM data? osmplotr does also provide a viz component, but does not provide interactivity nor expand much beyond base/ggplot2 tools, in my opinion.

I can see where this might fit with rOpenSci, but also can see where it doesn't. This isn't meant to be a critique per se, just that the fit is not an obvious one for me and I think others should make that call.

ggplo2 Implementation

While I do like the ggplot2 implementation, a couple of issues arose with how it is currently done.

Scaling of the map

The map scaling isn't working correctly, as the plot fills the graphic device instead of being constrained to the ratio of the bounding box. You have examples where you set the width and height of the output image, but that relies on users to set proper height and width. Since this information is part of getting the data in the first place, relying on additional work by the users isn't necessary.

If I were using the package I'd most likely do something along the lines of (with some obvious clean up needed):

library(osmplotr)
library(ggplot2)
library(maptools)

#University of Rhode Island
bbox <- get_bbox(c(-71.54,41.48,-71.51,41.49)) 

dat_B <- extract_osm_objects (key='building', bbox=bbox)
dat_H <- extract_osm_objects (key='highway', bbox=bbox)

#Fortify data for mapping with ggplot2
dat_B_f <- fortify(dat_B, region="id")
dat_H_f <- fortify(dat_H, region="id")

gmap<-ggplot(dat_B_f, aes(long,lat,group=group)) +
  geom_polygon(colour="gray40") +
  geom_line(data=dat_H_f,aes(long,lat),colour='gray70')+
  coord_equal() 
gmap

ggsave("osm_map.png",gmap,width=8,units="in")

Your ggplot code is using coord_cartesian() which is scaling the axes relative to the device size. coord_equal() is forcing the change in y to be the same as the change in x. While technically incorrect for unprojected coordinates, it is a better option, IMO, than having the axes scaled to the device. The change in x and y is close to being equal for most areas mapped by OSM (poles, not so much).

Saving images

ggplot2 has the ggsave() function. Given that the osmplotr maps are ggplot objects, I would use ggsave() instead of the device and dev.off() that you are currently using. Cleaner implementation (and also a personal choice, so take this for what it is worth: just my opinion!)

Vignettes

A couple of thoughts on the vignettes

  1. As you noted, the names on the vignettes are the same. On the next CRAN release, I'd update these. Maybe an "Intro to osmplotr", and "Customizing osmplotr maps" would capture it?
  2. Navigation in Making Maps vignette not working from within RStudio. Web versions do work.
  3. Clean up comments in vignette: eg "# TODO: Set eval=FALSE before cran re-sub!!"

Summary

  1. I think the wrapper to Overpass included in your pacakge is a nice addition and could see myself using it.
  2. The viz tools are nice, but would require additional learning on my part and don't, in my opinion, provide additional tools not already available from ggplot2.
  3. The fit of osmplotr with the ROpenSci suite is marginal. Perhaps a second set of eyes on this would be useful?

If you have any questions or concerns about my review, let me know.

mpadge commented 8 years ago

@sckott Reviews are in from @jhollist, and have helped improve the package considerably and are much appreciated. He nevertheless expressed an overarching concern that he,

can see where this might fit with rOpenSci, but also can see where it doesn't. This isn't meant to be a critique per se, just that the fit is not an obvious one for me and I think others should make that call.

... to which my general response is that I think that the primary advantage of the package lies in the ability to use OSM data themselves for scientific visualisation. This ability--as I believe I have made clear in the second vignette along with the main git osmplotr readme--is both assuredly scientific, and is in my opinion not readily available using any other current package. @jhollist doubts that he,

would find ... the viz tools to be an improvement over what [he] would do directly with ggplot2,

yet I think that osmplotr is indeed an improvement, and can only reiterate in response to his questioning,

what are some research/science applications of the OSM data?

that I believe the combinations of the second vignette, the git osmplotr readme, and the maps included below convincingly demonstrate the direct scientific utility of osmplotr.

I therefore agree with his final comment that,

Perhaps a second set of eyes on this would be useful?

and so, @sckott, let's do this.

A general response

The primary advance of osmplotr lies in the ability to use actual map objects for scientific visualisation, rather than relying on the otherwise two distinct steps of (i) plotting a map and (ii) overlaying data. The functionality for this is manifest through the two functions add_osm_groups() and add_osm_surface(). There is simply no way for such functionality to be readily reproduced using ggplot2. (And, as I emphasised in a previous response, the simpler of these functions runs to over 400 lines.)

I reiterate my firm convictions that:

  1. This ability to visualise user-provided data using actual map objects rather than simply overlaying data on a map is not possible using any other package. This is a definite advance beyond what is possible using ggplot2, and,
  2. This is surely a 'scientific application'

The concerns of @jhollist seem to be that aspects of the underlying functionality can be readily reproduced directly wtih ggplot2, for which he included some illustrative code. However, reproducing even the first, simplest map displayed on the git readme would require considerably more code. A particularly important consideration in my opinion is that it would simply not be possible for anyone not familiar with the idiosyncrasies of ggplot2 to reproduce a map like this, yet osmplotr enables such map production in a mere handful of lines of code, all of which are in my opinion entirely intuitive and--likely of no little importance for general usage--require no familiarity with ggplot2.

Finally, such simple maps are simply an auxiliary utility beside the main functionality of being able to visualise user-provided data using actual map objects rather than just overlaying layers of visually arbitrary data upon pre-generated maps. Maps such as this: this

and this: this

can surely not be readily reproduced using any other package, and would surely be all but impossible for anyone not deeply conversant with ggplot2. They are also direct visualisations of scientific data (in the first case, of categorical clusters, and in the second, of a spatially interpolated continuous surface), and thus surely demonstrate an appropriate fit for ROpenSci?

Technical responses

Once again @jhollist, your comments are perceptive and helpful, and will assuredly be addressed soon, but it's perhaps better for the moment to get that second pair of eyes on the case.

jhollist commented 8 years ago

@mpadge and @sckott

Just to be clear, I do think this is a nice addition (especially the API access). Hopefully my second response didn't come of as overly harsh! And, @mpadge I do take your point that for those wanting to map OSM data in R but who aren't already familiar with ggplot2 this would be an improvement.

Were this a journal review, I think I'd be on the "Accept, with some reservations" side of things.

Again, thanks for you work on this and also for your thoughtful responses!

sckott commented 8 years ago

I'm having a look now

sckott commented 8 years ago

After discussion, we have determined this is a fit for rOpenSci, similar to other packages in our geospatial set - many of which may not be only of use for science, but can be used for science/science-adjacent work.

Do proceed with technical changes...

mpadge commented 8 years ago

Thanks @sckott for the great news! Minor technical changes in response to the concerns of @jhollist are detailed below under his corresponding section titles

ggplot2 Implementation / Scaling of the map

The map scaling isn't working correctly, as the plot fills the graphic device instead of being constrained to the ratio of the bounding box. You have examples where you set the width and height of the output image, but that relies on users to set proper height and width... Your ggplot code is using coord_cartesian() which is scaling the axes relative to the device size. coord_equal() is forcing the change in y to be the same as the change in x. While technically incorrect for unprojected coordinates, it is a better option, IMO, than having the axes scaled to the device.

Previous usage of ccord_cartesian() changed to coord_equal() so devices are scaled to bounding boxes, although I agree with @jhollist that this is not entirely satisfactory. A better option would be ggplot2::coord_map, but this requires an extra dependency (mapproj::mapproject). I've opened an issue to avoid this dependency, and will get to that asap.

Saving images

ggplot2 has the ggsave() function. Given that the osmplotr maps are ggplot objects, I would use ggsave() instead of the device and dev.off() that you are currently using. Cleaner implementation (and also a personal choice, so take this for what it is worth: just my opinion!)

This only affects the first vignette, which I have now rewritten to illustrate the two ways of saving maps to file. I personally prefer direct device calls, because ggsave() does a bunch of seemingly arbitrary re-scaling, and is NOT guaranteed WYSIWYG.

Vignettes

As you noted, the names on the vignettes are the same. On the next CRAN release, I'd update these. Maybe an "Intro to osmplotr", and "Customizing osmplotr maps" would capture it?

New names implemented, but I avoided calling the first one 'Intro' because (i) I would interpret that to indicate that it described the major functionality, and (ii) it would not be clear how it related to the second vignette. I believe that the chosen names ('basic-maps' and 'data-maps') are more indicative both of the content of these two vignettes and of the relationship between them.

Navigation in Making Maps vignette not working from within RStudio. Web versions do work.

There were some inactive links within the documents which are now all active. (There are also links to CRAN versions of the re-named vignettes which will only work after CRAN re-submission.) The TOCs at the beginning of the vignettes work for me within RStudio, so I'm not sure what might have caused that not to work for @jhollist ?

Clean up comments in vignette: eg "# TODO: Set eval=FALSE before cran re-sub!!"

Done - thanks!


Happy to respond to any further suggestions or issues, and even happier to look forward to imminent transfer of osmplotr to ROpenSci. Thanks again @sckott for the support and encouragement!

jhollist commented 8 years ago

@mpadge

Glad that osmplotr fits. After reading your responses and seeing the past changes you made, I think this will be a nice addition. Good luck moving forward with this!

mpadge commented 8 years ago

thanks a lot @jhollist , and thanks for all of the helpful suggestions and advice. osmplotr has been greatly improved thanks to you. i've really appreciated your input.

sckott commented 8 years ago

Great!

[![ropensci\_footer](http://ropensci.org/public_images/github_footer.png)](http://ropensci.org)
mpadge commented 8 years ago

@sckott sorry, just occurred to me days after doing as you had asked that i forgot to inform you: All done! Thanks!

sckott commented 8 years ago

great! closing