Closed flor14 closed 3 months ago
Thank you for your detailed review, @vincentvanhees. I will have a look and reply/create PRs before the third week of January.
Thank you for the thorough feedback, @vincentvanhees !
:calendar: @mpaulacaldas you have 2 days left before the due date for your review (2024-01-16).
Hello @flor14 @emilyriederer! The start of the year has been a bit busier than anticipated (also with an unexpected bout of COVID), so I'll have to take a bit of an extension for my review. You will have it by the Friday 19th at the latest. Wishing you all a nice week, and apologies for the small delay!
Don't worry @mpaulacaldas, I am a bit behind as well with the first changes, so probably I will be finishing them when you submit your comments. I hope you are feeling better!
:calendar: @vincentvanhees you have 2 days left before the due date for your review (2024-01-20).
@vincentvanhees and @emilyriederer
I have been reviewing @vincentvanhees comments, and here are the actions I have taken to address most of them. There is pending section at the end of this issue, as I think that the final details should be addressed after Maria Paula's review to avoid overworking.
The revision has been quite detailed, so I want to express my gratitude again for all the comments provided.
create_your_own_quadkeys.Rmd
, which I changed to from_quadkey_data_to_raster.Rmd
. The only difference between the second and third vignettes is whether the data comes from Facebook or not, but the process is quite similar otherwise. Fernandez-Lab-WSU/quadkeyr@c34a2a61.check()
fail so I replaced numbers with letters). Those changes are here: Fernandez-Lab-WSU/quadkeyr@9c81a05pkgdown
site's first page as the README, as it seems neater. You can see the changes here:Fernandez-Lab-WSU/quadkeyr@662bb73get_grid_from_quadkeys.Rmd
(now called B-from_quadkey_data_to_raster.Rmd
):tileX
and tileY
in the first vignette. I will re-read it at the end of the revision again but the changes are here: Fernandez-Lab-WSU/quadkeyr@b3f1fbcyaml
as you suggested never worked, I suspect that it has to do with the fact that a vignette is a simplified RMarkdown file with fewer functionalities.
The documentation was improved and more images were added Fernandez-Lab-WSU/quadkeyr@b3f1fbccreate_rasters_from_grid.Rmd
(now called C-from_facebook_quadkey_data_to_raster.Rmd
)remotes
in the README
. Fernandez-Lab-WSU/quadkeyr@6716486quadkey_conversion.Rmd
(now called A-bing_map_tile_system_functions.Rmd
):I wrote a note in the pending section to double-check that I have added all the comments for the final review.
To guide the user better about potential performance issues, the app now displays the number of QuadKeys, as you can see in this image:
Also, I have added a specific vignette for the app functionality only: Fernandez-Lab-WSU/quadkeyr@5749abc.
In case it wasn't noticed, there has always been a notification regarding the size of the grid and standard computing times for the user.
I have read about the use of the inst/
folder for Shiny apps in packages for example in this R community thread and this blogspost. I understand that locating the Shiny app in inst/
is a more organized way of structuring a package when it has multiple functions and is not a standalone app.
Also, when you create the tests with shinytest2
in a package structure the documentation guides you in both solutions, using the inst/
folder or not.
usethis
functions, so 2.10 was selected automatically.See pending actions.
pkgcheck
is passingFeel free to reach out if you have any questions or need further clarification.
Thanks @flor14. I will have a closer look when you have also received and processed comments from the María. Regarding the auto-numbering: I have not had this problem myself and doubt vignettes as such prohibit this. See (link to example of my vignettes), maybe you could have a look whether I am doing something very different than what you have been trying.
Oh, I tried something similar and it didn't work. I would double-check it then, thank you @vincentvanhees.
@flor14 @emilyriederer @vincentvanhees - With apologies for the delay, I leave below my review. As you will see, it is extensive, so please feel free to ignore the more minor suggestions. I'm happy to discuss any points that are unclear or help brainstorm solutions, so please don't hesitate!
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).Estimated hours spent reviewing: 5
Thank you very much for the opportunity to review this package. As someone who has struggled with quadkeys in the past, I think this package addresses a very real gap in the R spatial software space, and I believe it will be useful to many social scientists.
I leave below detailed comments on the different aspects of the package. The review is quite extensive, so please don’t hesitate to ignore the more minor suggestions. Likewise, I’m happy to discuss any points that are unclear or help brainstorm solutions.
This is an excellent README. If I could provide only one minor comment, it would be to put the section “What can this package do for you?” before “What are quadkeys and tile maps?”
Overall, I find the vignettes are informative and easy to understand, with truly excellent pedagogical figures.
I leave below my comments for possible improvements, to complement some of the points raised previously by Vincent:
On C-from_facebook_quadkey_data_to_raster.Rmd
:
apply_weekly_lag()
does not contain an hour
column, that it needs for it to run."../geocovid/data/rasters/city/"
– a relative path only available in the developer’s computer. I see two ways to fix this issue:
inst
and access it using system.file()
or fs::path_package()
, as suggested in the data chapter of the R packages book.sf
object, it is a bit difficult to parse). Explicitly mentioning these numbers may make it easier for readers to understand what you mean by “completing the grid”.inner_join()
will get rid of all the quadkeys from polygrid
that are implicitly missing from files
. Perhaps it would be useful to re-instate that the grid expansion is performed because creating the polygons (and later the rasters) in this implementation requires a full grid.On B-from_facebook_quadkey_data_to_raster.Rmd
:
quadkey_grid_from_bbox.Rmd
for example) or something along those lines. In the end, the main difference between this vignette and C-from_facebook_quadkey_data_to_raster
is that you are showing how to build your own quadkey grid over a given extent, but all other steps (i.e. going from a data set with quadkeys to a raster) are the same. I imagine most users will end up having a workflow like C-from_facebook_quadkey_data_to_raster
, but only few will need to build their own quadkey grid.xmin
, xmax
, ymin
and ymax
refer to the coordinates of the bounding box, following the common notation from sf
. If so, I would advise using the term bounding box explicitly, since it is a term that most spatial R users will be familiar with.slippymath
)On A-bing_map_tile_system_functions.Rmd
:
On D-quadkey_visualization_app.Rmd
:
Lastly, and going slightly against @vincentvanhees suggestion (sorry!) I think you should remove the alphabetical tags from the vignettes names. This is because locally, users access vignettes using vignette("file_name_without_extension")
, and longer file names can be harder to type and remember. As someone having working with quadkeys in the past, the most useful vignette seems to be C-from_facebook_quadkey_data_to_raster
, so if you can, I would prioritise making that specific name shorter, while still informative (not an easy task at all!)
Following on the above, it would be useful to add to the function documentation of apply_weekly_lag()
, a line in the Description clarifying that these are specific to the format of FB mobility data.
Is quadkeyr::clip()
supposed to be exported? Its documentation mentions that this is an internal function and doesn’t provide much detail on functionality. It also masks graphics::clip()
, so if you don’t think it will be much use to end-users, I would suggest to keep it internal.If you want to roxygen to produce a documentation file for the function, while preventing the function from being exported, you can use the @keywords internal
tag instead of @export
.
Similar comment for complete_grid_for_polygons()
. Is this function meant to be exported? Are you exporting it because it may be useful for advanced users? If so, it would be useful to mention it in the function documentation.
The code style is at times inconsistent across the documentation (examples and vignettes). For example, =
and <-
both being used for assignment, sometimes even within the same example (e.g. create_raster()
). Using styler::style_pkg()
to catch these may be helpful.
What is the difference between extract_qk_coord()
and extract_tile_coord()
? It is unclear from the documentation and the examples. Doing a quick test, I also get that both functions return the same result, with the exception that extract_qk_coord()
does not require a level
argument.
grid <- create_qk_grid(
xmin = -59,
xmax = -40,
ymin = -38,
ymax = -20,
level = 6
)
tile_result <- extract_tile_coord(data = grid$data, level = 6)
qk_result <- extract_qk_coord(data = grid$data)
waldo::compare(tile_result, qk_result)
#> ✔ No differences
The example in format_fb_data()
is not reproducible for users and is wrapped in a dontrun{}
. Couldn’t you provide an internal data frame mimicking the format of facebook mobility data? The name fb_mobility_file
that you use in the function documentation sounds like a great name for this example data frame. Alternatively, you could also demo how to download the file from a stable link with download.file()
, saving to a temporary directory, and reading from there.
In ground_res()
, it would be useful to add a link to the Microsoft documentation. Same goes for mapscale()
, mapsize()
.
In latlong_to_quadkey()
, do the returned coordinates correspond to the upper-left corner of the pixel? Is this where the caveat from the vignette applies? If so, it wouldn’t hurt to repeat it again in the function documentation. Same comment goes for pixelXY_to_latlong()
, tileXY_to_pixelXY()
It’s a shame that the example of polygon_to_raster()
is not runnable nor reproducible, as this is the example that best describes that the entire workflow. My comments and suggestions to fix it are the same as mentioned above for the C-from_facebook_quadkey_data_to_raster.Rmd
vignette. These would also apply to the function documentation of read_fb_mobility_fiels()
.
There is some inconsistency in the argument names for quadkey_to_tileXY()
and quadkey_to_langlot()
, despite both taking a character quadkey as primary argument. Some suggestions:
qk
to quadkey
in quadkey_to_tileXY()
, for more consistency with quadkey_to_latlong()
The documentation of quadkey_to_tileXY
says the argument needs to be a character vector, but in the example, you use a numeric vector and rely on type coercion. I suggest fixing the example to use an argument, but also making quadkey_to_tileXY
fail explicitly if the input type is not character.
As mentioned above, apply_weekly_lag()
is currently failing for the C-from_facebook_quadkey_data_to_raster.Rmd
vignette. When you implement a fix, it would perhaps be a good idea to add a unit test checking that the function works for the data used in the vignette.
The example provided in complete_grid_polygons()
returns a data frame where the quadkey column is filled with NA
, which does not seem like the intended behaviour. If you fix this, I suggest adding a unit-test based on the example. It should be easy since your example is already in a compact-enough format.
When working though the common use-case of creating a raster from a vector of quadkeys (see reprex below), I noticed two potential improvements for grid_to_polygon()
and regular_qk_grid()
:
grid_to_polygon()
fails unless the input data set contains the variables tileX
, tileY
, quadkey
and geometry
in that specific order. I would suggest mentioning this in the function documentation and to provide a more informative error message.regular_qk_grid()
should warn when the provided grid is already regular, instead of throwing an error. This would make piping quadkeyr
functions easier for users.These comments touch more on the API design choices, which fall a bit out-of-scope of the reviewing guidelines from rOpenSci. I raise them here in case you find them useful, but I completely understand if you prefer not to implement them, since they can be time-consuming :smile:
From the user perspective, I find it a bit difficult to differentiate between the functions that are specific to the FB mobility data (e.g. apply_weekly_lag()
, missing_combinations()
) and those that are more general and could be used in other applications (e.g. create_quadkey_grid()
). Perhaps a way to remove some of this confusion could be to create different sections in the pkgdown website, bundling together functions that are specific to FB mobility data and those that are more general. usethis uses a similar approach, see for example their function reference and how they define the categories via the _pkgdown.yml
. If you want to take it a step further, it could also be useful to use a common prefix for FB-specific operations (e.g. fb_apply_weekly_lag()
, fb_format_data()
).
I find the name of create_raster()
confusing. As a spatial R user, the name create_raster()
leads me to believe that the function returns a raster
Raster*
object, while in reality it returns a stars
object. In my opinion, a slightly different name (e.g. create_stars()
or
create_stars_raster()
) would be slightly less confusing.
Similarly, when I read extract_qk_coord()
or extract_tile_coord()
, I expect a function that behaves similarly to terra::extract()
/raster::extract()
(i.e. which extract the value of a raster based on a polygon or point; functions that take a raster and a polygon/point object as arguments). However, extract_qk_coord()
and extract_tile_coord()
are not doing spatial extractions. Perhaps a name like get_qk_coord()
would be better.
Thank you @mpaulacaldas for your review!
All the comments are relevant, and I have gained new insights after using the package myself. I will probably include some minor changes on my side during the revision. I will open issues so everything is clear.
My goal is to finish this by the end of January.
I am really happy to have such good reviewers.
@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/619#issuecomment-1875919666 time 3
Logged review for vincentvanhees (hours: 3)
@ropensci-review-bot submit review https://github.com/ropensci/software-review/issues/619#issuecomment-1905986027 time 5
Logged review for mpaulacaldas (hours: 5)
@mpaulacaldas and @vincentvanhees - thank you both so much for your time and in-depth, thoughtful reviews! I really appreciate all your time and work on this.
@flor14: please post your response with @ropensci-review-bot submit response <url to issue comment>
if you haven't done so already (this is an automatic reminder).
Here's the author guide for response. https://devguide.ropensci.org/authors-guide.html
Update: I am moving forward with this. I am planning some changes in how the documentation is organized, so it is going to take me a bit longer than I expected originally.
@flor14 - that sounds great! No pressure on the timing; the autoreminders are great to bump the thread in cases when nothing is happening, but definitely appreciate all of the updates your making and don't want to rush things
@ropensci-review-bot check package
Thanks, about to send the query.
:rocket:
Editor check started
:wave:
git hash: 9b7b8c43
Package License: MIT + file LICENSE
The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.
|type |package | ncalls|
|:----------|:-------------|------:|
|internal |base | 135|
|internal |quadkeyr | 65|
|internal |utils | 40|
|internal |graphics | 6|
|internal |stats | 1|
|imports |dplyr | 11|
|imports |sf | 7|
|imports |stars | 3|
|imports |purrr | 1|
|imports |lubridate | NA|
|imports |readr | NA|
|imports |rlang | NA|
|imports |shiny | NA|
|suggests |bslib | NA|
|suggests |DT | NA|
|suggests |ggplot2 | NA|
|suggests |knitr | NA|
|suggests |leaflet | NA|
|suggests |rmarkdown | NA|
|suggests |rnaturalearth | NA|
|suggests |shinytest2 | NA|
|suggests |testthat | NA|
|suggests |tidyr | NA|
|suggests |viridis | NA|
|suggests |withr | NA|
|linking_to |NA | NA|
Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(
list (17), return (17), c (15), data.frame (9), for (9), min (5), seq (5), nrow (4), rbind (4), seq_len (4), expand.grid (3), nchar (3), pi (3), unique (3), abs (2), as.integer (2), by (2), file (2), floor (2), max (2), sign (2), sum (2), as.character (1), as.Date (1), as.numeric (1), atan (1), character (1), colnames (1), exp (1), is.na (1), length (1), list.files (1), log (1), paste0 (1), rev (1), seq_along (1), sin (1), strsplit (1), subset (1), system.file (1)
clip (7), mapsize (7), quadkey_to_tileXY (7), pixelXY_to_latlong (4), missing_combinations (3), pixelXY_to_tileXY (3), quadkey_to_latlong (3), tileXY_to_pixelXY (3), complete_grid_for_polygons (2), create_qk_grid (2), create_stars_raster (2), get_qk_coord (2), grid_to_polygon (2), latlong_to_pixelXY (2), quadkey_to_polygon (2), regular_qk_grid (2), add_regular_polygon_grid (1), apply_weekly_lag (1), format_fb_data (1), get_regular_polygon_grid (1), get_tile_coord (1), ground_res (1), latlong_to_quadkey (1), mapscale (1), polygon_to_raster (1), qkmap_app (1), quadkey_df_to_polygon (1), tileXY_to_quadkey (1)
data (40)
mutate (7), anti_join (2), all_of (1), lag (1)
st_as_sf (4), st_bbox (1), st_sf (1), st_sfc (1)
grid (3), polygon (3)
st_as_stars (1), st_rasterize (1), write_stars (1)
map_dfr (1)
var (1)
base
quadkeyr
utils
dplyr
sf
graphics
stars
purrr
stats
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
The package has: - code in R (100% in 15 files) and - 1 authors - 4 vignettes - 2 internal data files - 8 imported packages - 29 exported functions (median 15 lines of code) - 29 non-exported functions in R (median 26 lines of code) --- Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages The following terminology is used: - `loc` = "Lines of Code" - `fn` = "function" - `exp`/`not_exp` = exported / not exported All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by [the `checks_to_markdown()` function](https://docs.ropensci.org/pkgcheck/reference/checks_to_markdown.html) The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile. |measure | value| percentile|noteworthy | |:------------------------|-------:|----------:|:----------| |files_R | 15| 73.0| | |files_vignettes | 4| 95.3| | |files_tests | 14| 93.8| | |loc_R | 624| 54.5| | |loc_vignettes | 907| 89.4| | |loc_tests | 802| 83.5| | |num_vignettes | 4| 96.6|TRUE | |data_size_total | 2037907| 97.7|TRUE | |data_size_median | 1018953| 98.5|TRUE | |n_fns_r | 58| 61.1| | |n_fns_r_exported | 29| 77.4| | |n_fns_r_not_exported | 29| 51.8| | |n_fns_per_file_r | 2| 40.8| | |num_params_per_fn | 2| 10.4| | |loc_per_fn_r | 18| 54.1| | |loc_per_fn_r_exp | 15| 35.6| | |loc_per_fn_r_not_exp | 26| 73.5| | |rel_whitespace_R | 30| 69.7| | |rel_whitespace_vignettes | 26| 87.8| | |rel_whitespace_tests | 17| 78.2| | |doclines_per_fn_exp | 32| 36.7| | |doclines_per_fn_not_exp | 0| 0.0|TRUE | |fn_call_network_size | 54| 68.0| | ---
Click to see the interactive network visualisation of calls between objects in package
goodpractice
and other checks#### 3a. Continuous Integration Badges [![test-coverage.yaml](https://github.com/Fernandez-Lab-WSU/quadkeyr/actions/workflows/test-coverage.yaml/badge.svg)](https://github.com/Fernandez-Lab-WSU/quadkeyr/actions) [![check-standard.yaml](https://github.com/Fernandez-Lab-WSU/quadkeyr/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/Fernandez-Lab-WSU/quadkeyr/actions) [![pkgcheck](https://github.com/Fernandez-Lab-WSU/quadkeyr/workflows/pkgcheck/badge.svg)](https://github.com/Fernandez-Lab-WSU/quadkeyr/actions) **GitHub Workflow Results** | id|name |conclusion |sha | run_number|date | |----------:|:--------------------------|:----------|:------|----------:|:----------| | 7895702388|pages build and deployment |success |80d094 | 34|2024-02-14 | | 7895653165|pkgcheck |success |9b7b8c | 15|2024-02-14 | | 7895653161|pkgdown |success |9b7b8c | 38|2024-02-14 | | 7895653168|R-CMD-check |success |9b7b8c | 38|2024-02-14 | | 7895653170|test-coverage |success |9b7b8c | 38|2024-02-14 | --- #### 3b. `goodpractice` results #### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/) R CMD check generated the following note: 1. checking installed package size ... NOTE installed size is 11.8Mb sub-directories of 1Mb or more: data 2.3Mb doc 7.7Mb R CMD check generated the following check_fail: 1. rcmdcheck_reasonable_installed_size #### Test coverage with [covr](https://covr.r-lib.org/) Package coverage: 95.66 #### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp) No functions have cyclocomplexity >= 15 #### Static code analyses with [lintr](https://github.com/jimhester/lintr) [lintr](https://github.com/jimhester/lintr) found the following 17 potential issues: message | number of times --- | --- Avoid library() and require() calls in packages | 17
|package |version | |:--------|:--------| |pkgstats |0.1.3.11 | |pkgcheck |0.1.2.15 |
This package is in top shape and may be passed on to a handling editor
Hello, Before addressing the last review, I want to make some general updates regarding the state of the package. NOTE: When referencing changes made to items such as vignettes, that have received multiple changes in the text, I will link the last commit for easier identification.
_pkgdown.yml
, so they appeared in a random order. Now, I have simplified and organized them as follows:
quadkey_to_sf_conversion
quadkey_identified_data_to_raster
facebook_mobility_csvs_to_raster_files
quadkey_visualization_app
I believe these names are much more representative of the content of the vignettes. You can find the last version of the vignettes here: https://github.com/Fernandez-Lab-WSU/quadkeyr/tree/9b7b8c438933022ed81fc1f102e162bff556917a/vignettes
YAML
you sent me, but for some reason, it was not working for me. After some reading, this worked:
---
title: "Generating a Raster Image from Quadkey-Identified Data"
output:
html_document:
toc : true
number_sections: true
toc_depth: 3
pkgdown:
as_is: true
vignette: >
%\VignetteIndexEntry{Generating a Raster Image from Quadkey-Identified Data}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
You can find the last version of the vignettes here: https://github.com/Fernandez-Lab-WSU/quadkeyr/tree/9b7b8c438933022ed81fc1f102e162bff556917a/vignettes
The major change was made in response to Maria Paula’s comment about adding a basic workflow. In particular, I think the major improvement is in the second vignette (quadkey_identified_data_to_raster
):
sf
) and stars
packages and the homogenization of the terminology. For example, to refer to a sf
class data frame we call it sf
POLYGON data.frame or sf
POINT data.frame.inst/extdata
.
Fernandez-Lab-WSU/quadkeyr@7d5d3f9cquadkey_to_polygon()
and quadkey_df_to_polygon()
. One takes a string as an argument, while the other takes a data.frame. Both functions are included in a new file named qk_to_polygon_functions.R
. Fernandez-Lab-WSU/quadkeyr@8c7944a8R/regular_qk_grid.R
that wrap the three major functions of the workflows described in quadkey_identified_data_to_raster
(add_regular_polygon_grid()
) and Facebook_mobility_csvs_to_raster_files.
(get_regular_polygon_grid()
) . Fernandez-Lab-WSU/quadkeyr@3e354aa6polygon_to_raster()
are now available. Fernandez-Lab-WSU/quadkeyr@9eac5fbdadd_regular_polygon_grid()
, the function regular_qk_grid()
can now add rows to a data.frame when adding the extra grid while keeping other variables of interest.create-raster()
to create_stars_raster()
Fernandez-Lab-WSU/quadkeyr@f105177cget_qk_tile()
is now wrapped inside get_qk_coord()
, facilitating its use for other internal functions such as regular_qk_grid()
. Fernandez-Lab-WSU/quadkeyr@605476993create_qk_grid()
. Fernandez-Lab-WSU/quadkeyr@a7a5f32etime
to hour
has been completed in all the related functions and datasets. Fernandez-Lab-WSU/quadkeyr@4bd5cc95quadkey_to_polygon()
now it is possible to visualize the QuadKey as a point and a polygon in the app. f0ba3adReview Comments Thank you very much for the opportunity to review this package. As someone who has struggled with QuadKeys in the past, I think this package addresses a very real gap in the R spatial software space, and I believe it will be useful to many social scientists. I leave below detailed comments on the different aspects of the package. The review is quite extensive, so please don’t hesitate to ignore the more minor suggestions. Likewise, I’m happy to discuss any points that are unclear or help brainstorm solutions.
This is an excellent README. If I could provide only one minor comment, it would be to put the section “What can this package do for you?” before “What are quadkeys and tile maps?”
This one is easy to see in the README: https://github.com/Fernandez-Lab-WSU/quadkeyr Fernandez-Lab-WSU/quadkeyr@6e003884
Overall, I find the vignettes are informative and easy to understand, with truly excellent pedagogical figures. I leave below my comments for possible improvements, to complement some of the points raised previously by Vincent:
On C-from_facebook_quadkey_data_to_raster.Rmd:
This issue has been resolved; some sections were not updated despite not giving errors. I missed the command that allows me to change all the variables in a project, which I have used with other editors. Next time, I will use regex. I am trying to make small commits for easier review, so this might not have helped either, I was expecting you to use the same commit as Vincent. This is the function now and here is one of the commits I used to fix it Fernandez-Lab-WSU/quadkeyr@4bd5cc9.
That is the new name of the vignette now, as you can see in the vignette: https://fernandez-lab-wsu.github.io/quadkeyr/articles/quadkey_identified_data_to_raster.html This is the vignette in the present commit
Thank you for your detailed suggestions, I agree. All the examples of the vignettes are reproducible now, as there have been added example files in
inst/extdata
. c27e452
In steps 2 and 3, it would be useful to explicitly mention in the text that the number of quadkeys went from 150 to 396 (though the info is available in the printed sf object, it is a bit difficult to parse). Explicitly mentioning these numbers may make it easier for readers to understand what you mean by “completing the grid”.
Done in 6e00388
Seeing step 4, it is unclear to me why we bother expanding the grid, if in the end the inner_join() will get rid of all the quadkeys from polygrid that are implicitly missing from files. Perhaps it would be useful to re-instate that the grid expansion is performed because creating the polygons (and later the rasters) in this implementation requires a full grid.
There are 3 reasons for this approach:
- Flexibility with Facebook Mobility Datasets: When receiving a set of Facebook mobility datasets, it's uncertain whether exactly the same QuadKeys will be reported in all of them. To address this, I detect all the QuadKeys reported after reading all the Facebook data frames and convert them to polygons, creating a data.frame containing only the columns
quadkey
andgeometry
that you can later join with the Facebook mobility data. This approach not only streamlines the process for the current datasets but also prepares to process potential future files with unknown QuadKeys for the studied location. Alternatively, users can create the QuadKey polygon grid using the bounding box of the area of interest and then join the data. This option is facilitated by the function create_qk_grid(), and both approaches are now documented in the vignette:- Computational cost - Also, consider the computational cost associated with calculating polygons for each file, especially when many of the files share almost identical QuadKeys. By aggregating and storing the QuadKeys separately, we avoid redundant calculations for overlapping QuadKeys across multiple Facebook files.
- Grid Creation for Raster: The third reason is the necessity of the grid for raster operations, as you mentioned. I've also introduced the function
quadkey_df_to_polygon()
, which allows conversion without completing the grid. I explicitly mention this difference to the user to ensure clarity on the approach being used. You can read the improvements in the new versions of the vignettes that correspond to this commit
The resolution of the last image is off in my display using the RStudio viewer.
I am not very sure about which one is the last image, but I have improved how all the images are rendered, so I hope it is solved.
On B-from_facebook_quadkey_data_to_raster.Rmd:
This vignette suffers the biggest changes. It is what I am saying in the previous point, you can use the bounding box or the list of QuadKeys to calculate this. You can read the improvements in the new versions of the vignettes that correspond to this commit
I tried to keep it exactly as appears in the documentation of Microsoft Bing Tile Maps, but I agree that the connection with other packages is desirable. Done in this commit and I also added a note in the first vignette about both terms being identical.
On A-bing_map_tile_system_functions.Rmd:
This is a minor note, but I think it’s easier (and still correct) to say “quadkey” instead of “quadkey number”.
Bravo for including the caveat in Section 3. Perhaps a more eye-catching title would help users understand the importance of reading this section (e.g. “Caveats when converting coordinates”)
Done, and also I included a call in Roxygen to the vignette. You can read the improvements in the new versions of that vignette that correspond to this commit
On D-quadkey_visualization_app.Rmd: It’s a nice and valuable addition to have a vignette introducing the Shiny app, with screenshots showing common behaviors. I will leave here some minor improvements I think you could make to the app:
I think it would be useful if the Quadkey tab, showed the boundaries of the user-specified quadkey in addition to only the upper-left corner. As an analyst, I often want to visualize the area covered by a specific quadkey more than the coordinates of its extreme points. If you introduce these changes, I think a better name for the tab could be “Visualising a quadkey”.
I think the Grid tab could have a better name, so users know that its goal is to produce a quadkey grid from a set of coordinates and a zoom level. I would suggest “Creating a quadkey grid”.
App changes in Fernandez-Lab-WSU/quadkeyr@f0ba3ad You can try the app in the last commit with all the changes.
Lastly, and going slightly against @vincentvanhees suggestion (sorry!) I think you should remove the alphabetical tags from the vignettes names. This is because locally, users access vignettes using vignette("file_name_without_extension"), and longer file names can be harder to type and remember. As someone having working with quadkeys in the past, the most useful vignette seems to be C-from_facebook_quadkey_data_to_raster, so if you can, I would prioritise making that specific name shorter, while still informative (not an easy task at all!)
I explained this at the beginning. I uodated the
_pakgdown.yaml
in Fernandez-Lab-WSU/quadkeyr@7d5d3f9
Following on the above, it would be useful to add to the function documentation of apply_weekly_lag(), a line in the Description clarifying that these are specific to the format of FB mobility data.
Is quadkeyr::clip() supposed to be exported? Its documentation mentions that this is an internal function and doesn’t provide much detail on functionality. It also masks graphics::clip(), so if you don’t think it will be much use to end-users, I would suggest to keep it internal.If you want to roxygen to produce a documentation file for the function, while preventing the function from being exported, you can use the @keywords internal tag instead of @export.
Yes, I thought about this. Most of the functions in
map_functions.R
could be internal. As I wasn’t sure how potential users could use the package I decided to make available all the functions. However, it is true that among the functions inmap_functions.R
,clip()
is the most difficult to imagine and use in isolation and the masking is giving a justification to make this one internal. Done in Fernandez-Lab-WSU/quadkeyr@c8f97282d
Similar comment for complete_grid_for_polygons()
. Is this function meant to be exported? Are you exporting it because it may be useful for advanced users? If so, it would be useful to mention it in the function documentation.
I agree, this is pending.
The code style is at times inconsistent across the documentation (examples and vignettes). For example, = and <- both being used for assignment, sometimes even within the same example (e.g. create_raster()). Using styler::style_pkg() to catch these may be helpful.
Done in Fernandez-Lab-WSU/quadkeyr@f105177c. This also fixes most of the space issues that @vincentvanhees mentioned.
What is the difference between extract_qk_coord() and extract_tile_coord()? It is unclear from the documentation and the examples. Doing a quick test, I also get that both functions return the same result, with the exception that extract_qk_coord() does not require a level argument.
grid <- create_qk_grid(
xmin = -59,
xmax = -40,
ymin = -38,
ymax = -20,
level = 6
)
tile_result <- extract_tile_coord(data = grid$data, level = 6)
qk_result <- extract_qk_coord(data = grid$data)
waldo::compare(tile_result, qk_result)
#> ✔ No differences
Yes, the difference between these two functions is the input, not the output. Instead of using the QuadKey to get the geographic coordinates, you can use the Tile coordinates. One of the QuadKeys’ properties is that the length of the string provides information about the level of detail, that is why you need to specify the level when starting the conversion from the Tile coordinates. What I think you find weird about this is that I don’t need to define in detail both functions, I can call
get_tile_coord()
insideget_qk_coord()
. That is the change I applied in Fernandez-Lab-WSU/quadkeyr@6054769
The example in format_fb_data() is not reproducible for users and is wrapped in a dontrun{}. Couldn’t you provide an internal data frame mimicking the format of facebook mobility data? The name fb_mobility_file that you use in the function documentation sounds like a great name for this example data frame. Alternatively, you could also demo how to download the file from a stable link with download.file(), saving to a temporary directory, and reading from there.
Done in Fernandez-Lab-WSU/quadkeyr@63c8932
In ground_res(), it would be useful to add a link to the Microsoft documentation. Same goes for mapscale(), mapsize().
Done in Fernandez-Lab-WSU/quadkeyr@2772dd9
In latlong_to_quadkey(), do the returned coordinates correspond to the upper-left corner of the pixel? Is this where the caveat from the vignette applies? If so, it wouldn’t hurt to repeat it again in the function documentation. Same comment goes for pixelXY_to_latlong(), tileXY_to_pixelXY()
Done in Fernandez-Lab-WSU/quadkeyr@157701dba0
It’s a shame that the example of polygon_to_raster() is not runnable nor reproducible, as this is the example that best describes that the entire workflow. My comments and suggestions to fix it are the same as mentioned above for the C-from_facebook_quadkey_data_to_raster.Rmd vignette. These would also apply to the function documentation of read_fb_mobility_fiels().
Done in Fernandez-Lab-WSU/quadkeyr@c27e452
There is some inconsistency in the argument names for quadkey_to_tileXY() and quadkey_to_langlot(), despite both taking a character quadkey as primary argument. Some suggestions:
Making both functions vectorised, so they can each accept a quadkey character vector of any length, OR… Keep the current behavior, but change the argument name of qk to quadkey in quadkey_to_tileXY(), for more consistency with quadkey_to_latlong()
Done but let me think again about it. 🤔
The documentation of quadkey_to_tileXY says the argument needs to be a character vector, but in the example, you use a numeric vector and rely on type coercion. I suggest fixing the example to use an argument, but also making quadkey_to_tileXY fail explicitly if the input type is not character.
Done in Fernandez-Lab-WSU/quadkeyr@f4ecb2ddb and also mentioned as an issue in the quadkeyr repo Fernandez-Lab-WSU/quadkeyr#2
As mentioned above, apply_weekly_lag() is currently failing for the C-from_facebook_quadkey_data_to_raster.Rmd vignette. When you implement a fix, it would perhaps be a good idea to add a unit test checking that the function works for the data used in the vignette.
The problem was really that the data that that function was reading was incorrect, that was why it didn't failed.
The example provided in complete_grid_polygons() returns a data frame where the quadkey column is filled with NA, which does not seem like the intended behaviour. If you fix this, I suggest adding a unit-test based on the example. It should be easy since your example is already in a compact-enough format.
No. By intentionally retaining the NA values in those QuadKeys, you're indicating that these specific QuadKeys serve a purpose in completing polygons but are not considered part of the grid. After calculating the polygons I subset the data and remove them. I added comments to the function documentation to clarify this.
When working though the common use-case of creating a raster from a vector of quadkeys (see reprex below), I noticed two potential improvements for grid_to_polygon() and regular_qk_grid():
grid_to_polygon() fails unless the input data set contains the variables tileX, tileY, quadkey and geometry in that specific order. I would suggest mentioning this in the function documentation and to provide a more informative error message.
For some reason I can’t reproduce this message, the functions work well for me even if I change the order of the columns. Could you share the error message so I can have an idea of what is going on?
regular_qk_grid() should warn when the provided grid is already regular, instead of throwing an error. This would make piping quadkeyr functions easier for users.
Done in Fernandez-Lab-WSU/quadkeyr@a6bdff90
These comments touch more on the API design choices, which fall a bit out-of-scope of the reviewing guidelines from rOpenSci. I raise them here in case you find them useful, but I completely understand if you prefer not to implement them, since they can be time-consuming 😄 From the user perspective, I find it a bit difficult to differentiate between the functions that are specific to the FB mobility data (e.g. apply_weekly_lag(), missing_combinations()) and those that are more general and could be used in other applications (e.g. create_quadkey_grid()). Perhaps a way to remove some of this confusion could be to create different sections in the pkgdown website, bundling together functions that are specific to FB mobility data and those that are more general. usethis uses a similar approach, see for example their function reference and how they define the categories via the _pkgdown.yml. If you want to take it a step further, it could also be useful to use a common prefix for FB-specific operations (e.g. fb_apply_weekly_lag(), fb_format_data()).
This is a great comment!
I find the name of create_raster() confusing. As a spatial R user, the name create_raster() leads me to believe that the function returns a raster Raster* object, while in reality it returns a stars object. In my opinion, a slightly different name (e.g. create_stars() or create_stars_raster()) would be slightly less confusing.
Done in Fernandez-Lab-WSU/quadkeyr@75a1c76aa
Similarly, when I read extract_qk_coord() or extract_tile_coord(), I expect a function that behaves similarly to terra::extract()/raster::extract() (i.e. which extract the value of a raster based on a polygon or point; functions that take a raster and a polygon/point object as arguments). However, extract_qk_coord() and extract_tile_coord() are not doing spatial extractions. Perhaps a name like get_qk_coord() would be better.
Done in Fernandez-Lab-WSU/quadkeyr@f5966b66
@emilyriederer and reviewers
There are 4 comments and 1 bug on the last revision that I have pending, but there are minor ones:
1 - Make the function complete_grid_for_polygon()
internal
2 - Split the reference section regarding what the functions are for.
3 - I want to see if the argument names of quadkey_* functions still make sense after all the changes.
4- Fix a bug in one of the vignettes: the hour
column appears as NA by a parsing error in read.csv
.
I have to leave this for some days now, so I decided to share my comments anyway. I will open issues with these pending items and post them when finished.
Hi @flor14 - thank you so much for the hard work and all of your updates! Your dedication to the work and all of the new developments are really impressive to see. Please keep us posted whenever you prefer the reviewers to take a final look
Vincent, Maria Paula, and @emilyriederer,
First of all, thank you for your patience. Here are the last modifications, the revision is now complete on my end:
I have added a references
section to the _pkgdown.yaml
, using the same names as the vignettes for the titles. All the functions that are not listed here because they weren't explicitly mentioned in the vignettes, have been classified as internal including the datasets and get_tile_coord()
.
This commit addresses @mpaulacaldas comment about the references section and also clarifies what functions are available for their use.
👉🏼 Fernandez-Lab-WSU/quadkeyr@c9fffb789
I kept thinking about the quadkey_*
functions' arguments and found it challenging to get an illustrative argument name for a function that can accept strings or vectors, in comparison to the ones that only accept strings.
I have implemented two modifications regarding this:
1 - I have created an image illustrating this concept and added it to the vignette. Aditionally, I have created it for latlong_to_quadkey()
as well:
2 - I have updated the argument names for quadkey_to_latlong()
to quadkey_data
and keep quadkey
for quadkey_to_tileXY()
.
👉🏼Fernandez-Lab-WSU/quadkeyr@7d0a9da17
Finally, I have run again stylr_pkg()
after most of this changes 👉🏼 Fernandez-Lab-WSU/quadkeyr@0df053ebb
Last but not least, I would like to thank you for the detailed review once again. When I began this particular part of the project in August 2023, I was very new to working with QuadKeys. It has been a very ambitious package from the start, but as the intention is to continue working with this data I know that all of this hard work will pay off. I am quite happy with the results, as we now have a package to condense all that I've learned from my research and from you to share with the rest of the community.
Question: Would it be possible to leave a version of this package in Spanish (I mean, not to have the review in Spanish again, if not leave a version of it in Spanish)? I am not sure if that is possible or if you have any comments/examples about other developers doing something like that.
Thank you! And just in case @ropensci-review-bot check package
Thank you @flor14 ! Your dedication to this ambitious project is very impressive and its great to see all these updates. I know we are actively reviewing packages now in multiple languages, so I will check in with the editorial team regarding best practices for maintaing versions of both.
@mpaulacaldas and @vincentvanhees - can you please take a pass through all these wonderful updates and confirm your approval with this template?
Hi there @mpaulacaldas and @vincentvanhees - just a reminder, at your convenience, could you please review @flor14 's updates and confirm your approval with this template? Per our reviewer guide, the goal of this stage is to ensure you feel your comments have been sufficiently addressed
Sorry, I have been crazily occupied, it is on my to do list for the weekend.
For the past few days, I have been using quadkeyr
and I have added improvements in apply_weekly_lag()
, read_fb_mobility_files()
, and format_fb_data()
along with their tests.
apply_weekly_lag()
was removed. Reducing the amount of loops was one of Vincent's comments.read_fb_mobility_files()
and format_fb_data()
now have an additional argument, making them more flexible for working with real data.
All these changes are available in the read-and-7-day-lag
branch: https://github.com/Fernandez-Lab-WSU/quadkeyr/commits/read-and-7-day-lag/I will not merge these changes because I believe the review has already been addressed, and I don't want to create confusion. However, if after reading the review you still have concerns about these specific functions, you can check that branch.
Thank you.
Great job @flor14! My comments have been addressed satisfactory.
A few remarks for your consideration:
Going back to our correspondence about the html_document
versus html_vignette
: I learnt last week that what I suggested (html_document) is less lightweight than (html_vignette) according to bookdown documentation. So, if this is a concern when the package grows in the future then you may want to reverse to html_vignette
. For now I guess it is not a problem.
The DESCRIPTION file lists you as author and creator but does not list copyright holders (cph
). If this is a hobby project then it is also you, but if someone paid for your time then usually they are the copyright holder. If applicable, listing them gives credit for their support and is helpful to know for potential contributors.
I am not familiar with rOpenSci's release process, but if the acceptance of this package does not generate a tagged release I would recommend creating one in the near future, such that you have a clear reference point for this stage of the development. Don't forget to also describe this as your first release in NEWS.md.
Estimated hours spent reviewing: 1
Thank you @flor14 for all of the detailed and careful changes you made in this last round. With these, I believe you have addressed my key comments in a satisfactory way, and I am happy for to recommend approval.
Like Vincent, I leave here a couple of remarks for future consideration:
I think you may have some confusion on the use of @export
and the @keywords internal
tag. In general, you do not use @keywords internal
for functions that you export with @export
, but I see for example that you are doing so for get_tile_coord()
and complete_grid_for_polygons()
. If you don't intend to export these functions (in other words, you want them to be internal), then you should remove the @export
tag. If on the contrary, you want to export the functions, then you don't need the @keywords internal
tag, since Roxygen builds by default documentation for all exported functions. Right now, you are in strange situation where those two functions are exported, but their documentation is hidden in the pkgdown website.
A minor bug, but the images in your README are not showing up in the pkgdown website.
Estimated hours spent reviewing: 1
Thank you @mpaulacaldas and @vincentvanhees
I will change what you are saying @mpaulacaldas tomorrow morning. The @export
tag was removed in the other internal functions, probably I missed removing them in the last review. Thank you for pointing this out.
Thank you, @mpaulacaldas and @vincentvanhees for your continued, thoughtful engagement! We really appreciate your support
@flor14 - please give me a tag after you've made the last tweaks based on @mpaulacaldas 's feedback, and I think we will be ready to approve quadkeyr
!
I am on hold with this because the researcher I am working with is asking the university about the copyright. Also, I will ask a question in ROpenSci Slack regarding how to specify the authorship, in case you have an opinion about it.
My understanding is that if you work for an institute then usually they are the copyright holder and not the external funder that funded the research.
Also, I will ask a question in ROpenSci Slack regarding how to specify the authorship, in case you have an opinion about it.
You can just do:
Authors@R:
c(person(given = "Florencia",
family = "D'Andrea",
email = "florencia.dandrea@gmail.com",
role = c("aut", "cre"),
comment = c(ORCID = "0000-0002-0041-097X")),
person("Your Institute Name", role="cph"))
EDIT: I now saw the question you posted which is a of a different nature. Agree with Noam that minor contributors do not need to be code contributions, and auth should be reserved for those who made significant contributions.
Thank you @vincentvanhees, you are always ready to answer. The researcher told me that she wanted to ask someone from the university about this, so I will wait anyway. I don't work for any institution, this is just a contract.
The review is finished, I have updated the authors, theNEWS.md
and created a release.
Remaining questions:
Thank you
thanks, there is a typo in my name, it is "van Hees" not "Van Hess".
Oh, I will fix it before the end of the day.
On Thu, Mar 14, 2024 at 12:06 AM Vincent van Hees @.***> wrote:
thanks, there is a typo in my name is "van Hees" not "Van Hess".
— Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/619#issuecomment-1996699081, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHB5G2W4SXN3NAAOJZNWDDDYYFD6TAVCNFSM6AAAAABAIGPWTSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSOJWGY4TSMBYGE . You are receiving this because you were mentioned.Message ID: @.***>
Done! I modified the release so there is not a version with your name incorrect.
Congratulations @flor14 ! Thank you so much for your dedication to this package
@ropensci-review-bot approve quadkeyr
Approved! Thanks @flor14 for submitting and @mpaulacaldas, @vincentvanhees for your reviews! :grin:
To-dos:
@ropensci-review-bot invite me to ropensci/<package-name>
which will re-send an invitation.@ropensci-review-bot finalize transfer of <package-name>
where <package-name>
is the repo/package name. This will give you admin access back.pkgdown
website and are ok relying only on rOpenSci central docs building and branding,
pkgdown
website with a redirecting pagehttps://docs.ropensci.org/package_name
URL
field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar, https://github.com/ropensci/foobar
codemetar::write_codemeta()
in the root of your package.install.packages("<package-name>", repos = "https://ropensci.r-universe.dev")
thanks to R-universe.Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"
-type contributors in the Authors@R
field (with their consent).
Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @ropensci/blog-editors in your reply. They will get in touch about timing and can answer any questions.
We maintain an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding (with advice on releases, package marketing, GitHub grooming); the guide also feature CRAN gotchas. Please tell us what could be improved.
Last but not least, you can volunteer as a reviewer via filling a short form.
A few additional notes @flor14 :
pkgdown
. Meanwhile this project is in development to do something similar for help pages, but to my understanding it's not a stable release yetI will jump here to add that we are discussing multilingual documentation and publishing and sharing projects, ideas, and progress in the #multilingual channel in our Slack. 😃
@ropensci-review-bot finalize transfer of quadkeyr
Transfer completed.
The quadkeyr
team is now owner of the repository and the author has been invited to the team
Date accepted: 2024-03-14
Submitting Author Name: Florencia D'Andrea Submitting Author Github Handle: !--author1-->@flor14<!--end-author1-- Repository: https://github.com/Fernandez-Lab-WSU/quadkeyr Version submitted: Submission type: Standard Editor: !--editor-->@emilyriederer<!--end-editor-- Reviewers: @mpaulacaldas, @vincentvanhees
Archive: TBD Version accepted: TBD Language: en
Scope
Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
Explain how and why the package falls under these categories (briefly, 1-2 sentences):
🗺️ The package offers tools to analyze data reported by QuadKey and convert it to raster images. QuadKeys are strings that encode information about map coordinates and the level of detail of the Bing Maps Tile System.
Anyone trying to analyze data reported in this format. Facebook mobility data, for example, can be reported by QuadKey. This package could help this StackOverflow user or this other one.
The closer R package dealing with this type of data is
slippymath
which has a more general objective.quadkeyr
is only based on Microsoft Bing Maps Tile System documentation and it is focused on raster creation. You can read a list of packages with similar functions in the README references section.pkgcheck
items which your package is unable to pass. 👍🏼 I have fixed all thepkgcheck
comments.Technical checks
Confirm each of the following by checking the box.
check
. The functionspolygon_to_raster
andformat_data
don't have valid roxygen2 examples, andpolygon_to_raster
don't have tests. At least a warning is missing increate_raster
and some functions could be potentially improved in efficiency and documentation (I would like to improve the technical terminology). Despite this, I think it is a good moment to get some feedback as I will be using the package in the next weeks and I will be available to update it.This package:
README
to each of them, not only in thepkgdown
site. They grow in complexity from goals 1 to 3. I think reading them in order should give a good idea about the package's core functionalities and logic.Publication options
stars
andsf
, which are packages that will continue to be active.MEE Options
- [ ] The package is novel and will be of interest to the broad readership of the journal. - [ ] The manuscript describing the package is no longer than 3000 words. - [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)Code of conduct
Thank you!