Closed Dr-Nathan-Fox closed 4 years ago
Thanks for your submission, @nfox29! I have a question regarding the relationship of this package and FlickrAPI. A quick review suggests to me that these packages cover different, but complementary parts of the API. Is that correct, or is the difference in how API outputs are handled?
I ask. We generally aim for RO packages to be best-in-category because we want the RO package to be the one we'd recommend for the task. In this case, it seems the best outcome would be to combine the functionality of the two packages, either by joining forces, or you adding the parts of the API that FlickrAPI has, so that we have one best package for accessing Flickr. Would you agree or do you think the packages handle very different applications?
Thank you @noamross for your swift reply. I appreciate that there seems to be similarities between the two packages.
flickrAPI has four functions getExif, getHotTags, getPhotoInfo and getPhotos. Out of those four we currently do not have getExif, getHotTags and getPhotos.
The main function we share is searching for photo metadata. We use the API method flickr.photos.search which allows users to search for photos based on their own arguments including tags and spatial location. flickrAPI on the other hand used the API method flickr.photos.getInfo. The difference with this method is that it only allows user to search for information on one specific image. The argument to search for a specific image by its ID can be incorporate into our method. Second the way that data is returned is handled slightly differently. The use of our method allows users to return all available metadata on an image to a single dataframe, whereas the call made by flickrAPI returns limited data for one of "Location", "Date", "URL", "Tags".
Of the functions that we do not have in our package we have chosen not to include the getHotTags function as this API method does not work Flickr side and will only ever return the same result regardless off the arguments. With the other two, flickrAPI doesn’t utilise the full range of arguments available to the calls (for example the getPhotos function in flickrAPI only has one augment, whereas the API method has many more). If you deem it necessary to include these functions I will add them to this package with the greater functionality available on the API.
Besides that – our package has additional API methods, including downloading images, getting the top tags for locations and assessing related tags. Our package also has some quality of life benefits, such as storing you API key so that you do not have to enter it as an argument to the functions every time.
I will now add the option to search for a photos data via its ID number to the photosearch function, as well as adding the two other functions.
Hi @noamross, to address the issues you raised I have now updated the package to include the functionality that flickrAPI included that we originally did not. Our package can now:
get_photoinfo
functionget_exif
functionphoto_search
functionI have also updated testing, docs and vingettes to match these updates.
I believe our package now covers the complete functionality of flickrAPI, whilst providing additional functions and returning the data in a more complete way. I hope these resolves the issue and the package can be considered for submission. Kindest regards, Nathan
Great, thanks you @nfox29. @sckott will be your editor.
Thanks for your submission @nfox29 !
License: you have GPL
in your license file. I'm not sure I've seen that before. I usually see GPL-2, GPL-3, etc. - see https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Licensing - Did you mean GPL
per se?
Here's the output from goodpractice. If you haven't used goodpractice
it's an R package that checks a number of things with another package - most of which we agree with and want authors to follow. You don't need to address these now, it's info for reviewers to use to get started.
── GP photosearcher ─────
It is good practice to
✖ write unit tests for all functions, and all package code in general. 68% of code lines are covered by test
cases.
R/download_images.R:81:NA
R/download_images.R:82:NA
R/download_images.R:83:NA
R/download_images.R:136:NA
R/download_images.R:137:NA
... and 121 more lines
✖ avoid long code lines, it is bad for readability. Also, many people prefer editor windows that are about
80 characters wide. Try make your lines shorter than 80 characters
R/photo_search.R:108:1
R/photo_search.R:180:1
R/photo_search.R:184:1
R/photo_search.R:190:1
R/utils.R:32:1
... and 1 more lines
───────────
Seeking reviewers now 🕐
Reviewers:
Thanks @sckott to address your comments I have fixed the line length issues. The reason currently some lines aren't covered by tests is due to a Flickr side issue. Currently anything that requires a Flickr location identifier does not work. Therefore both the location_tags
and find_place
functions do not work, as well as searching for photographs with a woeID using the photo_search
function. Because of this any test that are related to those functions are currently being skipped until Flickr resolves the issue. More generally there are also a few lines of code that catch errors that I can not replicated in tests.
Kindest, Nathan
thanks
reviewers assigned: @monicagerber @zkamvar
Hi @nfox29! 👋 I have a question about getting started with your package. In the README, you write:
The package requires a valid API key from the Flickr development page. The first time you call a function from the package, you will be prompted to create and enter your API key. The API key will then be saved as a .txt file and be called when using any other function.
Do you have an example of how the user will be prompted to create and enter an API key? I've installed your package and tried out photo_search()
but wasn't prompted to create a key. I went ahead and created an API key from the Flickr development page, but as a naïve user, I'm not sure where to save the key.
@monicagerber @zkamvar reminder your review is due on Monday (Monica, I do see you've got started, thanks)
@monicagerber when you use the photo_search
function (or any of the other packages functions) you should be prompted to paste or type your API key into the console. All functions check whether an API key exists as a a file called api_key.txt
and if not it should prompt you to enter it, whilst opening the Flickr development page to facilitate requesting an API key if you do not have one. It is then saved to be called upon for future functions. If an API key has been entered but is not a valid API key it will warn you that the API key is not valid and stop the function. Users are also able to save their API key directly as a file api_tey.txt
. Let me know if you have continued issues with this and I will look into why it has not worked for you. Kindest regards.
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).Estimated hours spent reviewing: 4
My strategy for reading through this was to do the following:
I have installed the repository from the most recent commit, which was here:
repo <- "nfox29/photosearcher"
commit <- "@93895459fbde7a50b87ce0717a75e383ffc245dd"
remotes::install_github(paste0(repo, commit), dep = TRUE)
Overall, I think this is a good start, but it still needs some polishing. The
main function of this package seems to be photo_search()
, which works well
once you have your API key set up (difficulty: not very). I was able to search
for snakes in the UK and got them in abundance, so the package works with that
expectation.
Based on my walkthrough of the documentation I think the entrypoints for the API make sense, but the output could use significant improvement.
For example, the output of photo_search()
has 57 columns, but the
documentation does not state exactly what they represent. I could tell that 30
of these columns were dedicated to different iterations of width, height, and
URL for the results with cryptic suffixes. Moreover (as mentioned later), all
of the columns were encoded as character despite the fact that there were
integer, numeric, date, and boolean columns.
Moreover, the get_photoinfo()
returns similar columns, but the dateuploaded
column is not formatted as the date columns of photo_search()
(Y-m-d H:M:S),
but as an integer formatted as a character. The only other function that returns
a data frame user_info()
converts characters to factors.
The other functions all return character vectors with the exception of
download_images()
, which returns a data frame.
My recommendations for improvement:
photo_search()
photo_search()
, user_info()
, and get_photoinfo()
before returning it to the userdownload_images()
and throw
an error if the user doesn't supply one (this is one of the CRAN policies).Stray observations:
photo_search()
have a cascading indentation
pattern and will fill the user's console with verbose data frame output.
Moreover, they might benefit from some comments describing what they are
doingget_exif()
and get_photoinfo()
do not workapi_key.txt
is a bit ambiguous a name for the key. I'm wondering if it
would be better to use the package name and make it a hidden file along the
lines of .httr_oauth
(e.g. .photosearcher_key
)vcr
package can
help with the testing coverage. The example in the README does not work out of the box:
library(photosearcher)
rock_climbing <- photo_search(
mindate = "2010-01-01",
maxdate = "2018-01-01",
text = "rock climbing",
bbox = "-12.875977,49.210420,2.636719,59.977005",
has_geo = TRUE
)
#> Error in photo_search(mindate = "2010-01-01", maxdate = "2018-01-01", : argument 1 matches multiple formal arguments
Created on 2019-08-28 by the reprex package (v0.3.0)
I found that fixing mindate
and maxdate
to have _taken
suffix, addressed
the problem. When I ran the command with the suffixes, I was prompted to find my
API key from flickr. I did not have one, so I used a throwaway account to
generate one. I copy+pasted the key (not the secret) into to console as it
prompted me to do so and the search for eight years of rock climbers in the UK
and Ireland produced fruitful results, taking only a few seconds to search.
The example of plotting the geographic coordinates of all the rock climbers in that eight year period was helpful and it worked as expected.
I wanted to see if I could show my spouse that yes, snakes did exist in the UK despite the fact that we haven't seen one while hiking in the last year and was able to produce a map of all photos tagged with "snake" in the last year.
When I tried to color the points by number of comments, I got a categorical variable instead of a continuous because all the columns were encoded as character. I found that this affected the SF object because it could not interpret the character coordinates as a bounding box for the print method:
Simple feature collection with 148 features and 55 fields
geometry type: POINT
dimension: XY
bbox: Error in signif(attr(x, "bbox"), options("digits")$digits) :
non-numeric argument to mathematical function
Things worked well after I used the following pipeline to convert relevant columns:
library('dplyr')
snakes_clean <- snakes %>%
mutate_at(vars(starts_with("count")), as.integer) %>%
mutate_at(vars(ends_with("itude")), as.numeric) %>%
mutate_at(vars(starts_with("height"), starts_with("width")), as.numeric) %>%
mutate_at(vars(starts_with("date")), readr::parse_datetime)
The code from the vingette did not work out of the box for the following issues:
_taken
suffixtags = "lake"
related_terms()
was renamed to related_tags()
I did not attempt the search by layer because I did not have an SF layer on hand.
I would recommend to make the layer you have in the tests available to the users
through something like system.file()
so that they have a chance to run it if
they don't have a convenient layer to work through the example.
I think the download_images()
demonstrataion needs to be modified a bit. If
the user naively walks through this and copy+pasting code, they will end up
downloading over a gigabyte of photos to a directory called "Fox images"
situated in their current working directory.
I would suggest the following change:
# Downloading the first ten fox images from photosearch
temp_fox_dir <- dir.create(tempdir(), "foxes")
download_images(photo_id = foxes$id, saveDir = temp_fox_dir)
# Download a specific images based off its id
download_images(photo_id = 47416147181, saveDir = temp_fox_dir)
# View the images in the directory
if (interactive()) browseURL(temp_fox_dir)
Important: The download_images()
function explicitly creates a directory
in the user's current working directory called "downloaded_images" or simply
attempts to write into this directory if this exists. It would be better to
throw an error if the user doesn't specify a directory.
The related_tags()
function works as expected
The example for get_photoinfo()
appears to be duplicated.
thanks for your review @zkamvar !
Hi @nfox29 and @sckott. I am nearly done with the review for the package but need until tomorrow to finish up. I got caught up with the Flickr API key. I realized that the issue was that the in the example in the README, the argument names should be mindate_taken
and maxdate_taken
instead of mindate
and maxdate
. The API key functionality works for me.
thanks for the update
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).Estimated hours spent reviewing: 7
A statement of need
The package also supports a number of other Flickr API call methods including the flickr.tags.getRelated and flickr.places.tagsForPlace methods.
Vignettes
Main vignette:
mindate
and maxdate
arguments do not work.download_images(photo_id = 47416147181)
: Not able to download.user <- user_info(user_id = "155421853@N05")
: Not a great example because the hometown is empty.Article:
Function documentation
interesting_list()
(not on package website and documentation seems incomplete), findPlaces()
, and location_tags()
. related_terms()
is not exported, but related_tags()
is. I think the related_tags()
name is more informative, but can you give more info on how relation is determined? Is it relation to tags or other text associated with image?@return
for photo_search()
function and others. Link to data dictionary for data.frames with @seealso
.data.frame. Output consists of 57 variables including; latitude and longitude of photograph, date and time it was taken, associated tags and image urls.
A data.frame with 57 columns including latitude and longitude of the photograph, the date and time it was taking, and the associated tags and image urls.
exif()
: Can you link to more information about what exif means? How is this different than get_photoinfo()
?Examples
download_images()
:
find_place()
, location_tags()
: unable to test because Flickr locations services are downget_exif()
, get_photoinfo()
: Should use real photo_idphoto_search()
: code style/indentation could be improvedCommunity guidelines
Functionality
Performance
Packaging guidelines
Other comments
photo_search()
:
@seealso
Roxygen tag.text
argument with Boolean logic? For example, how would you do a search for "rock climbing" but not include results with "trad"? Or "rock climbing" AND "sport"? tags
: Be more specific, do you mean a character vector? find_place()
and woe_id()
are related functions, could add link to them where you mention them under woe_id()
I think you can do this with \code{\link{functioname}}
, see chapter 5 in R packages book.download_images()
:
photo_id
: Could be more specific about class. A numeric vector? A character vector also works.thanks very much for your review @monicagerber !
@monicagerber @zkamvar thank you both for your reviews, they are extremely helpful. Apologises for the errors in the examples that was an oversight on my behalf. I will now work through your suggested changes and will reply once I have finished. Kindest regards, Nathan.
Hi @sckott @monicagerber @zkamvar, again thank you all for your comments.
I have addressed all of the comments you made but the main highlights of things I have changed are:
photo_search
and other functions now parse the outputs correctly
photo_search
now has the option for tags_any
which can be used to select whether a user wants photos that contain any given tag, or all given tags
All documentation and vignette examples should now run
download_images
now has the option for users to overwrite files with the same name or skip these files: output update to inform users which files are new or replaced
download_images
no longer has a default directory
test coverage is now at 85% (lines not covered related to an on going issue with the Flickr location services, if they decide to discontinue their unique location services these functions will be removed)
Documentations has been re-worked (though @monicagerber if you believe it still needs work please do not hesitate to say)
The only comment that I haven't addressed is
I think it would be helpful to rename the functions in this package using the object_verb() naming scheme (see section 1.4: Function and argument naming).
Though I agree the object_verb() naming is a good convention, here I have elected to keep the same naming style as the Flickr API. I believe that this will facilitate the use for both current and new users of the API.
Please let me know of any additional changes I need to make or any extra steps I need to complete for the submission.
Kindest regards, Nathan
thanks very much for your response @nfox29
reviewers, @monicagerber and @zkamvar - are you happy with the maintainers response? Any other questions/suggestions?
reviewers, @monicagerber and @zkamvar - are you happy with the maintainers response? Any other questions/suggestions?
I am happy with the response. I think @nfox29 did a thorough job addressing our comments and it has improved the package experience.
I only have a couple of suggestions/questions for the documentation/behavior of photo_search()
:
photosearcher_key.sysdata
in their working directory, nor that it will be explicitly written to the directory when they enter their api key. It would be good to have this documented in the photo_searcher()
function at the least. thanks @zkamvar !!
@monicagerber you happy or further comments?
Hi @zkamvar,
I have addressed all of your additional comments:
@sckott @monicagerber is there any additional changes I need to make. I am currently in the process of finishing my manuscript for MEE and would like to send the manuscript off after the rOpenSci process has been completed.
Kindest regards, Nathan
LGTM. Thank you for addressing those, @nfox29!
Thanks @zkamvar - I'll take one last quick look ...
Approved (with some caveats, see below)! Thanks @nfox29 for submitting and @zkamvar & @monicagerber for your reviews!
caveats: on check I see:
checking Rd cross-references
, check is throwing warnings about missing links, are you not getting that?Files named as vignettes but with no recognized vignette engine
note about the vignettes/flickr_in_research.Rmd
file. shouldn't that have a vignette engine?To-dos:
CODE_OF_CONDUCT.md
file, see usethis::use_code_of_conduct
[![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)
"pkgdown
URL to point to https://docs.ropensci.org/package_name
and deactivate the automatic deployment you might have set up, since it will not be built centrally like for all rOpenSci packages, see http://devdevguide.netlify.com/#docsropensci. In addition, in your DESCRIPTION file, include the docs link in the URL
field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar (website) https://github.com/ropensci/foobar
DESCRIPTION
via rodev::add_ro_desc()
.[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/pkgname?branch=master&svg=true)](https://ci.appveyor.com/project/individualaccount/pkgname)
.Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"
-type contributors in the Authors@R
field (with their consent). More info on this here.
We've put together an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.
Hi @sckott,
I have fixed the caveats and am working through the on boarding process. I am however unable to transfer the repo to rOpenSci's "ropensci" GitHub organization. I have attempted to transfer it under settings but i get the warning
you don’t have the permission to create public repositories on ropensci
Please can you advise, Kindest regards, Nathan
@nfox29 you should get an email invitation to a github team, once you accept that you should be able to do so
Submitting Author: Name (@nfox29)
Repository: nfox29/photosearcher Version submitted: 1.0 Editor: @sckott
Reviewer 1: @monicagerber
Reviewer 2: @zkamvar Archive: TBD
Version accepted: TBD
Scope
Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
Explain how and why the package falls under these categories (briefly, 1-2 sentences):
The package provides functions that make calls to the Flickr API to return metadata (including georeferenced location) for photographs in a reproducible manner.
The package is aimed to facilitate reproducible code for scientific research that utilises the Flickr API. Currently, a wide range of scientific disciplines utilise the Flickr API including environmental sciences, biological sciences and medicine. Flickrs georeferenced images have a wide range of scientific applications including ecosystem service assessments, resource management decisions and mapping species distributions.
We are aware of one other package that makes calls to the Flickr API. However this package only allows user to search for a single image based on its ID number. Our package provides users with the ability to search for photographs based on a user defined criteria including, date of the photograph being taken, tags, text and spatial location. Furthermore our package provides additional functionality including the downloading of images, retrieving social data about users as well as get tags related to a specific location, or other tags.
Technical checks
Confirm each of the following by checking the box. This package:
Publication options
JOSS Options
- [ ] The package has an **obvious research application** according to [JOSS's definition](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements). - [ ] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`. - [ ] The package is deposited in a long-term repository with the DOI: - (*Do not submit your package separately to JOSS*)MEE Options
- [x] The package is novel and will be of interest to the broad readership of the journal. - [x] The manuscript describing the package is no longer than 3000 words. - [x] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)Code of conduct