Closed cimentadaj closed 6 years ago
@cimentadaj thanks for your submission! I'm doing the editor checks right now and... the tests take a very long time, right?
Thanks for your submission @cimentadaj! :smile_cat: The tests are now finished. ๐ I'm currently looking for a second reviewer (thanks @leeper for accepting to review!).
Please add this badge to ess
README
[![](https://badges.ropensci.org/201_status.svg)](https://github.com/ropensci/onboarding/issues/201)
Here are a few comments/questions.
Just curious, why is there a note about the package name in DESCRIPTION, was that demanded by CRAN?
The link from the pkgdown website is missing from the repo description.
goodpractice::gp
indicates a few long code lines. It seems that they are error messages so it is fine. You can escape them with # nolint cf https://github.com/jimhester/lintr#project-configuration
Please document the fact that one needs to set "your_email" as an environment variable. I moreover think its name is a bit confusing, it is my email :wink: maybe it should be "ess_account_email"?
I'd recommend you to look for the email by default like what is done in this package
Reviewers: @leeper @nujcharee Due date: 2018-03-29
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).For packages co-submitting to JOSS
- [ ] The package has an obvious research application according to JOSS's definition
The package contains a
paper.md
matching JOSS's requirements with:
- [ ] A short summary describing the high-level functionality of the software
- [ ] Authors: A list of authors with their affiliations
- [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
- [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).
This package is a very easy-to-use tool for download ESS data. It will be of great benefit to ESS users in both a teaching and research context. In the latter case, it will no doubt help considerably with reproducibility of analyses using the ESS. The code is generally well-written and the documentation is mostly excellent. Both the README and vignette provide useful introductions to the package and explain all functionality.
I have flagged seven issues I would like to see responses to:
These are all suggestions for improvement rather than obvious bugs or problems with the package. The only other suggestion I would have is that this package may significantly increase interest in the ESS among R users and therefore it may be valuable to expand the examples in the README or add an additional vignette that showcases some of the really interesting substantive data available in the datasets. It would be great to see a few tables or graphs showing some interesting patterns. Not necessary, of course, but that might help sell R users on the ESS in the way that the package is already helping sell ESS users on R.
Great package.
Wow thanks a lot for your extremely rapid and good review @leeper ๐ฎ ๐
Could you please fill and add the review template to your comment? Thanks in advance.
Just a suggestion as a reaction to @leeper's excellent suggestion to add examples, I tend to think minimal READMEs make more sense with a pkgdown website (especially if the README has links to articles/vignettes), so adding a vignette or several ones might be better than expanding the README.
@cimentadaj also note that I'll expect you to answer to/comment on @leeper's issues in this thread as well to make all information about your changes available here as well. ๐บ
Hi everyone, thanks very much for the comments, they're spot on. I am away from computers until Monday, where I'll review all of the suggestions. Thanks again!
On Fri, 9 Mar 2018 at 08:55, Maรซlle Salmon notifications@github.com wrote:
@cimentadaj https://github.com/cimentadaj also note that I'll expect you to answer to/comment on @leeper https://github.com/leeper's issues in this thread as well to make all information about your changes available here as well. ๐บ
โ You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ropensci/onboarding/issues/201#issuecomment-371752207, or mute the thread https://github.com/notifications/unsubscribe-auth/AHzRORuUIdba4xmEO91-c8EKcpwXvGDpks5tckNxgaJpZM4SbhTx .
--
Jorge Cimentada https://cimentadaj.github.io/ https://cimentadaj.github.io/
Thanks for agreeing to review this package @nujcharee! :smile_cat: Your review is due on 2018-03-30.
As a reminder here are links to the recently updated reviewing and packaging guides and to the review template.
@cimentadaj you can wait for @nujcharee's review before responding if you wish.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).For packages co-submitting to JOSS
- [ ] The package has an obvious research application according to JOSS's definition
The package contains a
paper.md
matching JOSS's requirements with:
- [ ] A short summary describing the high-level functionality of the software
- [ ] Authors: A list of authors with their affiliations
- [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
- [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).
Estimated hours spent reviewing: 6
The ESS package provides programmatic access to European Social Survey published on http://www.europeansocialsurvey.org/downloadwizard/. Users can directly download the surveys from the website in 3 different formats: SPSS, STATA and CSV. The ESS package is compact and easy to use with 2 main family functions: show* and ess*. It also offers functions that handle missing values.
There are a few functions that use external packages such as library(rvest), library(xml2)
. At the installation these packages weren't installed as they were already installed on my local machine.
The vignettes gives an introduction of what ESS is and contains a link to the website. It also demonstrates common usage and are easy to follow.
I installed and ran it successfully on Windows 7 (R-3.4.2) and Mac OS X (R-3.3.2) I used test_package("ess") and received the following message OK: 38 SKIPPED: 23 FAILED: 0
library(tidyverse)
one_round <-
ess_rounds(1, "your_email@email.com") %>%
recode_missings()
Great package, I can see the potential for it to become very popular among social science professionals.
@maelle I hope this is useful. Let me know if I can assist you further :)
Thanks a lot @nujcharee for your great review! ๐บ ๐
@leeper could you please add the review template to yours? ๐
@cimentadaj both reviews are now in! ๐
@maelle done
Thanks @leeper!
I have no background in social sciences so have no idea how ubiquitous the acronym ESS for European Social Survey is, but in the R / S community there is long precedent for ESS standing for Emacs Speaks Statistics, dating back to at least 1994, before R existed. Emacs Speaks Statistics is also quite popular for emacs R users, so this is not a niche issue.
I am actually astonished that this package was allowed on CRAN with this name and would really encourage considering changing the name to something more descriptive; maybe essurvey
would suffice, you could retain the ess_
shorthand for the internal functions.
Everyone, thanks for the suggestions/comments. I will go one by one. @maelle
This is because it has conflicts with Emacs Speaks Statistics and the Emacs authors suggested that I add it as a note and they will add it as well to their R package. In fact, the point raised by @jimhester has been raised by several other R users in this email thread. Among social scientists the ESS acronym is widely known and would be the first thing that comes to mind, yet not many of these scientists are R users, whereas Emacs Speaks Statistics is widely used/known in the R community. I've decided that it is best for the package to be renamed and rereleased in the future. In fact, I was thinking of using the same name @jimhester suggested, essurvey
. I don't know how this affects the rOpensci submission, let me know if something needs to be changed. I will make the name change effective when I have time, somewhere in the next few weeks.
Done ๐
Done ๐
Please document the fact that one needs to set "your_email" as an environment variable. I moreover think its name is a bit confusing, it is my email ๐ maybe it should be "ess_account_email"? I'd recommend you to look for the email by default like what is done in this package
Currently your_email
is not an environment variable but purely an argument. Following the comments of @leeper I think a better approach would be set it as environment/option variables but also allow the email as an argument (for backward dependency), encouraging the user not to use the email as an argument; this would be explained in detail in the vignette, which might cause confusion for ESS users not familiar with R. The approach would be similar but slightly different from the package you suggested.
Thanks @leeper! I will review your suggestions in the coming days. I think that making a more 'practical' vignette is also a good idea. I will work on it.
@nujcharee thank you very much for your comments! I agree with your suggestions and will implement them soon. In fact, I think you also raised some common points that overlap with what @leeper suggested. Thanks!
EDIT: I will post all of @leeper 's answers here:
Issue https://github.com/cimentadaj/ess/issues/20 asked for package documentation. Closed it with https://github.com/cimentadaj/ess/commit/ed0df5c4856a25233f62f1f37221f170bb7cc2be
Issue https://github.com/cimentadaj/ess/issues/21 suggested that the package should have a better documentation for show_rounds_country()
. The function accepts a numeric vector of rounds (currently from 1:8
) and returns the countries that participated in all of these rounds (that is, AND
rather than OR
). The issue suggested either improving documentation to make this clear or adding a separate function for AND
and for OR
.
See https://github.com/cimentadaj/ess/commit/76007c5ac35161629a3eefed4a941a9a726dee0d where I improved documentation as to make it clear with some examples. Adding a OR
type of function could be handy, and something that I will add for next releases yet I don't think it's urgent. Users can find a work around around along theses lines purrr::map(ess::show_rounds(), ess::show_rounds_country)
.
Issue https://github.com/cimentadaj/ess/issues/22 suggests to break the ess_*
functions into two: an import_*
and download_*
function. This is the case because all ess_*
functions have the functionality to import data into R or download data into a directory without returning anything in R. This change makes a lot of sense and I think is the right way to approach this. In fact, the implementation should be straight forward. However, I'm concerned about backward compatibility and the fact that I plan to change the name of the package to essurvey
in the near future.
My main concern is having both the breaking compatibility change in the ess
package and then changing the name of the package in a short period of time. I would want to give some time to users to communicate the API changes and let them know that the functions will be deprecated without also swamping them with a new package name (let's remember that the users of this package are not necessarily R users but people who come to R for specific things). I've thought a lot about this and I think this could be one solution:
Release the package with all the changes from the rOpensci (including this change) review as the new package name essurvey
.
This new release will contain the import_*
and download_*
functions respectively and provide a deprecated message for all ess_*
functions. This specifies that the functions will be removed from the package altogether in the next release (3 months, more or less)
Publicising the package
Remove ess_*
functions 3-4 months later.
I'm really hesitant on this step so I'd like to hear your comments. Perhaps it's not the best idea to apply these changes in the first place because the breaking changes are not extremely necessary.
I'd really like to hear your feedback on this particular point.
https://github.com/cimentadaj/ess/issues/23 gave a very reasonable suggestion to set the your_email
argument as an environment variable rather than using it in every ess_*
function call. I think this is very useful feedback! I implemented this in https://github.com/cimentadaj/ess/commit/11fcf116d1f821dcb2aeef2ed154e67cf2c32aa0
A summary of the changes:
set_email
rather than showing the email.your_email
argument name to ess_email
in all ess_*
functions.All tests pass, so it should be good to do. (Again, very sorry that the commits seem to rewrite whole documents. Something weird is happening when making changes using the Ubuntu terminal in Windows 10.)
Issue https://github.com/cimentadaj/ess/issues/24 suggested that every request made to the ESS website should be made over HTTPS rather than HTTP for security reasons.
See https://github.com/cimentadaj/ess/commit/57656ffdcbac6e156752602009210238e7e9ea4f for the implementation.
Good suggestion, I had no idea of the difference between HTTPS and HTTP. Let me know if this is fine, to close the issue.
Issue https://github.com/cimentadaj/ess/issues/25 was about using match.arg
to match the format
argument in all of the ess_
functions instead of creating custom checks.
I hadn't thought about it! Thanks. I'm applying it to the case of matching the format
argument through out the package. However, in the other cases prefer to have a separate stop
message, like here. This is because it checks whether the supplied country name is available in the ESS website and instead of match.arg
returning all the country names (long) I stop
with a clear message suggesting the user to check show_countries()
for the correct country name. Applies to rounds (and themes), with an even more explicit error.
There are other instances where it make sense such as here but it doesn't work as expect because it needs to match every code and throw an error if it doesn't. For example match.arg(c("123", "Refusal"), c("Don't Know", "Refusal"), TRUE)
only returns Refusal
but doesn't raise an error. So in these other cases I also think the error message is more strict. See https://github.com/cimentadaj/ess/commit/0336d37ae09003f25c60ea769a9fc685d67f03e7. Let me know if it addresses your concern so I close that issue.
Issue https://github.com/cimentadaj/ess/issues/26 suggested to add documentation about using weights for analyzing the ESS data. More specifically it advised either writing a second vignette showing examples or warn the user that it should use weights in the README, vignette and package documentation. I closed this via https://github.com/cimentadaj/ess/commit/d83f728a2c63a381edf3d7a4bc415e234fc87594 by including comments and notes on the weights on the three sources of documentation. Implementing a second vigenette, together with more general examples using the weighted ESS data is something I have on my to-do list. In fact, I haven't created these examples because I'm currently working on developing a package to analyze the ESS. Once I have that done I'll create more vignettes for both packages.
PS: Sorry for the whole document rewrite in the commit. I'm getting these weird bugs using the Ubuntu terminal in Windows 10.
In terms of the particulars of renaming the package, I think the best way to do it is to submit the package to CRAN with the new name, and in your comments to CRAN in the submission say you want to rename the package and archive the old name along with the reasons for doing so.
@cimentadaj changing names doesn't impact the submission. I know that because my Ropenaq
package was renamed ropenaq
during its onboarding ๐ (I didn't have it on CRAN but you have @jimhester's guidance above, by the way thanks Jim for chiming in!)
@cimentadaj great job! ๐ Just a note about the process, it'd be easier for future readers of the thread to group your answers and to write here what the issue was about so that one doesn't need to click. Moreover if you want to link a commit to an issue in the commit message you can write something like "blabla closes #" with the issue number (or just "see" instead of "close" if you don't want to close it yet)
Thanks @maelle ! I just edited the messages to group all of my answers in one single comment. I will update that comment as I fix the issues.
Here I will respond to @nujcharee's suggestions.
In https://github.com/cimentadaj/ess/issues/27, @nujcharee suggested adding library(tidyverse)
to the README
. Implemented in https://github.com/cimentadaj/ess/commit/25628bc99ff679f27ce428af105d425fd60e277e.
Some other suggestions.
In my opinion, for those which aren't familiar with Social Survey, it might be really beneficial from a brief description of what the term "round" mean and how this is used in the context of social survey. Thank you for this! Implemented it in https://github.com/cimentadaj/ess/commit/f1c730069b3f9ecd63ee56c0f735af9c7f7f1441
Opportunities to apply global variables for "email address": applied in https://github.com/cimentadaj/ess/commit/11fcf116d1f821dcb2aeef2ed154e67cf2c32aa0 from https://github.com/cimentadaj/ess/issues/23
Opportunities to merge functions "ess_all_rounds()" and "ess_rounds()" where "all" is default if country variables aren't passed. I'm not sure I get this @nujcharee. ess_rounds
returns the complete data for all countries in the specified round. ess_all_rounds
is just a wrapper around ess_rounds
that downloads all rounds for all countries.
Making the country_name argument case insensitive is a potential improvement. This is indeed a potential improvement. I'll put it on as an issue for the next release in https://github.com/cimentadaj/ess/issues/28
@maelle, my answer's to @leeper and @nujcharee are in.
I'm happy with all of the responses to my comments.
@nujcharee are you happy with @cimentadaj's answer?
@maelle @cimentadaj I am also happy yes ;)
I think in terms of the function โallโ I think personally that โallโ function can be obselete if it can be embedded in ess_round function when countries arenโt specified.
I made a suggestion as a recommendation. Looking forward to seeing this on CRAN soon good luck.
Thanks a lot @nujcharee! @maelle, if the package gets accepted we need to sort out the issue with name of the package. Right now the name of the package is ess
both in CRAN and on Github. During this week I'm going to finish all of the changes from the review and change the Github package name to essurvey
and resubmit to CRAN in the next weeks. Perhaps the package can be essurvey
in rOpensci while I resubmit it to CRAN with that name.
Thanks @nujcharee and thanks @cimentadaj!
I still need to perform the last editor checks (should lead to minor comments only). Maybe it's best to have waited for you to have finished all review changes before I have a last look anyway, so ping me when it's done if that's ok with you?
We could transfer even before the name change, I think redirections from the old account/reponame would still work if it is newaccount/newreponame. But I haven't checked that yet.
Ok @maelle , the changes are in and the package has be renamed. I switched it to essurvey
on Github and everything on the package has been renamed to essurvey
over ess
. All tests pass so I'm just waiting for the badge update. I will submit to CRAN once this review process is over to make the process much smoother.
Approved! Great work @cimentadaj and great reviewer work @leeper & @nujcharee ! A few last comments/questions from me
Have you checked whether it's ok to use the ESS logo?
Only a suggestion, you could choose to add the reviewers to DESCRIPTION as "rev", with their permission. See this recent blog post
the install.packages
instruction is currently wrong, maybe you should remove it and add it back when the package is on CRAN?
Add the pkgdown link to the repo URL field and also to DESCRIPTION (after a comma)
Now here is the list of things you have to do before I close this issue ๐
[![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)
Welcome aboard! We'd also love a blog post about your package, either a short-form intro to it (https://ropensci.org/tech-notes/) or long-form post with more narrative about its development. ((https://ropensci.org/blog/). If you are, @stefaniebutland will be in touch about content and timing.
And thanks again @jimhester for chiming in! ๐
@maelle thank you very much for the review. Thanks as well to@leeper and @nujcharee for some great reviews. essurvey
benefited a lot from your comments.
Answer's to @maelle's comments below:
Have you checked whether it's ok to use the ESS logo? Hmm, I'm not entirely sure but the communications office of the ESS has been retweeting the package constantly in their official account, so I'm guessing there shouldn't be a problem. Here's one example but there's plenty more.
Only a suggestion, you could choose to add the reviewers to DESCRIPTION as "rev", with their permission. Done! @leeper and @nujcharee would you like to be listed as reviewers? @nujcharee is your complete name Nujcharee Haswell?
the install.packages instruction is currently wrong, maybe you should remove it and add it back when the package is on CRAN? Yeah, I forgot, thanks!
Add the pkgdown link to the repo URL field and also to DESCRIPTION (after a comma) I added the pkg website to DESCRIPTION for I can't seem to find a URL field in the repository. What do you mean exactly?
@maelle, finalized all changes to close the issue. You can check out the repo at https://github.com/ropensci/essurvey
Thank you all for this experience! I learned a lot from it and I'd love to write an extended blog post on the review process and how to use the package @stefaniebutland.
@cimentadaj you couldn't see the repo description "edit" button because you were not an admin yet. Now I've made you admin, and I've added the pkgdown URL in the description at the same time, hope that's ok.
@maelle yes, that's fine! Thanks.
@cimentadaj Hi there yes thats my name Nujcharee Haswell. :)
@leeper would you also like to be listed as a reviewer in the CRAN submission?
Glad to hear you're interested it writing a post @cimentadaj. Here are some technical and editorial guidelines: https://github.com/ropensci/roweb2#contributing-a-blog-post
We ask that you submit a draft via pull request at least one week before the planned publication date. At this point I have posts scheduled through April so perhaps best for you to suggest a deadline by which you would like to submit a draft.
Thanks @stefaniebutland! I could send a pull request the somewhere between the 20th and 30th of April for review where the post would land perfectly for sometime in May. Let me know if this is alright.
Sounds good. I will ping you around April 17th to make sure your draft is still on track for submission between 20th and 30th.
Closing this now that the package has been onboarded. @leeper can still answer about being listed as "rev" in DESCRIPTION โบ
Hi @leeper, just a reminder, would you like to be listed a reviewer when I submit the package to CRAN?
Sure. Happy to be.
Summary
It allows users to download and explore data from the European Social Survey directly into R. Broadly speaking, it has two family of functions. The
ess_*
family, which downloads data from specific waves and countries and theshow_*
functions which allows the users to check which waves/countries are available.URL for the package (the development repository, not a stylized html page): https://github.com/cimentadaj/ess
Please indicate which category or categories from our package fit policies this package falls under *and why(? (e.g., data retrieval, reproducibility. If you are unsure, we suggest you make a pre-submission inquiry.):
[e.g., "data extraction, because the package parses a scientific data file format"]
data retrieval, because the main objective of the package is to download a set of datasets widely used in the social sciences.
Mainly academics working in disciplines such as Economics, Demography, Psychology, Political Science and Sociologists. Hundreds of papers have been written using the European Social Survey, a survey carried over in over 30 countries for over 15 years that asks citizens everything from their health status to their political preferences. This package will allow full reproducibility for many of the academics using the survey and equally important, will remove the burden of manually exploring/downloading the datasets from the ESS website.
The lodown package (not on CRAN) has one function for downloading the data as 'Stata' or 'SPSS' files for only specific waves. The package is not intended specifically for the ESS data. This package differs because it allows the user to download the files directly into R selecting different wave/country combinations. It also supports downloading the files in 'Stata', 'SPSS' and in 'SAS' and offers a range of functions for exploring the available countries/waves directly from R.
Requirements
Confirm each of the following by checking the box. This package:
Publication options
paper.md
matching JOSS's requirements with a high-level description in the package root or ininst/
.Detail
[X] Does
R CMD check
(ordevtools::check()
) succeed? Paste and describe any errors or warnings: It succeeds with no errors/warnings.[x] Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:
It conforms.
If this is a resubmission following rejection, please explain the change in circumstances:
If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:
I don't know anyone who has experience with this type of packages but I assume anyone with relevant experience in web scraping is adequate. The workhorse of the package is scraping the ESS website, reading it with the
haven
package and putting the data in a suitable format for R.