Closed: whedon closed this issue 2 years ago.
Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @kamapu, @tom-gu, @DrMattG it looks like you're currently assigned to review this paper :tada:.
:warning: JOSS reduced service mode :warning:
Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.
:star: Important :star:
If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository, which means that, with GitHub's default behaviour, you will receive notifications (emails) for all reviews 😿
To fix this, do the following two things:
For a list of things I can do to help you, just type:
@whedon commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@whedon generate pdf
Wordcount for paper.md is 1478.
Software report (experimental):
```
github.com/AlDanial/cloc v 1.88  T=0.05 s (357.9 files/s, 141975.3 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Rmd                              2            271            520           2140
R                                9            175           1233           1355
TeX                              3             31              0            522
Markdown                         2            100              0            312
YAML                             1             15              1             68
-------------------------------------------------------------------------------
SUM:                            17            592           1754           4397
-------------------------------------------------------------------------------
```
```
Statistical information for the repository 'bd838b7115269e827ea9c7cb' was
gathered on 2021/09/29.
The following historical commit information, by author, was found:

Author          Commits    Insertions    Deletions    % of changes
jessicatytam          2            19           19          100.00

Below are the number of rows from each author that have survived and are still
intact in the current revision:

Author          Rows    Stability    Age    % in comments
```
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1073/pnas.0507655102 is OK
- 10.1111/tbed.12221 is OK
- 10.1111/mam.12066 is OK
- 10.1007/s11192-015-1798-9 is OK
- 10.1007/s11192-007-1859-9 is OK
- 10.1371/journal.pone.0131004 is OK
- 10.1111/mam.12038 is OK
- 10.1038/s41598-017-09084-6 is OK
- 10.1139/facets-2016-0011 is OK
- 10.1007/s11192-006-0147-4 is OK
- 10.1017/S1367943004001799 is OK
- 10.1371/journal.pone.0189577 is OK
- 10.1007/s11192-006-0146-5 is OK
- 10.1016/S0169-5347(01)02381-3 is OK
- 10.1038/s41467-018-07916-1 is OK
- 10.1111/acv.12586 is OK
- 10.1038/s41477-021-00912-2 is OK
- 10.1016/j.gecco.2018.e00423 is OK
- 10.1007/s11160-019-09556-0 is OK
- 10.1111/j.1523-1739.2006.00616.x is OK
- 10.1111/j.1523-1739.2010.01453.x is OK
- 10.1108/07378830610715473 is OK
- 10.1093/eurheartj/ehx446 is OK
- 10.1108/14684521211209581 is OK
MISSING DOIs
- None
INVALID DOIs
- None
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
:wave::wave::wave::wave: @jessicatytam @kamapu @tom-gu @DrMattG, this is the review thread for the paper. All of our communications will happen here from now on.
All reviewers have checklists at the top of this thread with the JOSS requirements. As you go over the submission, please check any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines.
The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention openjournals/joss-reviews#3776 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.
We aim for reviews to be completed within about 2-4 weeks. Please let me know if any of you require some more time. We can also use Whedon (our bot) to set automatic reminders if you know you'll be away for a known period of time.
Please feel free to ping me (@Bisaloo) if you have any questions/concerns.
:wave: @tom-gu, please update us on how your review is going (this is an automated reminder).
:wave: @DrMattG, please update us on how your review is going (this is an automated reminder).
:wave: @kamapu, please update us on how your review is going (this is an automated reminder).
Thanks for your comments, @DrMattG!
:wave: @kamapu, @tom-gu, could you please give us an update on your progress for the review of this package & paper? Please let me know if you encounter any issues.
👋 @Bisaloo, I'll submit my review by tomorrow. Sorry for the delay, it's due to too many unexpected events :sweat_smile:
Shame on me! Though too late, I'll provide my comments. Unfortunately, I don't currently have access to a copy of the manuscript.
@whedon generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@kamapu, you should be able to access the manuscript via the links in the previous comment :slightly_smiling_face:
Thanks! The robot is closing my issues in the specieshindex repo. Either I'm doing something wrong, or it's because I'm too late.
You need to manually open issues in https://github.com/jessicatytam/specieshindex - do not use the GitHub "convert to issue" feature, as it opens issues in the wrong repository.
@jessicatytam @wcornwell @Bisaloo Here is my full review, so sorry for the delay!
The `specieshindex` package shows great potential to promote scientific merit. It is a delightfully user-friendly package with superb documentation. Nonetheless, I identified several areas that can be improved. I did not open specific GitHub issues since they are more systemic and should be addressed as the package matures. Additionally, I have a few minor suggestions and some ideas for future development.
I saw that you recently added several tests and CI (via GitHub Actions). It is great to know that you are adopting a more test-driven development (TDD) approach. Pivotal tests for this particular package are smoke tests that monitor the API connection and validate a proper output structure. As you know by now, constructing this for Scopus is easier than for WOS and BASE. However, please refer to this section to learn how to manage secrets (e.g., API keys) within the package.
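A minimal smoke-test sketch of that idea (the environment variable name is an assumption of mine, and the `CountSpT()` call is only an example):

```r
library(testthat)

test_that("Scopus responds and the count is well-formed", {
  # Skip when no key is available, e.g. on CI runs from forks
  # where repository secrets are not exposed.
  skip_if(Sys.getenv("SCOPUS_API_KEY") == "", "no Scopus API key available")
  skip_on_cran()  # never hit the live API on CRAN

  n <- CountSpT(db = "scopus", genus = "Bettongia", species = "penicillata")
  expect_true(is.numeric(n) && n >= 0)
})
```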
The level of duplication (code & function documentation) is off the charts. It partially stems from a segregated function strategy and from not fully utilizing sub-functions (internal, non-exported functions). This is problematic from the point of view of a package developer, since even the smallest changes can become a real hassle. When the small things become exhausting, the big things become impossible! I'm advocating for the consolidation of similar functions by adding a parameter or two. This might slightly affect the user's experience, but it will greatly serve package maintenance and improve the overall structure. If you decide to make the changes, please avoid putting all the functions in the same R file.
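To make the idea concrete, here is a rough sketch; the function and helper names are mine, not the package's:

```r
# One exported, user-facing function...
count_records <- function(genus, species = NULL,
                          field = c("title", "abstract", "keywords"),
                          db = c("scopus", "wos", "base")) {
  field <- match.arg(field)
  db <- match.arg(db)
  submit_count(build_query(genus, species, field), db)
}

# ...backed by small internal helpers, so each fix lands in one place.
build_query <- function(genus, species, field) {
  term <- if (is.null(species)) genus else paste(genus, species)
  sprintf('%s("%s")', toupper(field), term)
}

submit_count <- function(query, db) {
  message("querying ", db, " with: ", query)  # placeholder for the HTTP call
  NA_integer_
}

count_records("Bettongia", "penicillata", field = "title", db = "scopus")
```

The point is that the query-building and request logic each live in exactly one internal helper, instead of being repeated in every exported variant.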
Going forward, I encourage you to review your coding efficiency. A few examples:
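One generic illustration of the kind of pattern to look for (this is not code from `specieshindex`): growing an object inside a loop forces R to copy it on every iteration, while a vectorised call is shorter and faster:

```r
count_hits <- function(q) nchar(q)  # stand-in for an API call

queries <- c("TITLE(Bettongia)", "TITLE(Phascolarctos)")

# inefficient: `results` is copied and grown on every iteration
results <- c()
for (q in queries) results <- c(results, count_hits(q))

# preallocated, type-checked, and shorter
results <- vapply(queries, count_hits, integer(1))
```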
Communicating with your users via errors, warnings, and messages significantly improves their user experience and can help mitigate knowledge gaps. Currently, the package has no implementation of defensive programming; I encourage you to embrace it. Here you can find helpful information. Side note: implementing defensive programming is much harder when you have high levels of code duplication.
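Extending the hypothetical `count_records()` from the sketch above, basic input checks could look like this:

```r
count_records <- function(genus, species = NULL) {
  if (!is.character(genus) || length(genus) != 1L || !nzchar(genus)) {
    stop("`genus` must be a single, non-empty character string.", call. = FALSE)
  }
  if (!is.null(species) && (!is.character(species) || length(species) != 1L)) {
    stop("`species` must be NULL or a single character string.", call. = FALSE)
  }
  message("Querying Scopus for: ", paste(c(genus, species), collapse = " "))
  # ... the actual query code would go here ...
}
```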
Package maintenance is crucial for establishing software sustainability, and this aspect of package development is a demanding God. Adequately quantifying long-term maintenance is rather challenging, even more so for scientists. Connecting the dots, it seems that this package was mainly developed for the study of biases in mammalian research. If this is the case, I wonder whether the Cornwell Lab has a long-term maintenance plan with sufficient resources to support it. Please forgive me if the query seems intrusive; I'm only trying to share insights from my own experience and pain. It might be beneficial to initiate a collaboration with an organization or an additional PI that is going to rely heavily on this package. Perhaps @mjwestgate will have specific leads.
I would classify this package under the 'data scientists developing a package' scenario. It's clear that you completely understand your user, but you do not yet fully comprehend your development needs. To support your package development journey, I urge you to read the rOpenSci Packaging Guide and R Packages. In a way, now is the perfect time for that: you can now properly digest these comprehensive resources.
`goodpractice` output (see https://github.com/MangoTheCat/goodpractice): please run `goodpractice::gp()` or see this RStudio Project for the goodpractice output.
Why do you need all of this: https://github.com/jessicatytam/specieshindex/blob/9698952bf81d8d549b7c5bef3a4a557f342bf13d/README.rmd#L22:L31? It's somewhat redundant; why not just:

```r
install.packages("remotes")
remotes::install_github("jessicatytam/specieshindex")
```
Some code lines are too long (lines 76, 109, 144-147, and 150 of the proof: https://raw.githubusercontent.com/openjournals/joss-papers/joss.03776/joss.03776/10.21105.joss.03776.pdf).
Package contributors might be missing (compared to paper authors).
I see that both Jessica and Will are committing directly to the master branch. After the package release, it's better to have a dev branch and to merge code via PRs.
The overall level of package documentation is excellent, and creating a website via `pkgdown` will beautifully showcase it. Implementation is quite easy and totally worth it!
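Assuming usethis and pkgdown are installed, the setup is roughly:

```r
usethis::use_pkgdown()  # one-time setup: creates _pkgdown.yml, updates .Rbuildignore
pkgdown::build_site()   # renders the site locally into docs/
```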
Consider using `taxize` to transparently generate a list of synonyms.

Sharing as many insights as I can is not just my duty but also a privilege, as it makes my own hardships along the way even more worthwhile. I hope you will find this review helpful and not too preachy 😃
From one Biodiversity Jedi to another, may the foRce be with you! Tomer
Sorry for the late response. This is my first review of a package, and figuring out how to proceed has been challenging for me. Also, some "spontaneous" activities kept me busy, and I only just managed to finish my review.
While this package seems promising, also in the context of my own work, I see the need for a major revision of both the structure of the functions and the documentation.
I didn't open issues for this review. I would suggest that the authors, when responding, link every change to a commit in the repository.
I sum up my suggestions here (detailed comments below):

- Merge the `Count*()` and `Fetch*()` functions. In the best case, define just two functions with all possible variants set through the arguments.
- Build the vignettes of `specieshindex` at installation, i.e. `remotes::install_github("jessicatytam/specieshindex", build_vignettes = TRUE, dependencies = TRUE)`, recommending `build_vignettes = TRUE`. If you fear the users may not have any LaTeX distribution, you can rather use HTML as the output format. Users may still need to install Rtools and pandoc, though.
- In the `vignette`, see `CountSpT()` and `CountSpTAK()`.
- Running `FetchSpT(db = "scopus", genus = "Bettongia", species = "penicillata")` I get an error:

  ```
  Error in get_results(query, start = init_start, count = count, verbose = verbose, :
    Bad Request (HTTP 400).
  ```

- The output of `FetchSp*()` or `CountSp*()` cannot be assigned to an R object. This is very restrictive in the case that you would like to use these functions in statistical assessments or reports.
- The `FetchSp*()` functions should share the same documentation page. The same applies to `CountSp*()`. In fact, I would even be more radical and just define two functions, `count()` and `fetch()` (names just as examples), where everything else (looking for species or genera; in the title, keywords, or abstract; in which database; etc.) can be set through ad-hoc function parameters.
- I was not able to assign the output of `Count*()` and `Fetch*()` to an object (e.g. by `obj <- Fetch*()`). I also wonder why none of the examples in the manual do assignments, and why all the functions that are supposed to work on the outputs of these functions actually work on data already installed with the package. I know that such queries may fail when building the package, but you have the option of writing your examples within a `\dontrun{}` environment (see the sketch after this list). I would prefer to give the users examples that start from the query.
- For API keys, consider prompting the user via `tcltk` and storing the entries in the temporary folder (see this post), or using the package `keyring` (see here).
- There are calls using `:::` and `::` which may be avoided. The second case, as far as I know, may bypass the declaration of dependencies, which is not desirable.
- The checks report `no visible binding for global variable ...`. In this case I recommend NULLing (see this example).
- Run `goodpractice::gp()` and consider its output.
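A minimal roxygen sketch of the `\dontrun{}` suggestion, using the hypothetical consolidated `count()` as an example:

```r
#' Count publications mentioning a species
#'
#' @examples
#' \dontrun{
#' # Not run at build/check time: needs a Scopus API key and network access.
#' hits <- count(db = "scopus", genus = "Bettongia", species = "penicillata")
#' hits
#' }
count <- function(db, genus, species = NULL) {
  # ... query code ...
}
```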
Thanks everyone for your helpful comments! I've since updated the documentation, reorganised the functions, and fixed some of the warnings that were showing up in checks. We're currently working on adding more tests for the API.
In regard to @kamapu's comments:

- we have replaced the `Fetch*()` and `Count*()` functions with `Fetch()` and `Count()` to make things simpler
- the `Bad Request (HTTP 400)` error is probably a result of your institution not being a subscriber of Elsevier/Scopus (see https://github.com/jessicatytam/specieshindex/issues/46#issue-1029956107)
- when you say `obj <- Fetch()` and `obj <- Count()` are not working, do you mean they are returning some sort of error? They are both working for me; do you want to provide a reprex?

Great, @jessicatytam, please let us know when you have made progress on the various points raised in the review. Feel free to open an issue for each item if helpful :relaxed:
@kamapu, I think you missed this question :wink::
> when you say `obj <- Fetch()` and `obj <- Count()` are not working, do you mean they are returning some sort of error? They are both working for me; do you want to provide a reprex?
@jessicatytam, do you have a timeline for when you think you could make the required changes? If you don't think this will be possible anytime soon, we can pause this submission, or you can withdraw and resubmit later.
Hi @Bisaloo, apologies for the holdup! I've been working on multiple projects and getting organised as I'm starting my PhD. We hope to finish addressing the comments by the end of March. Here's a list of tasks we will be working on:

- use `facet_wrap` in plots

Hi @jessicatytam :wave:, could you update us on the current status of this submission please? If you need some more time, that's okay, but would you have a rough estimate of when we could expect your changes? If this is going to take a while, you could also withdraw this submission for the time being and resubmit later.
We are flexible. Please let me know which option is the most reasonable and convenient for you.
We have completed all of the tasks and addressed the comments, with the exception of adding tests; coverage is currently at ~50%, so we will probably need until the end of May, if that's ok.
What would be the ideal % for test coverage?
> We have completed all of the tasks and addressed the comments, with the exception of adding tests; coverage is currently at ~50%, so we will probably need until the end of May, if that's ok.
Sounds good :+1:
> What would be the ideal % for test coverage?
There is no hard and fast rule, especially for API packages that are somewhat more difficult to write tests for. Most best practice guides say something like "major functionality is covered by tests". In my experience, around 80-85% is usually a good number to ensure tests cover most sections of your code. Higher is also great but please let me know if you have issues reaching this number.
For what it's worth, a good and lower-effort way to improve your test suite is to turn each bug (such as the ones discovered in this review) into a test, so you can make sure you don't reintroduce it in the future.
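For instance (a sketch only; the consolidated `Count()` and its validation message are assumptions, not the package's actual behaviour):

```r
library(testthat)

test_that("an empty genus is rejected before any API call", {
  # Regression test: pins down an input-handling bug so that,
  # once fixed, it cannot silently come back.
  expect_error(Count(db = "scopus", genus = ""), "genus")
})
```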
We've made it through adding lots of tests, and almost everything is now tested, except for one big issue.
We're dealing with some issues testing the functions that query an API multiple times. Because of limitations in the Scopus API, we need >10 GET requests in a single function to retrieve all of the data while handling edge cases smoothly. We are currently using `httptest::with_mock_api()` to test the API functions, but `with_mock_api()` can only test 1 GET request per test. So testing everything in those functions is difficult, given the combination of the constraints of the Scopus API and the `mock_api` framework. We're a bit stuck on that, unless you can see a clever solution?

Here's the link to the codecov: https://app.codecov.io/gh/jessicatytam/specieshindex/blob/master/R/specieshindex.R. Check out the function `FetchT_scopus`, which is being tested, but the test exits (successfully) after the first GET call. Any help much appreciated.
@jessicatytam, while investigating your problem, I had another look, and I want to mention right away that @DrMattG's point about code duplication is, in my opinion, not yet properly addressed.
Most functions still include a lot of duplicated code. I have submitted a pull request (https://github.com/jessicatytam/specieshindex/pull/60) showing how you could reduce code duplication. Please have a look and try to apply this kind of fix throughout the code. Let me know if anything is unclear.
On the topic of duplication, I noticed that the documentation of the example datasets is duplicated as well. You documented them in individual files (`Koala.R`, `Platypus.R`, etc.) and again at the end of the main `specieshindex.R` file. Could you have a look please?
I would like to emphasize that this insistence on avoiding code duplication is not just to annoy you. Code with high duplication is very hard, or even impossible, to maintain in the long run. If you refactor the code to reduce duplication, future updates will require changes in a single place instead of throughout the codebase.
Thanks for the comments! I'll have another look at the updated code again and come up with a list of things to do and a timeline.
Hi @jesstytam, any idea about when we could complete this submission? :slightly_smiling_face:
Is end of Oct ok?
> Is end of Oct ok?
In this case, I think it would be much better to withdraw until the required changes have been made and resubmit later, mentioning this issue.
This is much better for us, as it would, for example, allow me to handle other papers in the meantime (I limit myself to 3-4 papers at once).
Ah ok then, withdrawing for now sounds like the better option. Thanks!
@openjournals/joss-eics, can we withdraw this submission please?
It is OK with me.
@editorialbot withdraw
Paper withdrawn.
Submitting author: @jesstytam (Jessica Tam)
Repository: https://github.com/jessicatytam/specieshindex
Branch with paper.md (empty if default branch):
Version: 0.3.1
Editor: @Bisaloo
Reviewers: @kamapu, @tom-gu, @DrMattG
Archive: Pending
:warning: JOSS reduced service mode :warning:
Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@kamapu & @tom-gu & @DrMattG, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @Bisaloo know.
✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨
Review checklist for @kamapu
✨ Important: Please do not use the Convert to issue functionality when working through this checklist, instead, please open any new issues associated with your review in the software repository associated with the submission. ✨
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @tom-gu
✨ Important: Please do not use the Convert to issue functionality when working through this checklist, instead, please open any new issues associated with your review in the software repository associated with the submission. ✨
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @DrMattG
✨ Important: Please do not use the Convert to issue functionality when working through this checklist, instead, please open any new issues associated with your review in the software repository associated with the submission. ✨
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper