ropensci / software-review

rOpenSci Software Peer Review.
292 stars 104 forks source link

Submission: UCSCXenaTools #315

Closed ShixiangWang closed 5 years ago

ShixiangWang commented 5 years ago

Submitting Author: ShixiangWang (@ShixiangWang)
Repository: https://github.com/ShixiangWang/UCSCXenaTools
Version submitted: 1.2.3
Editor: @melvidoni Reviewer 1: @ChristineStawitz-NOAA Reviewer 2: @carlganz Archive: TBD
Version accepted: TBD


Package: UCSCXenaTools
Title: Download and Explore Datasets from UCSC Xena Data Hubs
Version: 1.2.3
Authors@R: c(person("Shixiang", "Wang", email = "w_shixiang@163.com", role = c("aut", "cre"), 
            comment = c(ORCID = "0000-0001-9855-7357")),
            person("Martin", "Morgan", role="aut"))
Maintainer: Shixiang Wang <w_shixiang@163.com>
URL: https://github.com/ShixiangWang/UCSCXenaTools
BugReports: https://github.com/ShixiangWang/UCSCXenaTools/issues
Description: Download and explore datasets from UCSC Xena data hubs, which are
    a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others.
    Databases are normalized so they can be combined, linked, filtered, explored and downloaded.
Depends: R (>= 3.0)
Imports: 
    httr,
    readr,
    methods,
    utils,
    dplyr,
    magrittr,
    jsonlite,
    shinydashboard,
    shiny,
    DT
License: GPL-3
Encoding: UTF-8
LazyData: true
Suggests: 
    knitr,
    rmarkdown,
    prettydoc,
    testthat (>= 2.1.0),
    covr
VignetteBuilder: knitr
RoxygenNote: 6.1.1
Roxygen: list(markdown = TRUE)

Scope

The package falls under the data retrieval category because it pulls data from UCSC Xena Hubs.

None that I am aware of. There is a Python package called xenaPython which is the official API for Xena Hubs. However, UCSCXenaTools is more powerful and provides more features including download whole dataset and query a small amount of data.

https://github.com/ropensci/software-review/issues/314

Technical checks

Confirm each of the following by checking the box. This package:

Publication options

JOSS Options - [ ] The package has an **obvious research application** according to [JOSS's definition](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements). - [ ] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`. - [ ] The package is deposited in a long-term repository with the DOI: - (*Do not submit your package separately to JOSS*)
MEE Options - [ ] The package is novel and will be of interest to the broad readership of the journal. - [ ] The manuscript describing the package is no longer than 3000 words. - [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

melvidoni commented 5 years ago

Editor checks:


Editor comments

Thank you for submitting your package! Please add a LICENCE file to the repo, and fix the following suggestions of goodpractices::gp() package:

── GP UCSCXenaTools ──────────────────────────────────────────────────────────────────────────

It is good practice to

  ✖ not import packages as a whole, as this can cause name clashes
    between the imported packages. Instead, import only the specific functions you
    need.
────────────────────────────────────────────────────────────────────────────────────────────── 

Reviewers: @ChristineStawitz-NOAA and @carlganz Due date: July 22nd, 2019.

ShixiangWang commented 5 years ago

Dear editor @melvidoni,

All the comments you mentioned above have been fixed. A GPL-3 license file has been added by usethis::use_gpl3_license command. All import sentences in NAMESPACE file have been modified to importFrom, and corresponding code have also been modified.

Thanks for your comments. Please let me know if there are other issues.

Best, Shixiang

melvidoni commented 5 years ago

Thanks @ShixiangWang! I'm looking for reviewers, which may take a little. I will comment again when I assign them, and let you know about the due date.

melvidoni commented 5 years ago

Reviewers @ChristineStawitz-NOAA and @carlganz have been assigned to the package. They'll have until July 22nd to get the reviews in.

ShixiangWang commented 5 years ago

Thanks, @melvidoni. Welcome the reviewers👏 👏 👏

ChristineStawitz-NOAA commented 5 years ago

Apologies for getting this in at the last possible moment!

Package Review

Documentation

The package includes all the following forms of documentation:

  1. I was able to find the Vignettes on the author's website. However, they are not installed locally either when using devtools::install_github() or installing from CRAN. Even if there is a website version that is more up-to-date, it seems like a local version would be useful, particularly for CRAN users who might not find the Github link.
  2. I tried some of the intro vignette with the CRAN version and some with the Github one - with the CRAN version, XenaScan() is not found. There should be some guidance on minimum versions to run the vignette when describing installation.
  3. Intro vignette - For the examples using local files with XenaPrepare() it would be better to put some sample .gz files in the data/ directory and use rather than referencing E:/Github or ./downloads/ which all users won't have. I'm not sure if this is recommended by rOpenSci but in the past I've used system.file("data", package = "UCSCXenaTools") to access this directory.
  4. API vignette - typo in one of the headers: "Find out the number of idnetifiers in a dataset"
For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [ ] A short summary describing the high-level functionality of the software
  • [ ] Authors: A list of authors with their affiliations
  • [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  1. Argument naming is not consistent - for example, the fetch_ API functions are in snake_case but a number of other functions are camel-cased with the first letter capitalized (i.e. XenaScan) or regular camel-cased (getTCGAdata). It would be helpful to adopt a similar casing style even if snake_case doesn't work because of consistency with other tools for these data sets.
  2. The Workflow.R and simplify.R files are a bit overloaded - style guidelines suggest not putting all of the key functions in a single file, and it wasn't clear to me where to search for some of the different functions.
  3. Throughout the R code, :: is used to reference functions within the UCSCXenaTools package itself as well as dplyr, which is imported. It could clean things up to remove these for packages that are explicitly imported.
  4. I think shiny and shinydashboard could be moved to suggests instead of imports in the DESCRIPTION

Review Comments

I'm not in the omics field, but this seems like a useful package for interacting with the UCSC Xena platform. I found the package reliable and stable, but it was a bit hard to follow where the vignettes were located after local install and there were a few vignette issues (see above). There are a couple of code simplifications and consistency improvements that could be made as well, which are detailed above.

melvidoni commented 5 years ago

Thank you @ChristineStawitz-NOAA! We are waiting for @carlganz's now, then.

carlganz commented 5 years ago

Promise to finish review tomorrow sorry for delay.

On Jul 22, 2019, at 4:35 PM, Melina Vidoni notifications@github.com wrote:

Thank you @ChristineStawitz-NOAA! We are waiting for @carlganz's now, then.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

ShixiangWang commented 5 years ago

@ChristineStawitz-NOAA Thanks for your all comments. Basically, I fixed all the problems you raised. And I also have some reasons for the non-modified parts.

All replies for problematic points have been put as the following. Please forgive me my English if you find something hard to understand, just ask me again. :)

I get a warning when running devtools::check() - "qpdf is needed for size reduction on PDFs"

carlganz commented 5 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [X] A short summary describing the high-level functionality of the software
  • [X] Authors: A list of authors with their affiliations
  • [x] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [X] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

Final approval (post-review)

Estimated hours spent reviewing: 4


Review Comments

I have very little knowledge of genomics, but I can see clearly that the UCSC Xena platform is valuable to the genomics community, and after thoroughly reviewing the code within your package I can see that working with the Xena platform is highly non-trivial, and that this package encapsulates a great deal of work that succesfully makes the Xena platform much more accessible to R users. Congrats on the wonderful package!

Statement of Need and JOSS summary

You explain what the Xena platform is, but it would be nice to have a sentence or two explaining why interfacing with the Xena platform is non-trivial, and how this interface significantly improves R user access.

Installation

Small code suggestions

So I would replace this:

  resDF <- data.frame(stringsAsFactors = FALSE)
  for (i in 1:length(XenaList)) {
    hostNames <- names(XenaList)[i]
    cohortNames <- names(XenaList[[i]])
    res <- data.frame(stringsAsFactors = FALSE)

    for (j in 1:length(cohortNames)) {
      oneCohort <- XenaList[[i]][j]
      # The unassigned cohorts have NULL data, remove it
      if (names(oneCohort) != "(unassigned)") {
        resCohort <- data.frame(
          XenaCohorts = names(oneCohort),
          XenaDatasets = as.character(oneCohort[[1]]),
          stringsAsFactors = FALSE
        )
        res <- rbind(res, resCohort)
      }
    }
    res$XenaHosts <- hostNames
    resDF <- rbind(resDF, res)
  }

With something like:

  # use lapply to create list of dfs because why introduce purrr dependency
  resDF <- lapply(seq_len(XenaList), function(i) {
    hostNames <- names(XenaList)[i]
    cohortNames <- names(XenaList[[i]])
    res <- data.frame(stringsAsFactors = FALSE)

    for (j in 1:length(cohortNames)) {
      oneCohort <- XenaList[[i]][j]
      # The unassigned cohorts have NULL data, remove it
      if (names(oneCohort) != "(unassigned)") {
        resCohort <- data.frame(
          XenaCohorts = names(oneCohort),
          XenaDatasets = as.character(oneCohort[[1]]),
          stringsAsFactors = FALSE
        )
        res <- rbind(res, resCohort)
      }
    }
    res$XenaHosts <- hostNames
    return(res)
  }) %>%
  # convert list of dfs into single df
  bind_rows

Hope that helps!

ShixiangWang commented 5 years ago

@ChristineStawitz-NOAA Sorry for ignoring There is no guidance on how to contribute yesterday, I have added it here.

@carlganz Thanks for your comments :). The modification can be seen from the last two commits. I also reply your comments point to point as the following.

Statement of Need and JOSS summary

It was a template in repository yesterday. I rewrite it today, you can see new version here. I am not good at English and also writing, so any suggestion is welcome :).

Installation

prettydoc package is in suggests field, so it will not be install automatically without option dependencies=TRUE. Thank you for poiniting it out, I have added this message in new README

use seq_len

I like 1:length(x) this format because it is more easy to read. seq_len is indeed a safer version for object with length 0. I have changed corresponding code.

use for

I won’t change such code for now. I do like for more than lapply here because for loop is much easier to read and debug.

For example,

  # use lapply to create list of dfs because why introduce purrr dependency
  resDF <- lapply(seq_len(XenaList), function(i) {
    hostNames <- names(XenaList)[i]
    cohortNames <- names(XenaList[[i]])
    res <- data.frame(stringsAsFactors = FALSE)
    ...

The code won’t work before modifying all XenaList to i and removing [i]. And I have to debug to make sure the code works.

When I developed this package, I have no idea how to use purrr, so this package is not used here.

As Hadley wrote in Advance in R, the for loop in R now is very quick. It will not affect the performance.

use bind_rows

This is modified.

ShixiangWang commented 5 years ago

I am glad to add two reviewers in DESCRIPTION file if you agree according to

Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Thanks all your comments @ChristineStawitz-NOAA @carlganz

melvidoni commented 5 years ago

Great to see this is moving! Please @ChristineStawitz-NOAA and @carlganz, review the changes. Feel free to go back and forth if you deem it necessary.

ChristineStawitz-NOAA commented 5 years ago

Statement of Need and JOSS paper I don't see the bibliography showing up at the end of the paper.md file, though I can see the in-text references.

I initially did not review this as a JOSS submission, should I be also following their review criteria @melvidoni or will it go through a separate review at JOSS after the rOpenSci one?

I have some long-running code going in other RStudio sessions so I will not be able to reinstall the package and check the vignette again until tomorrow - assuming that works I will sign off on the review then as I am satisfied with the changes from what is in this thread and the repo :) Nice work @ShixiangWang ! I am also OK with being credited for the review, thanks.

carlganz commented 5 years ago

I am happy with the updates. I've updated review to give full approval. Congrats!

ShixiangWang commented 5 years ago

@ChristineStawitz-NOAA the bibliography is shown in paper.bib file according to JOSS instruction.

@ChristineStawitz-NOAA @carlganz I have added your names here and here, please take a look and make sure they are right.

ChristineStawitz-NOAA commented 5 years ago

Looks good and I am able to successfully run everything in the vignette after the changes - thanks!

On Thu, Jul 25, 2019 at 7:27 PM Shixiang Wang notifications@github.com wrote:

@ChristineStawitz-NOAA https://github.com/ChristineStawitz-NOAA the bibliography is shown in paper.bib https://github.com/ShixiangWang/UCSCXenaTools/blob/master/paper/paper.bib file according to JOSS instruction https://joss.readthedocs.io/en/latest/submitting.html#example-paper-and-bibliography .

@ChristineStawitz-NOAA https://github.com/ChristineStawitz-NOAA @carlganz https://github.com/carlganz I have added your names here https://github.com/ShixiangWang/UCSCXenaTools/blob/66eba83777e61b5f1a6f2b375b2d2c598cfb8bbf/DESCRIPTION#L8 and here https://github.com/ShixiangWang/UCSCXenaTools/blob/master/paper/paper.md#acknowledgements, please take a look and make sure they are right.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ropensci/software-review/issues/315?email_source=notifications&email_token=ALNPO3IRMOBVUGMBY4Z7EATQBJOK5A5CNFSM4HY425XKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD23J2MQ#issuecomment-515284274, or mute the thread https://github.com/notifications/unsubscribe-auth/ALNPO3PXLDLGYB2NKWHX2UDQBJOK5ANCNFSM4HY425XA .

melvidoni commented 5 years ago

Approved! Thanks @ShixiangWang for submitting and @ChristineStawitz-NOAA @carlganz for your reviews!

To-dos:

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.

Welcome aboard! We'd love to host a blog post about your package - either a short introduction to it with one example or a longer post with some narrative about its development or something you learned, and an example of its use. If you are interested, review the instructions, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.

We've started putting together a gitbook with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.

ShixiangWang commented 5 years ago

@melvidoni I got a message when I transferred the repository

You don’t have the permission to create repositories on ropensci

melvidoni commented 5 years ago

@ShixiangWang I sent you an invitation to be part of rOpenSci organisation in GItHub. If you accept it, you should be able to transfer the repo. (Transfer, do not create!).

ShixiangWang commented 5 years ago

@melvidoni Thanks, it works. There is a problem when I do this step Activate Zenodo watching the repo, I can't find my package in the Github list

image

It seems I should have rights to access the repositories of ropensci.

image

I don't know where the problem comes from, the Zenodo (I just opened an issue) or the setting from ropensci? Please help me .

I also found I am in a team auk, please remove me from it.

image

melvidoni commented 5 years ago

I removed you from that team @ShixiangWang, and added you to the package team. Can you accept that? (it shows as pending). Also, can you try adding Zenodo now?

ShixiangWang commented 5 years ago

Strange, I was already in the package team yesterday, so why is it showing pending?. I will try Zenodo as soon as possible.

ShixiangWang commented 5 years ago

@melvidoni Thanks, I can find the package in Zenodo now. :)

image

melvidoni commented 5 years ago

Awesome! Then, if you can tackle everything and then let me know, that would be great.

ShixiangWang commented 5 years ago

@melvidoni I have done everything in your todo list.

I have submitted URL https://doi.org/10.5281/zenodo.3358530 to JOSS.

If the DOI not found, please see https://zenodo.org/record/3358530

@stefaniebutland I would like to write a short introduction to this package.

Thanks all. Shixiang

stefaniebutland commented 5 years ago

Good to hear that @ShixiangWang

@melvidoni is there an element of the package, it's development, or applications that you think could be highlighted?

For a short introduction, a Tech Note is most appropriate. Examples. It's important not to repeat information that's in the vignette; refer to it where needed. You might illustrate a compelling use case for the package.

Editorial and technical information about preparing your submission, including a template: https://github.com/ropensci/roweb2#contributing-a-blog-post. In the template YAML, for a Tech Note, set categories: technotes

Please suggest a deadline for submitting your draft. Don't hesitate to ask me questions. I'll be away Mon Aug 5.

Thank you for doing this extra work. I hope the post will get more eyes on your work.

ShixiangWang commented 5 years ago

@stefaniebutland Thanks, can I submit the draft within a month?

@melvidoni It seems that I should submit repository URL instead of Zenodo DOI to JOSS, please see https://github.com/openjournals/joss-reviews/issues/1622#issuecomment-517802770

melvidoni commented 5 years ago

@ShixiangWang That is weird! If they need URL repo, then that's it. Send the repo. I will advise editors to fix this on the approval comment template as well!

ShixiangWang commented 5 years ago

@melvidoni Yes 😭. Is there anything more I should do?

melvidoni commented 5 years ago

Not for now! Just send them the URL repo then.

ShixiangWang commented 5 years ago

@melvidoni The package has published on JOSS https://joss.theoj.org/papers/10.21105/joss.01627. Thanks for your work, the process is really fast.

melvidoni commented 5 years ago

@ShixiangWang That is great! If you don't mind, I'm tweeting your package, and closing the thread. Congrats and thank you for all the hard work!

stefaniebutland commented 5 years ago

can I submit the draft within a month?

Yes, sounds good. I'll mark my calendar to check with you on Aug 27. Thank you.