arhofman commented 6 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Repository: https://github.com/cbg-ethz/GeneAccord

Confirm the following by editing each check box to '[x]'

[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.

I am familiar with the essential aspects of Bioconductor software management, including:

[x] The 'devel' branch for new packages and features.
[x] The stable 'release' branch, made available every six months, for bug fixes.
[x] Bioconductor version control using Git (optionally via GitHub).

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 6 years ago

Hi @arhofman

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: GeneAccord
Type: Package
Title: Detection of clonally exclusive gene or pathway pairs in a cohort of cancer patients
Version: 0.99.0
Author: Ariane L. Moore, Jack Kuipers and Niko Beerenwinkel
Maintainer: Ariane L. Moore <ariane.moore@bsse.ethz.ch>
Description: A statistical framework to examine the combinations of clones that co-exist in tumors. More precisely, the algorithm finds pairs of genes that are mutated in the same tumor but in different clones, i.e. their subclonal mutation profiles are mutually exclusive. We refer to this as clonally exclusive. It means that the mutations occurred in different branches of the tumor phylogeny, indicating parallel evolution of the clones. Our statistical framework assesses whether a pattern of clonal exclusivity occurs more often than expected by chance alone across a cohort of patients. The required input data are the mutated gene-to-clone assignments from a cohort of cancer patients, which were obtained by running phylogenetic tree inference methods. Reconstructing the evolutionary history of a tumor and detecting the clones is challenging. For nondeterministic algorithms, repeated tree inference runs may lead to slightly different mutation-to-clone assignments. Therefore, our algorithm was designed to allow the input of multiple gene-to-clone assignments per patient. They may have been generated by repeatedly performing the tree inference, or by sampling from the posterior distribution of trees. The tree inference methods designate the mutations to individual clones. The mutations can then be mapped to genes or pathways. Hence our statistical framework can be applied on the gene level, or on the pathway level to detect clonally exclusive pairs of pathways. If a pair is significantly clonally exclusive, it points towards the fact that this specific clone configuration confers a selective advantage, possibly through synergies between the clones with these mutations.
Depends: R (>= 3.4)
Suggests: assertthat, 
    devtools, 
    knitr, 
    rmarkdown,
    testthat
Imports: biomaRt, caTools, dplyr, ggplot2, graphics, grDevices, gtools, ggpubr, magrittr, maxLik, RColorBrewer, reshape2, stats, tibble, utils 
biocViews: BiomedicalInformatics, GeneticVariability, GenomicVariation, SomaticMutation, FunctionalGenomics, Genetics, MathematicalBiology, SystemsBiology, FeatureExtraction, PatternLogic, Pathways
License: file LICENSE
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.0.1
VignetteBuilder: knitr
URL: https://github.com/cbg-ethz/GeneAccord

Add SSH keys to your GitHub account. SSH keys will are used to control access to accepted Bioconductor packages. See these instructions to add SSH keys to your GitHub account.

bioc-issue-bot commented 6 years ago

A reviewer has been assigned to your package Learn what to expect during the review process.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR, WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

e7f89ec [INTERNAL] Updated the version number 0.99.0 --> 0...

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

a37f91c [INTERNAL] Added the bioRxiv number and doi

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

d679548 [FEATURE] Added the ensembl gene id to reactome pa... 1b8bec8 [INTERNAL] updated the GeneAccord version number d...

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

9165d6e [FIX] replaced non-ASCII strings 2c88983 [INTERNAL] updated version number

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

16e0383 [FEATURE] replaced the string 'ENSG' in the sanity...

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

c97f9aa [INTERNAL] data set 'ensmusg_reactome_path_map' ...

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

497d347 [FEATURE] added the Rd file for the documentation ... bb9d9b6 [INTERNAL] updated the version number

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

nturaga commented 6 years ago

Hi @arhofman

Please correct all the issues in your build report before we proceed and I take a look at your package and review it.

[REQUIRED]

* Checking coding practice...
     * NOTE: Avoid sapply(); use vapply() found in files:
       data_handling_functions.R (line 116, column 19)
       data_handling_functions.R (line 608, column 21)
       data_handling_functions.R (line 689, column 17)
       data_handling_functions.R (line 693, column 27)
       data_handling_functions.R (line 777, column 26)
       databases_handling_functions.R (line 249, column 25)
       sim_functions.R (line 365, column 25)
       stats_functions.R (line 263, column 32)
     * NOTE: Avoid 1:...; use seq_len() or seq_along() found in files:
       databases_handling_functions.R (line 113, column 17)
       plot_functions.R (line 461, column 29)
     * WARNING: Use TRUE/FALSE instead of T/F
       Found in files:
         inst/extdata/prepare_patient_data_all.R

 * Checking for bioc-devel mailing list subscription...
     * NOTE: Cannot determine whether maintainer is subscribed to the
       bioc-devel mailing list (requires admin credentials).  Subscribe
       here: https://stat.ethz.ch/mailman/listinfo/bioc-devel

 * NOTE: Consider shorter lines; 1185 lines (23%) are > 80
      characters long.
    First 6 lines:
      R/data_handling_functions.R:1 #' Creates a tibble containing the inform...
      R/data_handling_functions.R:3 #' It expects a comma-separated table whe...
      R/data_handling_functions.R:4 #' or pathway. The other columns are for ...
      R/data_handling_functions.R:7 #' The table is expected to be comma-sepa...
      R/data_handling_functions.R:8 #' ..., 'cloneN', depending on how many c...
      R/data_handling_functions.R:9 #' in the first column the name of the mu...
    * NOTE: Consider multiples of 4 spaces for line indents, 1578
      lines(30%) are not.
    First 6 lines:
      R/data_handling_functions.R:28   stopifnot(is.character(path_to_file))
      R/data_handling_functions.R:29   stopifnot(is.numeric(max_num_clones))
      R/data_handling_functions.R:30   stopifnot(file.exists(path_to_file))
      R/data_handling_functions.R:32   message(paste("Found the file ", path_...
      R/data_handling_functions.R:34   # create tibble
      R/data_handling_functions.R:35   fn <- path_to_file

See http://bioconductor.org/developers/how-to/coding-style/

Your summary says you have a lot of issues in your build report please fix them. There should be no warnings, and as few notes as possible.

Summary:
 ERROR count: 0
 WARNING count: 1
 NOTE count: 6

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

c35e5c6 [INTERNAL] changed sapply() to vapply() or lapply(...

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

arhofman commented 6 years ago

Hi @nturaga

Thank you for your message. I tried to address all notes and warnings. Specifically, I implemented the following changes:

sapply to vapply or lapply
1:... to seq(1, ...)
T/F to TRUE/FALSE
shortened the lines and tried to be < 80 characters
have indents of multiples of four spaces
I am subscribed to the bioc-devel mailing list

Thanks a lot for reviewing my this package! Best regards, @arhofman

nturaga commented 6 years ago

R CMD build

ok

R CMD INSTALL

ok

DESCRIPTION

ok

NAMESPACE

ok

R

There is no need to call paste0 when you are trying to write a message or a warning. Both message and warning will automatically concatenate your string with a variable. This has been done in multiple places, and needs to be corrected.

Using paste0 or paste is only a good idea when the variable is a vector.
Remove commented out code.
use seq_len instead of seq in data_handling_functions.R, L56.
```
for(i in seq_len(num_clones))
```
instead of
```
for(i in seq(1, num_clones))
```
Do this everywhere seq is being used. You can also take a look at seq_along wherever appropriate.
The code chunk from L770 in data_handling_functions.R, which sets the variable, merged_ents_clone_tbl_list needs to be simplified. You use for loops within an lapply making it a nested for loop. There are operations in this set that can be vectorized.

This nesting is done again in databases_handling_functions.R L319 onwards.

Pleae fix similar issues throughout your code.

Vignette

Please consider using the BiocStyle formatting for your vignette by checking out the package BiocStyle on Bioconductor. It adds formatting to make your vignette more easily readable.
In the vignette you must also include installation instructions instead of pointing to README, as that won't be available via the Bioconductor website.
The vignette takes a while to run (especially the code in sim_functions.R.)
There is no need to include GeneAccord.R or GeneAccord.html in the vignettes folder. When the package is being built, the system will run R CMD Stangle GeneAccord.Rmd to generate the code from the vignette.

After your changes in the code, please let me know. I will review it again. As always, please bump the version number and make sure the build is as clean as possible.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

222739a [FEATURE] removed paste0 when using message, warni...

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

3332c04 [FIX] removed the CRANpkg and Biocpkg to avoid bui...

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

94656bc [FEATURE] shortened some of the lines; put more of...

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

3c71872 [FEATURE] shortened some of the lines; made more l...

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

arhofman commented 6 years ago

Hi @nturaga Thank you for your comments. I addressed all of them and implemented the following changes:

removed the paste0 from warnings, messages, and stops
removed code that was commented out
used seq_len or seq_along instead of seq where possible
simplified code such that there are no for loops within an apply
tried to use the BiocStyle formatting for the vignette
included installation instructions in the vignette
adjusted the vignette such that it runs faster
removed the GeneAccord.R and GeneAccord.html files from the vignettes folder
shortened the lines as good as possible and tried to be < 80 characters
tried to have indents of multiples of four spaces

Thanks a lot for reviewing this package! Best regards, @arhofman

nturaga commented 6 years ago

Review

A general comment is you should just use roxygen2 if you are commenting your code using the roxygen2 documentation. You should let the namespace be generated by that as well.
If you include a NEWS file make sure it's parsable by utils::news()

`GeneAccord_main_functions.R`

You can try to use tidyverse operations for data manipulation for your tibble,
```
all_pairs_all_pats <- do.call("rbind", list_all_pairs_all_pats)
```
check functions like bind_rows (bind_cols), and map , reduce to do the same things.

The call to as.tbl(data.frame(*)) is not needed if you can just call tibble. Single function calls make reading the code easier, and have a potential to make your code run faster as well.

res_all_pairs <- dplyr::as.tbl(data.frame(entity_A=pairs_to_test[,1], 
    entity_B=pairs_to_test[,2],
    num_patients=all_pairs_num_pat,
    pval=all_pairs_pval,
    mle_delta=these_deltas,
    test_statistic=these_test_stats))

vs

    res_all_pairs <- dplyr::tibble(entity_A=pairs_to_test[,1], 
        entity_B=pairs_to_test[,2],
        num_patients=all_pairs_num_pat,
        pval=all_pairs_pval,
        mle_delta=these_deltas,
        test_statistic=these_test_stats)

Just as.numeric will do the same thing. You can avoid the multiple casting calls in this line (Do this everywhere where you have calls like this, make sure to check if all these calls are needed).
```
all_pvals <- 
    as.numeric(as.vector(as.data.frame(final_res_pairs %>% 
    dplyr::select(pval))[,1]))
```
vs
```
    all_pvals <- as.numeric(final_res_pairs %>% dplyr::select(pval))
```

`data_handling_dunctions.R`

Creating duplicate variable names which are confusing and not needed
```
fn <- path_to_file
```
You can just use path_to_file.
You may also consider importing some dplyr functions (try @importFrom in roxygen2) like select, mutate etc which you are using commonly instead of calling dplyr::select etc each time. Makes easier code reading.

`plot_functions.R`

You don't need call a ggplot2 function like shown below. This is incredibly unhelpful when reviewing code.

this_plot <- ggplot2::ggplot(rates_and_clones_tbl, 
    ggplot2::aes(x=pat_ids_r, 
    y=rates_m, fill=num_c)) +
ggplot2::geom_bar(stat="identity") +
ggplot2::ggtitle("Mean rates of clonal exclusivity in each patient")+
ggplot2::ylab("Rate m") +
ggplot2::xlab("Patients") +
ggplot2::scale_fill_gradient(low="lightblue", high="darkblue", 
    guide="colourbar") +
ggplot2::guides(fill=ggplot2::guide_colourbar(title=paste0("Average",
    "number of clones"))) +
ggplot2::coord_flip() + 
ggplot2::theme_gray()

vs

this_plot <- ggplot(rates_and_clones_tbl, 
                    aes(x=pat_ids_r, 
                        y=rates_m, fill=num_c)) +
    geom_bar(stat="identity") +
    ggtitle("Mean rates of clonal exclusivity in each patient")+
    ylab("Rate m") +
    xlab("Patients") +
    scale_fill_gradient(low="lightblue", high="darkblue", 
    guide="colourbar") +
    guides(fill=guide_colourbar(title=paste0("Average",
    "number of clones"))) +
    coord_flip() + 
    theme_gray()

Import what is needed in the namespace and use the functions. This is a recurring theme throught the codebase and can be easily fixed by using roxygen2 correctly. If needed please go over the roxygen2 documentation.

nturaga commented 6 years ago

Hi @arhofman

I'm giving you a 2 week notice to improve your package or reply with some information regarding your progress. We can always close the issue and reopen when needed and you are ready.

Best,

Nitesh

arhofman commented 6 years ago

Hi @nturaga

Thanks for your comments and your notice. I was busy with another project where we had a deadline coming up, but will now have time within the next two weeks to work on your suggestions. I hope this is fine,

Bests, Ariane

nturaga commented 6 years ago

That's great. It's good to know you'll be working on this again.

The notice just serves as a reminder for a progress update or activity of some sort. Sometimes developers have other things on their plate, so we don't keep the issue open more than a certain number of days if there has been no progress.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

4b9c7a7 [FEATURE] Improved code to use more tidyverse and ...

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

arhofman commented 6 years ago

Hi @nturaga

Thanks for your comments to improve my package! I addressed all of them. Specifically I implemented the following changes:

I now use roxygen2 to create the Namespace file, and also import explicitly all functions from other packages that I need in my functions to avoid unnecessary use of “::”
Changed the News file such that it is hopefully parsable by utils::news()
Changed do.call(“rbind”,…) into bind_rows(…), and also changed cbind(data.frame(…)) into bind_cols(…) in a few places
Changed as.tbl(data.frame(…)) to just tibble(…)
Changed the way to extract a single column from a tibble from as.numeric(as.vector(as.data.frame(…))) to pull
Removed unnecessary variable fn -> path_to_file Thanks a lot for reviewing this package!

Best regards, @arhofman

nturaga commented 6 years ago

Thanks for making the changes. It looks good.

Could you please issue a version bump on this so that it builds on all three platforms? This might have been an issue on our end (maybe @lshep has an idea?)

Thanks,

Nitesh

lshep commented 6 years ago

We were running updates which is why the malbec1 is not displayed. I'll kick off a manual build now to have all three platforms in the report. Sorry about that.

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

nturaga commented 6 years ago

Hi @arhofman,

Your package seems fine after the new build report. You might want to consider using MultiAssayExperiment (https://bioconductor.org/packages/release/bioc/html/MultiAssayExperiment.html) as an input object to your functions. This will help the users who are using Bioconductor objects to directly use your package instead of passing in a tibble. You may have to change the structure a little bit.

I will accept the package since it looks good otherwise. But, you should really consider taking in a Bioconductor(MultiAssayExperiment) object as well as a tibble if your users need it.

Best,

Nitesh

bioc-issue-bot commented 6 years ago

Your package has been accepted. It will be added to the Bioconductor Git repository and nightly builds. Additional information will be posed to this issue in the next several days.

Thank you for contributing to Bioconductor!

LiNk-NY commented 6 years ago

Hi @arhofman,

I second Nitesh @nturaga and suggest that you use existing Bioconductor infrastructure such as MultiAssayExperiment for your package. The MultiAssayExperiment class uses an ExperimentList, which is a List-derived structure that allows you to include several assays from different flat rectangular data types. It also allows you to subset to specific set of patients via the colData structure. Refer to the MultiAssayExperiment vignette for more details.

PS. Please update the Installation section in the package vignette. Best regards, Marcel

arhofman commented 6 years ago

Hi @nturaga and @LiNk-NY,

Thank you for your comments. That sounds very good. I am busy now finishing another deadline, but will address your suggestions in 2-3 weeks.

Bests, Ariane

mtmorgan commented 6 years ago

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/arhofman.keys is not empty), then no further steps are required. Otherwise, do the following:

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using biocLite(\"GeneAccord\"). The package 'landing page' will be created at

https://bioconductor.org/packages/GeneAccord

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.

Bioconductor / Contributions

GeneAccord #730

R CMD build

R CMD INSTALL

DESCRIPTION

NAMESPACE

R

Vignette

Review

`GeneAccord_main_functions.R`

`data_handling_dunctions.R`

`plot_functions.R`