Closed arhofman closed 6 years ago
Hi @arhofman
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: GeneAccord
Type: Package
Title: Detection of clonally exclusive gene or pathway pairs in a cohort of cancer patients
Version: 0.99.0
Author: Ariane L. Moore, Jack Kuipers and Niko Beerenwinkel
Maintainer: Ariane L. Moore <ariane.moore@bsse.ethz.ch>
Description: A statistical framework to examine the combinations of clones that co-exist in tumors. More precisely, the algorithm finds pairs of genes that are mutated in the same tumor but in different clones, i.e. their subclonal mutation profiles are mutually exclusive. We refer to this as clonally exclusive. It means that the mutations occurred in different branches of the tumor phylogeny, indicating parallel evolution of the clones. Our statistical framework assesses whether a pattern of clonal exclusivity occurs more often than expected by chance alone across a cohort of patients. The required input data are the mutated gene-to-clone assignments from a cohort of cancer patients, which were obtained by running phylogenetic tree inference methods. Reconstructing the evolutionary history of a tumor and detecting the clones is challenging. For nondeterministic algorithms, repeated tree inference runs may lead to slightly different mutation-to-clone assignments. Therefore, our algorithm was designed to allow the input of multiple gene-to-clone assignments per patient. They may have been generated by repeatedly performing the tree inference, or by sampling from the posterior distribution of trees. The tree inference methods designate the mutations to individual clones. The mutations can then be mapped to genes or pathways. Hence our statistical framework can be applied on the gene level, or on the pathway level to detect clonally exclusive pairs of pathways. If a pair is significantly clonally exclusive, it points towards the fact that this specific clone configuration confers a selective advantage, possibly through synergies between the clones with these mutations.
Depends: R (>= 3.4)
Suggests: assertthat,
devtools,
knitr,
rmarkdown,
testthat
Imports: biomaRt, caTools, dplyr, ggplot2, graphics, grDevices, gtools, ggpubr, magrittr, maxLik, RColorBrewer, reshape2, stats, tibble, utils
biocViews: BiomedicalInformatics, GeneticVariability, GenomicVariation, SomaticMutation, FunctionalGenomics, Genetics, MathematicalBiology, SystemsBiology, FeatureExtraction, PatternLogic, Pathways
License: file LICENSE
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.0.1
VignetteBuilder: knitr
URL: https://github.com/cbg-ethz/GeneAccord
Add SSH keys to your GitHub account. SSH keys will are used to control access to accepted Bioconductor packages. See these instructions to add SSH keys to your GitHub account.
A reviewer has been assigned to your package Learn what to expect during the review process.
IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR, WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
Received a valid push; starting a build. Commits are:
e7f89ec [INTERNAL] Updated the version number 0.99.0 --> 0...
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
Received a valid push; starting a build. Commits are:
a37f91c [INTERNAL] Added the bioRxiv number and doi
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
Received a valid push; starting a build. Commits are:
16e0383 [FEATURE] replaced the string 'ENSG' in the sanity...
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
Received a valid push; starting a build. Commits are:
c97f9aa [INTERNAL] data set 'ensmusg_reactome_path_map' ...
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
Hi @arhofman
Please correct all the issues in your build report before we proceed and I take a look at your package and review it.
[REQUIRED]
* Checking coding practice...
* NOTE: Avoid sapply(); use vapply() found in files:
data_handling_functions.R (line 116, column 19)
data_handling_functions.R (line 608, column 21)
data_handling_functions.R (line 689, column 17)
data_handling_functions.R (line 693, column 27)
data_handling_functions.R (line 777, column 26)
databases_handling_functions.R (line 249, column 25)
sim_functions.R (line 365, column 25)
stats_functions.R (line 263, column 32)
* NOTE: Avoid 1:...; use seq_len() or seq_along() found in files:
databases_handling_functions.R (line 113, column 17)
plot_functions.R (line 461, column 29)
* WARNING: Use TRUE/FALSE instead of T/F
Found in files:
inst/extdata/prepare_patient_data_all.R
* Checking for bioc-devel mailing list subscription...
* NOTE: Cannot determine whether maintainer is subscribed to the
bioc-devel mailing list (requires admin credentials). Subscribe
here: https://stat.ethz.ch/mailman/listinfo/bioc-devel
* NOTE: Consider shorter lines; 1185 lines (23%) are > 80
characters long.
First 6 lines:
R/data_handling_functions.R:1 #' Creates a tibble containing the inform...
R/data_handling_functions.R:3 #' It expects a comma-separated table whe...
R/data_handling_functions.R:4 #' or pathway. The other columns are for ...
R/data_handling_functions.R:7 #' The table is expected to be comma-sepa...
R/data_handling_functions.R:8 #' ..., 'cloneN', depending on how many c...
R/data_handling_functions.R:9 #' in the first column the name of the mu...
* NOTE: Consider multiples of 4 spaces for line indents, 1578
lines(30%) are not.
First 6 lines:
R/data_handling_functions.R:28 stopifnot(is.character(path_to_file))
R/data_handling_functions.R:29 stopifnot(is.numeric(max_num_clones))
R/data_handling_functions.R:30 stopifnot(file.exists(path_to_file))
R/data_handling_functions.R:32 message(paste("Found the file ", path_...
R/data_handling_functions.R:34 # create tibble
R/data_handling_functions.R:35 fn <- path_to_file
See http://bioconductor.org/developers/how-to/coding-style/
Your summary says you have a lot of issues in your build report please fix them. There should be no warnings, and as few notes as possible.
Summary:
ERROR count: 0
WARNING count: 1
NOTE count: 6
Received a valid push; starting a build. Commits are:
c35e5c6 [INTERNAL] changed sapply() to vapply() or lapply(...
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
Hi @nturaga
Thank you for your message. I tried to address all notes and warnings. Specifically, I implemented the following changes:
Thanks a lot for reviewing my this package! Best regards, @arhofman
ok
ok
ok
ok
There is no need to call paste0
when you are trying to write a
message
or a warning
. Both message
and warning
will
automatically concatenate your string with a variable. This has been
done in multiple places, and needs to be corrected.
Using paste0
or paste
is only a good idea when the variable is a
vector.
Remove commented out code.
use seq_len
instead of seq
in data_handling_functions.R, L56.
for(i in seq_len(num_clones))
instead of
for(i in seq(1, num_clones))
Do this everywhere seq
is being used. You can also take a look
at seq_along
wherever appropriate.
The code chunk from L770 in data_handling_functions.R, which sets
the variable, merged_ents_clone_tbl_list
needs to be
simplified. You use for loops within an lapply
making it a nested
for loop. There are operations in this set that can be vectorized.
This nesting is done again in databases_handling_functions.R L319 onwards.
Pleae fix similar issues throughout your code.
Please consider using the BiocStyle formatting for your vignette by
checking out the package BiocStyle
on Bioconductor. It adds
formatting to make your vignette more easily readable.
In the vignette you must also include installation instructions instead of pointing to README, as that won't be available via the Bioconductor website.
The vignette takes a while to run (especially the code in sim_functions.R
.)
There is no need to include GeneAccord.R
or GeneAccord.html
in
the vignettes folder. When the package is being built, the system
will run R CMD Stangle GeneAccord.Rmd
to generate the code from
the vignette.
After your changes in the code, please let me know. I will review it again. As always, please bump the version number and make sure the build is as clean as possible.
Received a valid push; starting a build. Commits are:
222739a [FEATURE] removed paste0 when using message, warni...
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
Received a valid push; starting a build. Commits are:
3332c04 [FIX] removed the CRANpkg and Biocpkg to avoid bui...
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
Received a valid push; starting a build. Commits are:
94656bc [FEATURE] shortened some of the lines; put more of...
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
Received a valid push; starting a build. Commits are:
3c71872 [FEATURE] shortened some of the lines; made more l...
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
Hi @nturaga Thank you for your comments. I addressed all of them and implemented the following changes:
Thanks a lot for reviewing this package! Best regards, @arhofman
A general comment is you should just use roxygen2 if you are commenting your code using the roxygen2 documentation. You should let the namespace be generated by that as well.
If you include a NEWS file make sure it's parsable by utils::news()
GeneAccord_main_functions.R
You can try to use tidyverse operations for data manipulation for your tibble,
all_pairs_all_pats <- do.call("rbind", list_all_pairs_all_pats)
check functions like bind_rows
(bind_cols
), and map
, reduce
to do the same things.
The call to as.tbl(data.frame(*))
is not needed if you can just
call tibble
. Single function calls make reading the code easier,
and have a potential to make your code run faster as well.
res_all_pairs <- dplyr::as.tbl(data.frame(entity_A=pairs_to_test[,1],
entity_B=pairs_to_test[,2],
num_patients=all_pairs_num_pat,
pval=all_pairs_pval,
mle_delta=these_deltas,
test_statistic=these_test_stats))
vs
res_all_pairs <- dplyr::tibble(entity_A=pairs_to_test[,1],
entity_B=pairs_to_test[,2],
num_patients=all_pairs_num_pat,
pval=all_pairs_pval,
mle_delta=these_deltas,
test_statistic=these_test_stats)
Just as.numeric
will do the same thing. You can avoid the multiple
casting calls in this line (Do this everywhere where you have calls
like this, make sure to check if all these calls are needed).
all_pvals <-
as.numeric(as.vector(as.data.frame(final_res_pairs %>%
dplyr::select(pval))[,1]))
vs
all_pvals <- as.numeric(final_res_pairs %>% dplyr::select(pval))
data_handling_dunctions.R
Creating duplicate variable names which are confusing and not needed
fn <- path_to_file
You can just use path_to_file
.
You may also consider importing some dplyr functions (try
@importFrom
in roxygen2) like select
, mutate
etc which you are
using commonly instead of calling dplyr::select
etc each
time. Makes easier code reading.
plot_functions.R
You don't need call a ggplot2 function like shown below. This is incredibly unhelpful when reviewing code.
this_plot <- ggplot2::ggplot(rates_and_clones_tbl,
ggplot2::aes(x=pat_ids_r,
y=rates_m, fill=num_c)) +
ggplot2::geom_bar(stat="identity") +
ggplot2::ggtitle("Mean rates of clonal exclusivity in each patient")+
ggplot2::ylab("Rate m") +
ggplot2::xlab("Patients") +
ggplot2::scale_fill_gradient(low="lightblue", high="darkblue",
guide="colourbar") +
ggplot2::guides(fill=ggplot2::guide_colourbar(title=paste0("Average",
"number of clones"))) +
ggplot2::coord_flip() +
ggplot2::theme_gray()
vs
this_plot <- ggplot(rates_and_clones_tbl,
aes(x=pat_ids_r,
y=rates_m, fill=num_c)) +
geom_bar(stat="identity") +
ggtitle("Mean rates of clonal exclusivity in each patient")+
ylab("Rate m") +
xlab("Patients") +
scale_fill_gradient(low="lightblue", high="darkblue",
guide="colourbar") +
guides(fill=guide_colourbar(title=paste0("Average",
"number of clones"))) +
coord_flip() +
theme_gray()
Import what is needed in the namespace and use the functions. This is a recurring theme throught the codebase and can be easily fixed by using roxygen2 correctly. If needed please go over the roxygen2 documentation.
Hi @arhofman
I'm giving you a 2 week notice to improve your package or reply with some information regarding your progress. We can always close the issue and reopen when needed and you are ready.
Best,
Nitesh
Hi @nturaga
Thanks for your comments and your notice. I was busy with another project where we had a deadline coming up, but will now have time within the next two weeks to work on your suggestions. I hope this is fine,
Bests, Ariane
That's great. It's good to know you'll be working on this again.
The notice just serves as a reminder for a progress update or activity of some sort. Sometimes developers have other things on their plate, so we don't keep the issue open more than a certain number of days if there has been no progress.
Received a valid push; starting a build. Commits are:
4b9c7a7 [FEATURE] Improved code to use more tidyverse and ...
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
Hi @nturaga
Thanks for your comments to improve my package! I addressed all of them. Specifically I implemented the following changes:
Best regards, @arhofman
Thanks for making the changes. It looks good.
Could you please issue a version bump on this so that it builds on all three platforms? This might have been an issue on our end (maybe @lshep has an idea?)
Thanks,
Nitesh
We were running updates which is why the malbec1 is not displayed. I'll kick off a manual build now to have all three platforms in the report. Sorry about that.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
Hi @arhofman,
Your package seems fine after the new build report. You might want to consider using MultiAssayExperiment
(https://bioconductor.org/packages/release/bioc/html/MultiAssayExperiment.html) as an input object to your functions. This will help the users who are using Bioconductor objects to directly use your package instead of passing in a tibble
. You may have to change the structure a little bit.
I will accept the package since it looks good otherwise. But, you should really consider taking in a Bioconductor(MultiAssayExperiment
) object as well as a tibble
if your users need it.
Best,
Nitesh
Your package has been accepted. It will be added to the Bioconductor Git repository and nightly builds. Additional information will be posed to this issue in the next several days.
Thank you for contributing to Bioconductor!
Hi @arhofman,
I second Nitesh @nturaga and suggest that you use existing Bioconductor infrastructure
such as MultiAssayExperiment
for your package.
The MultiAssayExperiment
class uses an ExperimentList
, which is a
List
-derived structure that allows you to include several assays from different
flat rectangular data types.
It also allows you to subset to specific set of patients via the colData
structure.
Refer to the MultiAssayExperiment vignette for more details.
PS. Please update the Installation
section in the package vignette.
Best regards,
Marcel
Hi @nturaga and @LiNk-NY,
Thank you for your comments. That sounds very good. I am busy now finishing another deadline, but will address your suggestions in 2-3 weeks.
Bests, Ariane
The master branch of your GitHub repository has been added to Bioconductor's git repository.
To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/arhofman.keys is not empty), then no further steps are required. Otherwise, do the following:
See further instructions at
https://bioconductor.org/developers/how-to/git/
for working with this repository. See especially
https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/
to keep your GitHub and Bioconductor repositories in sync.
Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at
https://bioconductor.org/checkResults/
(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using biocLite(\"GeneAccord\")
. The package 'landing page' will be created at
https://bioconductor.org/packages/GeneAccord
If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
Confirm the following by editing each check box to '[x]'
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
I am familiar with the essential aspects of Bioconductor software management, including:
For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.