Closed travis-m-blimkie closed 10 months ago
Hi @travis-m-blimkie
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: pathlinkR
Type: Package
Title: Analyze and interpret RNA-Seq results
Version: 0.99.360
Authors@R: c(
person("Travis", "Blimkie", , "travis.m.blimkie@gmail.com", role = c("cre"), comment = c(ORCID = "0000-0001-8778-8627")),
person("Andy", "An", , "andy@hancocklab.com", role = c("aut"))
)
Description: pathlinkR is an R package designed to facilitate analysis of
RNA-Seq results. Specifically, our aim with pathlinkR was to provide a
number of tools which take a list of DE genes and perform different analyses
on them, aiding with the interpretation of results. Functions are included
to perform pathway enrichment, with muliplte databases supported, and tools
for visualizing these results. Genes can also be used to create and plot
protein-protein interaction networks, all from inside of R.
biocViews: GeneSetEnrichment, Network, Pathways, Reactome, RNASeq,
NetworkEnrichment
BiocType: Software
BugReports: https://github.com/hancockinformatics/pathlinkR
License: GPL-3 + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
Imports:
circlize,
clusterProfiler,
ComplexHeatmap,
dplyr,
ggforce,
ggplot2,
ggpubr,
ggraph,
ggrepel,
grid,
igraph,
purrr,
sigora,
SteinerNet,
stringr,
tibble,
tidygraph,
tidyr,
vegan,
visNetwork
Depends:
R (>= 4.2.0)
LazyDataCompression: xz
LazyData: true
Suggests:
AnnotationDbi,
BiocStyle,
biomaRt,
covr,
jsonlite,
knitr,
org.Hs.eg.db,
rmarkdown,
scales,
testthat (>= 3.0.0),
vdiffr
VignetteBuilder: knitr
Config/testthat/edition: 3
Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.
IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.
Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Build System.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Bioconductor Build System: Linux (Ubuntu 22.04.2 LTS): pathlinkR_0.99.360.tar.gz macOS 12.6.5 Monterey: pathlinkR_0.99.360.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/pathlinkR
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Hello,
I have worked on the package and made the changes need to fix the error identified in the build reports above, pushing everything to the Github "main" branch throughout. Because the issue involved the package being too large, I followed the instructions provided here and on the BFG page to clean the git history and get the package size below 5MB. However, I was unable to push these changes to the Bioconductor git repository (via these instructions). I tried pulling/merging from "upstream", and fixing the resulting conflicts, but now it says I'm ahead of "origin/main" by 133 commits, and I'm worried a push to Github may mess up the fixes I've done.
What's the recommended solution here? Is it possible to have the Bioconductor version of pathlinkR be updated from the current Github version? Maybe a fresh clone of the repository? That way all the changes I've made on Github won't be overwritten.
I'm new to this process, so if there's something I'm missing please let me know.
Cheers!
Yes we don't allow force pushes to the git.bicoonductor.org repo but we can reset it on our side from the github. I will try to get to this later today.
Perfect, thank you so much!
Received a valid push on git.bioconductor.org; starting a build for commit id: c7d55be57f00eadbf0eeddc41fe01d16c86f94c3
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Build System.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Bioconductor Build System: Linux (Ubuntu 22.04.2 LTS): pathlinkR_0.99.366.tar.gz macOS 12.6.5 Monterey: pathlinkR_0.99.366.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/pathlinkR
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: a747e89ef266a22a2c52c94fa3cfd49883e8e251
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Build System.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
The following are build products from R CMD build on the Bioconductor Build System: Linux (Ubuntu 22.04.2 LTS): pathlinkR_0.99.370.tar.gz macOS 12.6.5 Monterey: pathlinkR_0.99.370.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/pathlinkR
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
A reviewer has been assigned to your package for an indepth review. Please respond accordingly to any further comments from the reviewer.
First round of comments to get things started... Please respond back with what has/not been addressed, or if anything's unclear- happy coding.
For future reference: when first submitted to Bioconductor, a package should have pre-release version 0.99.0. I don't think this will be possible to implement now, as decreasing the version will not trigger a new built... but please checkout the info provided here regarding correct versioning of Bioc packages.
major:
I wonder whether the package is interoperable with other existing Bioc infrastructure; specifically, the vignette is limited to DESeq
results- what about DESeq2
and edgeR
? In principle, the output of these frameworks is quite similar, and they should be supported. For example, eruption()
should expose arguments to specify logFC and (adjusted) p-value columns rather than hard-coding these and limiting the function to DESeq
only. Running the ?edgeR::topTags
example, I don't see what this type of output should not work just the same. I would urge you to make the package more flexible in order to work with other dominant frameworks by 1. exposing relevant parameters to the users, 2. determining the origin of the input programmatically (and adjusting function-behavior accordingly), and 3. failing when both is unsuccessful. (similarly, other functions could expose more arguments as to not enforce a very specific naming scheme of, e.g., pValueAdjusted
for pathways...)
Somewhat related to the above, Bioc's standard infrastructure for RNA-seq data is the S4 SummarizedExperiment
class (SE). Meanwhile, the entire package works with data.frame
s only and is written in tidyr
-style. Where applicable, I'd ask that users may also be able to provide a SE instead, in which case corresponding pieces of information (logFC etc.) are extracted from the rowData
. If the point above is taken care of, this should be simple to realize; i.e., determine what's needed (logFC, FDR, ...) > get it from the input (whatever that may be) > execute.
minor:
LazyData:
should be set to false
or removed (see here)URL:
field point to https://github.com/hancockinformatics/pathlinkRBiocManager::install()
?pathlinkR
), e.g., briefly summarizing the package's functionality, hyperlinking key function etc.data/
help-pages is incorrectly formatted, I believe; the long pieces of text should probably be preceded by a @description
tag (by default, the 1st line is taken as @title
)data/
are well described in terms of data structure, but source information is missing (see here for details)dontrun
flags in examples are generally not allowed, and if (FALSE)
in does just that in a sneaky way... (e.g., pathnetGGraph.R; overall I see it 4 times)BiocStyle
to hyperlink external packages, e.g., r BiocStyle::Biocpkg("DESeq2")
(there are analogous functions of packages hosted through CRAN and GitHub) ppiBuildNetwork()
and ppiPlotNetwork()
in lines 118+; this is sometimes, but not always done; consistency here would be nice tests/
, I am getting various warnings from devtools::test_coverage()
, which isn't great. Ideally, unit tests should run cleanly, and I'd highly recommend addressing these to the best of your abilities1:...
and use seq_len/along()
instead (ppiExtractSubnetwork.R line 168)Thanks for the feedback and suggestions! I'll get to work on making changes, and come back with updates/questions.
I have made (nearly) all the changes as outlined in the "minor" section, and will start working on the "major" changes r.e. input classes.
For the warnings from devtools::test()
, I've fixed what I can, but there are a few I don't think I can fix, outlined below. Open to suggestions for these:
devtools::test()
. Any subsequent tests don't produce this warning.geom_label_repel()
or geom_text_repel()
): Some of the plotting functions with point labels (e.g. eruption()
) use NAs in the label column to prevent the plot from becoming too crowded, or because we only want to label a user-selected subset of points.size
aesthetic in this geom was deprecated in ggplot2 3.4.0: This seems to be coming from the ggraph
dependency, and appears when running their example, found here.Updated to-do list for the rest:
BiocStyle::Biocpkg("DESeq2")
(there are analogous functions of packages hosted through CRAN and GitHub)Thanks for addressing (many) earlier comments - here a revised review of what's been addressed so far (last one being more of a comment/thinking out loud than critique):
[ ] in the DESCRIPTION, BugReports: https://github.com/hancockinformatics/pathlinkR
should be .../issues
, I think
[ ] for the time being, Bioc installation instructions should be BiocManager::install("pathlinkR", version="devel")
[ ] at the moment, the package help page is not found (what I'd call) intuitively, i.e., via ?pathlinkR
- I think it'd be worth adding @aliases pathlinkR
to the roxygen header in pathlinkR.R to make this so
[ ] the data-raw
scripts should be placed under extdata/inst/scripts
looking at this information on example data, I am not entirely sure where the line is drawn between raw data, and data which has been loaded/parsed; but I guess the data/
may well go into inst/extdata
(i.e., you are pulling these from databases not, say, making them up/simulating).
Not 100% familiar with the different data types involved here, but it looks to me like the workflow to generate some of these examples is quite complex... We recommend to find existing data provided in other packages or the hubs; would this be an option here? Perhaps something to consider.
Going back to your "major" comments r.e. input classes - I have tweaked the code so it can now accept "DESeqResults" (from DESeq2
), or "TopTags" (from edgeR
); in each case, column names for fold changes and p values are determined based on the class. Or, the input can be a simple "data.frame", with arguments to specify the requisite columns, allowing for greater flexibility of inputs.
I looked into the SummarizedExperiment
class as you mentioned (primarily their vignette), but it seems that its intended for RNA-Seq count data, whereas the other classes ("DESeqResults" and "TopTags", which aren't derived from "SE") are tables of DE genes, which are "downstream" from said count data. From what I could find, the "SE" class is used only for earlier stages in the RNA-Seq analysis process than what we're targeting with pathlinkR
. If I've missed something here please let me know, and if there are examples of DE gene tables stored as "SE" objects that'd be especially helpful.
Other minor points I've addressed:
Agreed, if the whole package operates on downstream results only - supporting both DESeq(2)
and edgeR
, as well as arbitrary data.frame
s with corresponding results (as specified by optional function arguments) - then all is well in terms of interoperability, I think. Perhaps I am missing something, but did you merge/push these changes? I still do not see the flexibility you mentioned implemented (e.g., error: all(c("padj", "log2FoldChange") %in% colnames(deseqResult))
).
I just pushed those updates now (along with a some other changes), if you want to take a look.
Regarding your comments on the data:
inst/script
as suggested, but if possible I'd like to keep the data objects in data
, if for no other reason than making them more readily available to the package, and also the end user.reactome.db
and msigdbr
), but with some changes or additions being made. So I'm not sure about finding a drop-in replacement for these...Following up on the previous comments:
inst/script
as requestedAll these changes have been pushed to the Github and Bioconductor repositories.
Apologies for the delay, but I didn't get notified about a new build - so just to make sure, did you push with a version bump? Looking at your local branch, I don't see the DESCRIPTION changing versions, hence no build report/notification here on GH - but not sure if the recent release of Bioc 3.18 could have cause any issues (I don't think it should).
*Perhaps just to note that I do see the changes you made (appreciated!), and it's looking great. But at this point need a clean build and check before finalizing...
I hadn't pushed a version bump yet, figured I'd wait until most of the changes were settled first. But I will do that today and make sure everything is passing.
Received a valid push on git.bioconductor.org; starting a build for commit id: 2912b778d7536e699a52b8b614ecfc42f175f8ef
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Build System.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
The following are build products from R CMD build on the Bioconductor Build System: macOS 12.6.5 Monterey: pathlinkR_0.99.380.tar.gz Linux (Ubuntu 22.04.2 LTS): pathlinkR_0.99.380.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/pathlinkR
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Thanks for all your efforts in addressing review comments! Appreciate especially the changes made towards more flexible handling of inputs towards interoperability!
Your package has been accepted. It will be added to the Bioconductor nightly builds.
Thank you for contributing to Bioconductor!
Reviewers for Bioconductor packages are volunteers from the Bioconductor community. If you are interested in becoming a Bioconductor package reviewer, please see Reviewers Expectations.
Of course, thanks for all your help and patience :) If there's anything else just let me know.
The default branch of your GitHub repository has been added to Bioconductor's git repository as branch devel.
To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/travis-m-blimkie.keys is not empty), then no further steps are required. Otherwise, do the following:
See further instructions at
https://bioconductor.org/developers/how-to/git/
for working with this repository. See especially
https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/
to keep your GitHub and Bioconductor repositories in sync.
Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at
https://bioconductor.org/checkResults/
(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("pathlinkR")
. The package 'landing page' will be created at
https://bioconductor.org/packages/pathlinkR
If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
Confirm the following by editing each check box to '[x]'
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
[x] I am familiar with the Bioconductor code of conduct and agree to abide by it.
I am familiar with the essential aspects of Bioconductor software management, including:
For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.