Open wkumler opened 1 month ago
Hi @wkumler
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: squallms
Type: Package
Title: Speedy quality assurance via lasso labeling for LC-MS data
Version: 0.99.0
Authors@R: c(
person(given = "William", family = "Kumler", email = "wkumler@uw.edu",
role = c("aut", "cre", "cph"), comment=c(ORCID="0000-0002-5022-8009"))
)
Description: squallms is a Bioconductor R package that implements a
"semi-labeled" approach to untargeted mass spectrometry data. It pulls in raw
data from mass-spec files to calculate several metrics that are then used to
label MS features in bulk as high or low quality. These metrics of peak quality
are then passed to a simple logistic model that produces a fully-labeled
dataset suitable for downstream analysis.
License: MIT + file LICENSE
URL: https://github.com/wkumler/squallms
BugReports: https://github.com/wkumler/squallms/issues
Encoding: UTF-8
RoxygenNote: 7.3.1
biocViews: MassSpectrometry, Metabolomics, Proteomics, Lipidomics, ShinyApps,
Classification, Clustering, FeatureExtraction, PrincipalComponent, Regression,
Preprocessing, QualityControl, Visualization
Depends:
R (>= 3.5.0)
Imports:
xcms,
MSnbase,
RaMS,
dplyr,
tidyr,
tibble,
ggplot2,
shiny,
plotly,
data.table,
caret,
stats,
graphics,
grDevices,
utils
Suggests:
knitr,
rmarkdown,
BiocStyle,
testthat (>= 3.0.0)
VignetteBuilder: knitr
Config/testthat/edition: 3
Please include an inst/scripts
that has a file that describes how the data in inst/extdata
was generated. It can be text, psuedo-code, or code but should contain any relevant source and licensing information.
Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.
IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.
Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: ERROR before build products produced.
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/squallms
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 9115a56d52d5371613242f22093329a3208a7f5e
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): squallms_0.99.1.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/squallms
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: f3c495c697f29acb0665a6e1ce8a73b11cec492e
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): squallms_0.99.2.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/squallms
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: d62e83477270529f4f08b48b98f696875cc544e4
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): squallms_0.99.3.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/squallms
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
A reviewer has been assigned to your package for an indepth review. Please respond accordingly to any further comments from the reviewer.
Hi @wkumler,
Just a heads up that I won't begin the review until after the forthcoming BioC 3.19 release (scheduled for May 1). This is because the package was submitted after the deadline for inclusion in BioC 3.19 (hence the 'Post Release Deadline' tag on this issue).
Thanks for your understanding, Pete
Hi Pete,
Yeah, I figured! Was hoping to sneak in under the code freeze deadline but totally understandable that you'd like to give it more review time. I appreciate the work you put in on Bioconductor!
-Will
Hi @wkumler,
Thank you for submitting squallms to Bioconductor and thanks for your patience awaiting this review.
Overall, the package is in good shape and close to being ready for acceptance. My main suggestions are around that I think the vignette could be improved to make it friendlier to new users. For context, I say this as someone who does not do any analysis of mass spectrometry data but who is familiar with the general concepts.
In my review below I have separated the issues into Required and Recommended points that I would ask you to address before the package can be accepted. Would you please provide line-by-line comments to my initial review so that I know what changes I'm looking for in my re-review.
Cheers, Pete
labelFeatsManual()
did not work on my system (session info below). What computers or operating systems have you tested this on?# Using the example from `?labelFeatsManual
> library(squallms)
> library(xcms)
> library(dplyr)
> library(MSnbase)
> mzML_files <- system.file("extdata", package = "RaMS") %>%
+ list.files(full.names = TRUE, pattern = "[A-F].mzML")
> register(BPPARAM = SerialParam())
> cwp <- CentWaveParam(snthresh = 0, extendLengthMSW = TRUE, integrate = 2)
> obp <- ObiwarpParam(binSize = 0.1, response = 1, distFun = "cor_opt")
> pdp <- PeakDensityParam(
+ sampleGroups = 1:3, bw = 12, minFraction = 0,
+ binSize = 0.001, minSamples = 0
+ )
> xcms_filled <- mzML_files %>%
+ readMSData(msLevel. = 1, mode = "onDisk") %>%
+ findChromPeaks(cwp) %>%
+ adjustRtime(obp) %>%
+ groupChromPeaks(pdp) %>%
+ fillChromPeaks(FillChromPeaksParam(ppm = 5))
Detecting mass traces at 25 ppm ... OK
Detecting chromatographic peaks in 198 regions of interest ... OK: 147 found.
Detecting mass traces at 25 ppm ... OK
Detecting chromatographic peaks in 232 regions of interest ... OK: 166 found.
Detecting mass traces at 25 ppm ... OK
Detecting chromatographic peaks in 218 regions of interest ... OK: 176 found.
Sample number 2 used as center sample.
Aligning LB12HL_AB.mzML.gz against LB12HL_CD.mzML.gz ... OK
Aligning LB12HL_EF.mzML.gz against LB12HL_CD.mzML.gz ... OK
Applying retention time adjustment to the identified chromatographic peaks ... OK
[===============================================================================================================================================================] 100/100 (100%) in 0s
Defining peak areas for filling-in .... OK
Start integrating peak areas from original files
Requesting 57 peaks from LB12HL_AB.mzML.gz ... got 52.
Requesting 46 peaks from LB12HL_CD.mzML.gz ... got 45.
Requesting 41 peaks from LB12HL_EF.mzML.gz ... got 40.
> peak_data <- makeXcmsObjFlat(xcms_filled)
> if(interactive()){
+ manual_labels <- labelFeatsManual(peak_data)
+ }
Grabbing raw MS1 data
|===============================================================================================================================================================================| 100%
Total time: 0.32 secs
Error in setGraphicsEventEnv(which, as.environment(list(...))) :
this graphics device does not support event handling
[ ] When using labelFeatsLasso()
it's not clear to me what's being plotted in the bottom panels because there are no axis labels.
[ ] In the 'Lasso labeling with labelFeatsLasso' section of the vignette:
The vignette makes use of pre-saved results (in inst/extdata
) but the user is not shown how or when these are loaded in the vignette. Consequently, it's impossible for the user to replicate the vignette output by following the vignette. I understand the necessity of using saved results for some of the interactive selections, but you need to pull back the curtain to explain why this is being done and how. Otherwise the chunks following these selection are not reproducible unless the user has the exact same lasso_classes
and manual_classes
variables as you created.
[ ] In the 'Building a quality model and removing low-quality peaks' section of vignette.
if (interactive())
saving of plot in the vignette?[ ] The verbosity
argument of extractChromMetrics()
doesn't seem to work as documented (see below):
suppressPackageStartupMessages(library(squallms))
suppressMessages(example("extractChromMetrics", "squallms", echo = FALSE))
# Where are the plots that are supposed to appear with verbosity = 2?
feat_metrics <- extractChromMetrics(peak_data, recalc_betas = TRUE, verbosity = 2)
#> Grabbing raw MS1 data
#> | | | 0%
#> Reading file LB12HL_AB.mzML.gz... 0.07 secs
#> Reading MS1 data...0.04 secs
#> | |======================= | 33%
#> Reading file LB12HL_CD.mzML.gz... 0.07 secs
#> Reading MS1 data...0.04 secs
#> | |=============================================== | 67%
#> Reading file LB12HL_EF.mzML.gz... 0.13 secs
#> Reading MS1 data...0.04 secs
#> | |======================================================================| 100%
#> Binding files together into single data.table
#> Total time: 0.39 secs
#> Recalculating beta coefficients
?logModelFeatProb
should explain why there is an RDS file is loaded and what it contains. Does this example really need dplyr (could use of |>
instead of %>%
simplify it?)?R/labelFeatsFunctions.R
(see covr::report()
for other places lacking coverage).|>
instead of the magrittr pipe operator %>%
? set.seed()
in certain chunks. inst/CITATION
file (see https://contributions.bioconductor.org/citation.html)BiocCheck()::BiocCheck()
. These are only suggestions but are worth consideration.Hi @PeteHaitch, Thank you for the detailed review and thoughtful comments on the package! I have implemented essentially all of your recommendations but could use some advice on the last few.
Received a valid push on git.bioconductor.org; starting a build for commit id: acd422f235dbcc938ad7285d587197f6966ca19f
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): squallms_0.99.4.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/squallms
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Hi @wkumler,
Thank you for making the requested changes and considering the recommended ones. I think the package is much improved, particularly the vignette which is much easier for new users with the new intro, figure/table captions, better reproducibility, and better and more detailed text explanations and guides to interpretation. It's very nearly ready for acceptance.
First, some responses to your queries/comments.
- labelFeatsManual() did not work on my system (session info below). What computers or operating systems have you tested this on?
- This is a tough one. I tested on Windows and Linux but apparently Macs have problems with X11 graphics. I wasn't able to find any good workarounds for this in the time I spent on the project - it seems possible that calling X11() instead of dev.new() to start a graphics device with event handling might work if XQuartz is installed (not by default for Macs since 2016) but I'd rather not add a hardware dependency like that if possible. At this point, it seems like my options are making it clear that this function is limited to Windows/Linux systems or reformatting the thing into a Shiny app like labelFeatsLasso currently is. Is there an option preferred by Bioconductor? Perhaps noting that Macs will be unable to use this (and adding a warning) is an acceptable compromise for now with intent to restructure into a Shiny app when I've got more bandwidth?
I think documenting the potential workaround on macOS (see below, and trying to get the backspace and escape keybindings working on macOS) is an okay interim measure until/if you re-implement as a shiny app. I'll ask @Bioconductor/core to chime in whether platform-specific functionality is okay in this case. That said, I was able to partially get it to work on my macOS system with the following (please see the comments below for issues I still had):
library(squallms)
example("labelFeatsManual", echo = FALSE)
# Either of these options worked the same for me.
# options(device = function() X11(type = "Xlib"))
options(device = function() X11(type = "cairo"))
# Left, right, up keys work as expected.
# However, backspace ('delete' on macOS keyboard) and escape ('esc') gave 'Warning in labelFeatsManual(peak_data) : Unrecognized input, skipping'.
# It also sometimes crashed my R session when trying to stop labelling (e.g. by clicking stop in Rstudio or closing graphics window).
manual_labels <- labelFeatsManual(peak_data)
- Good job providing some unit tests. Have a think if it's possible to test code R/labelFeatsFunctions.R (see covr::report() for other places lacking coverage).
- Another toughie. I struggled to write unit tests for the interactive sections for obvious reasons, but they clearly would've caught the X11 issue described above so they might be worth implementing a little more carefully.
That's fair enough. Know that upon acceptance that your package will be built and checked on Windows (x86_64), macOS (x86_64 and arm64), and Linux (Ubuntu; x86_64 and arm64), so that might help you check cross-platform functionality on systems you don't have access to.
- Can you use the base R pipe operator |> instead of the magrittr pipe operator %>%?
- This is possible but a bit of a headache. Is there a reason to prefer the base R version when I'm already loading dplyr for other functions?
It's mostly personal preference (I like to use things already included in R before using things from packages), so it's fine to retain as is.
I did want to ask about the recommendation to bump the R version dependency from 3.5.0 to 4.4.0 - I realize that's necessary for Bioconductor but I'd rather folks be able to use my package (e.g. directly from Github) without requiring the latest version.
You can keep it as is.
But please remember that Bioconductor strongly recommend the primary installation instructions be BiocManager::install("squallms")
rather than installing from Github.
In particular, using BiocManager::install("squallms")
will ensure the user has/gets the appropriate versions of all dependencies whereas our experience is that users installing from GitHub can often lead to incompatible software dependencies from different versions of Bioconductor and elsewhere.
Second, there's a a small number of new minor things to fix.
#fig:plot metrics
in rendered vignette). In this case, chunk labels should not contain spaces, so you'll want to use something like plot-metrics
instead of plot metrics
(see https://yihui.org/knitr/options/#chunk-options).eval=FALSE
chunks, e.g., the manual_classes <- labelFeatsManual(feat_peak_info, ms1_data = msdata$MS1, selection = "Labeled", existing_labels = lasso_classes)
manual_classes
vector for subsequent steps, which might not be ideal because subsequent steps will give different results, so perhaps note that?knitr::include_graphics(file.path(here::here(), "vignettes", "intro_good_ss.png"), rel_path = FALSE
).Received a valid push on git.bioconductor.org; starting a build for commit id: e75460cfda53bdefff030cdfb9808cdd8877b9f9
Hi @PeteHaitch,
Thanks again for the thoughtful commentary and support during this submission process! I've now submitted a new version of the package in which I bite the bullet and pivot the labelFeatsManual function into a Shiny app to avoid having to deal with the platform-specific problems. I was able to test this app on both my own Windows laptop and a colleague's Mac so I'm optimistic about your experience with it now.
I agree about the best user experience likely coming from the BiocManager::install("squallms")
instead of a Github install and will change that part of the README once the package is accepted. However, I have left the hard dependency at 3.5.0 so that advanced users will be able to use older versions if necessary.
Both Required bullets have been resolved - nice catch with the #fig: problem, you were correct about it being due to a space in the chunk name. I've added code comments to both places where I call an interactive chunk with eval=FALSE
and expanded a bit of the text nearby to hopefully provide some additional context as well.
Both Recommended bullets have been resolved - I ended up removing the line about the XCMS processing history and the associated functionality until https://github.com/sneumann/xcms/issues/735 can be better resolved.
Thanks @wkumler. FYI I'm at a conference next week, but I hope to find time to quickly go through these changes and accept the package but apologies if I'm delayed.
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
Confirm the following by editing each check box to '[x]'
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of mass-spectrometry data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
[x] I am familiar with the Bioconductor code of conduct and agree to abide by it.
I am familiar with the essential aspects of Bioconductor software management, including:
For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.