Closed jaspershen closed 7 years ago
Hi @jaspershen
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: MetCleaning
Type: Package
Title: MetCleaning
Version: 1.1.0
Date: 2016-12-13
Author: Xiaotao Shen, Zhengjiang Zhu
Maintainer: Xiaotao Shen<shenxt@sioc.ac.cn>
Description: Automatic and integrated tool for data cleaning and statistical analysis of large scale MS-based metabolomics data.
LazyData: true
Depends: R (>= 2.10)
Imports:
ggplot2,
pcaMethods,
impute,
missForest,
scatterplot3d,
pheatmap,
beeswarm,
ellipse,
pls,
plsdepot,
pROC,
e1071,
snow,
methods,
pathifier,
xlsx,
graphics,
grDevices,
utils
License: GPL-2 | file LICENSE
URL: https://github.com/jaspershen/MetCleaning
RoxygenNote: 5.0.1
Suggests: knitr,
rmarkdown
VignetteBuilder: knitr
Note:
Your package has been approved for building. Your package is now submitted to our queue.
IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/MetCleaning_buildreport_20161228154513.html
Hi LiNk-NY, thank you for reviewing my package. I have revised the most of the errors and warnings in the package. However, in my local computer (win10), I can't run BiocCheck. So I don't know whether the errors and warnings have been removed. Thanks.
The diagram in your vignette shows data potentially coming from xcms, but through a CSV file. Since xcms is a Bioconductor package, why not accept xcms data structures, and perhaps data structures from other packages, as there is a mature ecosystem?
Hi lawremi, metabolomics data analysis can be divided into several steps, first step is data processing (XCMS or mzMind), which transform MS raw data into dataset (matrix) which can be used for sequence statistical analysis. However, before statistical analysis, the data need to accept data pre-processing (or data preparation, data cleaning) to remove the system variations. So the package MetCleaning is used to do the data pre-processing. I don't integrate XCMS into MetCleaning because there are a lot of parameters should be optimized in the XCMS based on experiment. Thanks
All I am suggesting is that the user can pass the output of XCMS to your functions directly, without going through any external files. I wasn't suggesting that your package wrap XCMS.
Hi, lawremi. I am very sorry to wrongly understand your suggestion. Another external file named "sample.information" is used to offer sample information such as injection order, batch and group information, which are necessary for QC based normalization, data integration and statistical analysis. So this file should be offered because those information can't be directly offered by XCMS. Thank you.
Why can't you use the phenoData
component of xcmsSet for that?
Hi, lawremi. You know that some users don't use XCMS to do processing of MS raw data, they may use mzMine, MS-DIAL or other commercial software (for example, Agilent's Profinder). In MetCleaning, we only need a metabolomics dataset which contains mz(mass to charge) and rt (retention time) information and we don't care which software they use. I just use the dataset from XCMS as a example, because in my lab we mainly use XCMS. However, MetCleaning can process dataset from any software. This is why I don't want to use so much functions in XCMS. Thanks
The goal would be to make getting data into your software as easy as possible. You could support xcms directly, and also support other types of input from other software.
Hi, lawremi, offer the sample.information.csv file is also easy for most of users.
Again, I didn't suggest to remove any functionality. Just to support a bridge from xcms, which provides standardized infrastructure for this type of data.
Hi, lawremi, I don't understand which type of data should be supported in MetCleaning? mzXML? Now, MetCleaning only need a metabolomics dataset which contains mz and rt information from xcms, mzMine or any other software, which is easy enough for most of the users who want to do data preparation and statistical analysis. You can see the help document in the MetCleaning. Thanks
I was proposing to support the in-memory xcmsSet data structure as input.
Thank you for your suggestion. In the next version, I will consider to support xcmsSet data structure as input. In this version, because the input mode is easy enough for most of users (my teammates and friends have used it for about three months), so I don't want to change its input mode. Do you think is that OK? Thanks
No, I don't, because then you'll probably never change it, since users will just go along with it, busy with their work and not thinking about how things could be easier. Sorry, that might sound cynical but it's motivated by idealism.
OK, I think it will take me a lot of time to learn to how to use xcms.......In fact, I don't think it is easier by using the xcmsSet. Thanks again.
Hi, how can I close this issue and submit it again? Thanks
Hello, I have change the package. So can you review it again? Thank you very much.
Hi @jaspershen Can you bump the package version ?
Hi, thank your for your reply. And the version of package has been changed to 0.99.1. Thank you very much
Hi. Can you review my new version package? I have changed it to version 0.99.1. Thank you very much! Xiaotao
Hi Xiaotao @jaspershen, I don't see any SPB build history for a recent version of the package. It may be possible that you have to create a webhook again for the build system to pick it up. Regards, Marcel
Hi Xiaotao @jaspershen, Please re-establish the webhook for your package. See this webhook link for more details.
Regards, Marcel
Hi Marcel, @LiNk-NY Thank you very much for your reply and guide. I have re-established the webhook for my package-MetCleaning according to the user guide. Best regards Xiaotao
Received a valid push; starting a build. Commits are:
8dd4ec7 change description
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/MetCleaning_buildreport_20170310205814.html
Hi @jaspershen,
There are a few issues with the check
on your package:
Undefined global functions or variables:
qc qcorder sampleorder
Regards, Marcel
Hi @LiNk-NY Thank you for your reversion. I am fixing those bugs now. It may take me several days, because I don't understand why some issues appear. Fot the errores which apper, I have some questions (1) The vignette is built using devtools package, so I don't know why this is an error? (2) The package is more than 15M because the demo data is very large, because this package is used to process large-scale metabolomics data. So I don't know shouled I remove this demo data? (3) I am writing examples for documentation to make sure that more than 80% functions have examples. (4) I have registered in the bioc-devel and added my e-mail address in the mailing list. I will fix all those error this week and re-submit it again. Thanks again for you reversion and guide. Best regards, XIaotao
Hi Xiaotao @jaspershen,
I hope to see the changes soon. To answer your questions:
Please keep in mind that you should have runnable code in your vignette. Please don't create a vignette with all code chunks set to eval = FALSE
. Set them to eval = TRUE
for most chunks unless you have a good reason for why they should not be evaluated. Thanks.
Regards, Marcel
HI @LiNk-NY Thank you very much. All of the error will be fixed and I will be resubmit it today. Thanks again. Regards Xiaotao
Received a valid push; starting a build. Commits are:
03a8a5b change for 0.99.3
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ABNORMAL". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/MetCleaning_buildreport_20170320230831.html
Hi @LiNk-NY I have fixed all the errors and warnings which I got from BiocCheck in local. (1) More than 80% of exported functions have runable examples. (2) The demo data is less than 1 M now. (3) All code chunks have been set to eval = TRUE. (4) My email address-shenxt@sioc.ac.cn has been added in the bioc-devel mailing list. I have changed the version to 0.99.3 and pushed it to github. Thank you very much. Best regards Xiaotao
Hi Jasper @jaspershen,
Thank you for submitting MetCleaning
to Biconductor.
Please see the review below.
MetCleaning
provides several tools for working with metabolomics data
including pre-processing steps and analysis.
Your package doesn't seem to work well with Bioconductor infrastructure already in place. There are at least a couple of packages that work with the same data type as mentioned by @lawremi. Please strongly consider making it easier for your package to work well with the existing Bioconductor software.
It seems like you don't use any of the S4 classes and methods see here() for your package. You could make use of this system in order to create a more robust class with a formal definition and validity checks.
Without a formal class definition, it is harder to extend your class.
1:n
style iterations more info()ordered.qc
instead of
qc.has.order
."."
) rather than NULL to avoid
doing path <- getwd()
donttest
flags in your examples. Your examples should be runnableMetFlowData
should be a formal S4 class with a constructor function and
accessors - your operations rely to much on data at a specific index rather
than an imposed structure that the S4 system providesExportData
function: hasQC
should just save qc.info
)MetFlowData
class not availablehtml_document
!!!
)Regards, Marcel
Update: I will figure out why your package was tagged as ABNORMAL
.
Received a valid push; starting a build. Commits are:
b1f5109 default path set as "." 5431b17 Avoid 1:n style iterations. eb565bf qc.has.order. is changed to ordered.qc. 7528d78 Remove static images in the vignette. d3111d4 Remove excitement in error messages (!!!) 7ddf0bb Change code which go over 80 columns. 54a29c0 Change MetFlowData as S4 class. 05ecc7c re document b12da05 change met.data b8c7f2a fix bugs cbe50ed fix bugs db507b0 fix bugs 3b0f6ce fix bugs
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/MetCleaning_buildreport_20170322060601.html
Dear Marcel, Thank your very much for your review of my package. According to your advices, I have re-write the package today. The detailed information is listed below: Major Functionality & Integration (1) My package is used to do pre-processing of larege-scale metabolomics data. (2) I have changed MetFlowData to S4 class. (3) The MetFlowData has been changed to S4 class. Grammar (1) The 1:n style iteration have been rmoved. (2,3) MetCleaning do several processing and output results for each process, so the folders shouled be created to save results. It make the results more clear. (4) According to your suggestion, I have changed the argument names which are not clear. (5) All the path <- NULL have been changed according to your suggestion. Thanks. (6) I have re-written some uncaler documents of functions. (7) When running MetCleaning and MetStat, some processings should be type the selection, for example, when doing PLS-DA, the best component number should be selected by users. So, when I remove the "donttest" from the examples of MetCleaning and MetStat, there is a error when using Build Check. I find a lot of solutions for it in goodle, but I didn't find the solution methods. I am sorry about it. But the examples of MetCleaning and MetStat can be runable when I paste them manually. Structure (1) The MetFlowData has been changed to S4 class. (2) This has been changed. (3) MetFlowData has been changed to S4 calss, so I think this may be solved. Size (1) All the R scripts are in the "R" folder. and I am sure the packge was built following the gudelines. Vignettes (1) The most important functions in MetCleaning are MetCleaning and MetStat, the runable examples of them have been provided in the vignette. Other exported functions such as SXTpate are not useful for users, so their examples are not provided. (2) The workflow and data organization are significant for user to understant this package and usage. So they are remained, other 6 figures are removed from the vignette. (3) Only MetCleaning and MetStat are useful for users who want to do data processing and statistical analysis. So one vignette is enough to provide the usage of them. (4) I am sorry about it , but I don't understand how to pick htm_document of vignette? I will be grateful if I can get help. Thanks. Unit Tests (1) The unit tests has been added into the package.
Minor Grammar (1) The excitement in error messages (!!!) have been removed. Formatting (1) Most of code which go over 80 column have been changed. (2) I checked the longest functions and removed the space line from it. And all the functions only do one project.
The version has been changed to 0.99.4 and the package has been pushed to github. Thanks again for you review and advicements! Regards Xiaotao
Hi Xiaotao, @jaspershen
A few of the points I mentioned in the review were not addressed.
The major point about integration with Bioconductor is still lacking and there seems to be no effort on your part to bridge your software with Bioconductor. You may wish to not make any changes in that front but keep in mind you are submitting your software to Bioconductor so it is expected in every submission to inter-operate with, enhance, or use existing Bioconductor software. This is a major factor in the review process.
It's good that you're using S4 but you need to make sure that your MetFlowData
class has a constructor function that does some checks before creating the class. You may also want to include some validity checks to make sure that data inside the class are in order needed.
You would have to create accessor method functions so to avoid the use of @
throughout your functions. Usually these functions have the same name as the slot in the object.
Reminder: Avoid the use of .
in function names.
Refer to: http://bioconductor.org/developers/package-guidelines/#classes
Some points were not entirely addressed:
1:n
style iterations
path = NULL
arguments
Documentation for some arguments is not descriptive (e.g., @param MetFlowData
)
write.csv
is used extensively, please don't create files unless the user asks to create files. Create some argument for ensuring this.
All your exported functions should be testable. You can select defaults for the examples and still run them.
Don't export functions that users will not be using. Keep those un-exported and available only to you as the developer.
For the vignette: Select html_document
as the only value for output:
.
Please make the vignette:
value as
%\VignetteIndexEntry{Vignette Title}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
If find it difficult to see how a user would want to use a function with more than 20+ arguments as in MetFlowData
.
Regards, Marcel
Received a valid push; starting a build. Commits are:
1746cf0 fix bug in DataIntegration
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/MetCleaning_buildreport_20170503014509.html
Hi Jasper @jaspershen,
Do you have any updates on making your package compatible with the Bioconductor ecosystem? Otherwise, I'd be forced to close the issue.
Best regards, Marcel
Your issue has been closed due to inactivity. To submit your revised package to Bioconductor, please create another issue. Thank you. -Marcel
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
Confirm the following by editing each check box to '[x]'
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
I am familiar with the essential aspects of Bioconductor software management, including:
For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.