Bioconductor / Contributions

Contribute Packages to Bioconductor
135 stars 33 forks source link

(inactive) MetCleaning #227

Closed jaspershen closed 7 years ago

jaspershen commented 7 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 7 years ago

Hi @jaspershen

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: MetCleaning
Type: Package
Title: MetCleaning
Version: 1.1.0
Date: 2016-12-13
Author: Xiaotao Shen, Zhengjiang Zhu
Maintainer: Xiaotao Shen<shenxt@sioc.ac.cn>
Description: Automatic and integrated tool for data cleaning and statistical analysis of large scale MS-based metabolomics data.
LazyData: true
Depends: R (>= 2.10)
Imports:
    ggplot2,
    pcaMethods,
    impute,
    missForest,
    scatterplot3d,
    pheatmap,
    beeswarm,
    ellipse,
    pls,
    plsdepot,
    pROC,
    e1071,
    snow,
    methods,
    pathifier,
    xlsx,
    graphics,
    grDevices,
    utils
License: GPL-2 | file LICENSE
URL: https://github.com/jaspershen/MetCleaning
RoxygenNote: 5.0.1
Suggests: knitr,
    rmarkdown
VignetteBuilder: knitr
Note:
bioc-issue-bot commented 7 years ago

Your package has been approved for building. Your package is now submitted to our queue.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/MetCleaning_buildreport_20161228154513.html

jaspershen commented 7 years ago

Hi LiNk-NY, thank you for reviewing my package. I have revised the most of the errors and warnings in the package. However, in my local computer (win10), I can't run BiocCheck. So I don't know whether the errors and warnings have been removed. Thanks.

lawremi commented 7 years ago

The diagram in your vignette shows data potentially coming from xcms, but through a CSV file. Since xcms is a Bioconductor package, why not accept xcms data structures, and perhaps data structures from other packages, as there is a mature ecosystem?

jaspershen commented 7 years ago

Hi lawremi, metabolomics data analysis can be divided into several steps, first step is data processing (XCMS or mzMind), which transform MS raw data into dataset (matrix) which can be used for sequence statistical analysis. However, before statistical analysis, the data need to accept data pre-processing (or data preparation, data cleaning) to remove the system variations. So the package MetCleaning is used to do the data pre-processing. I don't integrate XCMS into MetCleaning because there are a lot of parameters should be optimized in the XCMS based on experiment. Thanks

lawremi commented 7 years ago

All I am suggesting is that the user can pass the output of XCMS to your functions directly, without going through any external files. I wasn't suggesting that your package wrap XCMS.

jaspershen commented 7 years ago

Hi, lawremi. I am very sorry to wrongly understand your suggestion. Another external file named "sample.information" is used to offer sample information such as injection order, batch and group information, which are necessary for QC based normalization, data integration and statistical analysis. So this file should be offered because those information can't be directly offered by XCMS. Thank you.

lawremi commented 7 years ago

Why can't you use the phenoData component of xcmsSet for that?

jaspershen commented 7 years ago

Hi, lawremi. You know that some users don't use XCMS to do processing of MS raw data, they may use mzMine, MS-DIAL or other commercial software (for example, Agilent's Profinder). In MetCleaning, we only need a metabolomics dataset which contains mz(mass to charge) and rt (retention time) information and we don't care which software they use. I just use the dataset from XCMS as a example, because in my lab we mainly use XCMS. However, MetCleaning can process dataset from any software. This is why I don't want to use so much functions in XCMS. Thanks

lawremi commented 7 years ago

The goal would be to make getting data into your software as easy as possible. You could support xcms directly, and also support other types of input from other software.

jaspershen commented 7 years ago

Hi, lawremi, offer the sample.information.csv file is also easy for most of users.

lawremi commented 7 years ago

Again, I didn't suggest to remove any functionality. Just to support a bridge from xcms, which provides standardized infrastructure for this type of data.

jaspershen commented 7 years ago

Hi, lawremi, I don't understand which type of data should be supported in MetCleaning? mzXML? Now, MetCleaning only need a metabolomics dataset which contains mz and rt information from xcms, mzMine or any other software, which is easy enough for most of the users who want to do data preparation and statistical analysis. You can see the help document in the MetCleaning. Thanks

lawremi commented 7 years ago

I was proposing to support the in-memory xcmsSet data structure as input.

jaspershen commented 7 years ago

Thank you for your suggestion. In the next version, I will consider to support xcmsSet data structure as input. In this version, because the input mode is easy enough for most of users (my teammates and friends have used it for about three months), so I don't want to change its input mode. Do you think is that OK? Thanks

lawremi commented 7 years ago

No, I don't, because then you'll probably never change it, since users will just go along with it, busy with their work and not thinking about how things could be easier. Sorry, that might sound cynical but it's motivated by idealism.

jaspershen commented 7 years ago

OK, I think it will take me a lot of time to learn to how to use xcms.......In fact, I don't think it is easier by using the xcmsSet. Thanks again.

jaspershen commented 7 years ago

Hi, how can I close this issue and submit it again? Thanks

jaspershen commented 7 years ago

Hello, I have change the package. So can you review it again? Thank you very much.

LiNk-NY commented 7 years ago

Hi @jaspershen Can you bump the package version ?

jaspershen commented 7 years ago

Hi, thank your for your reply. And the version of package has been changed to 0.99.1. Thank you very much

jaspershen commented 7 years ago

Hi. Can you review my new version package? I have changed it to version 0.99.1. Thank you very much! Xiaotao

LiNk-NY commented 7 years ago

Hi Xiaotao @jaspershen, I don't see any SPB build history for a recent version of the package. It may be possible that you have to create a webhook again for the build system to pick it up. Regards, Marcel

LiNk-NY commented 7 years ago

Hi Xiaotao @jaspershen, Please re-establish the webhook for your package. See this webhook link for more details.

Regards, Marcel

jaspershen commented 7 years ago

Hi Marcel, @LiNk-NY Thank you very much for your reply and guide. I have re-established the webhook for my package-MetCleaning according to the user guide. Best regards Xiaotao

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

8dd4ec7 change description

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/MetCleaning_buildreport_20170310205814.html

LiNk-NY commented 7 years ago

Hi @jaspershen, There are a few issues with the check on your package:

Regards, Marcel

jaspershen commented 7 years ago

Hi @LiNk-NY Thank you for your reversion. I am fixing those bugs now. It may take me several days, because I don't understand why some issues appear. Fot the errores which apper, I have some questions (1) The vignette is built using devtools package, so I don't know why this is an error? (2) The package is more than 15M because the demo data is very large, because this package is used to process large-scale metabolomics data. So I don't know shouled I remove this demo data? (3) I am writing examples for documentation to make sure that more than 80% functions have examples. (4) I have registered in the bioc-devel and added my e-mail address in the mailing list. I will fix all those error this week and re-submit it again. Thanks again for you reversion and guide. Best regards, XIaotao

LiNk-NY commented 7 years ago

Hi Xiaotao @jaspershen,

I hope to see the changes soon. To answer your questions:

  1. I checked your package locally and I didn't get that error. Perhaps it's a build system hiccup. We will see once you make changes to your package and bump the version. Ignore that error for now.
  2. You can run your examples on a smaller dataset. Your data does not have to be so big for the examples.
  3. Good.
  4. Good.

Please keep in mind that you should have runnable code in your vignette. Please don't create a vignette with all code chunks set to eval = FALSE. Set them to eval = TRUE for most chunks unless you have a good reason for why they should not be evaluated. Thanks.

Regards, Marcel

jaspershen commented 7 years ago

HI @LiNk-NY Thank you very much. All of the error will be fixed and I will be resubmit it today. Thanks again. Regards Xiaotao

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

03a8a5b change for 0.99.3

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ABNORMAL". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/MetCleaning_buildreport_20170320230831.html

jaspershen commented 7 years ago

Hi @LiNk-NY I have fixed all the errors and warnings which I got from BiocCheck in local. (1) More than 80% of exported functions have runable examples. (2) The demo data is less than 1 M now. (3) All code chunks have been set to eval = TRUE. (4) My email address-shenxt@sioc.ac.cn has been added in the bioc-devel mailing list. I have changed the version to 0.99.3 and pushed it to github. Thank you very much. Best regards Xiaotao

LiNk-NY commented 7 years ago

Hi Jasper @jaspershen, Thank you for submitting MetCleaning to Biconductor. Please see the review below.


MetCleaning #227

MetCleaning provides several tools for working with metabolomics data including pre-processing steps and analysis.

Major

Functionality & Integration

Grammar

Structure

Size

Vignettes

Unit Tests

Minor

Grammar

Formatting

Regards, Marcel

Update: I will figure out why your package was tagged as ABNORMAL.

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

b1f5109 default path set as "." 5431b17 Avoid 1:n style iterations. eb565bf qc.has.order. is changed to ordered.qc. 7528d78 Remove static images in the vignette. d3111d4 Remove excitement in error messages (!!!) 7ddf0bb Change code which go over 80 columns. 54a29c0 Change MetFlowData as S4 class. 05ecc7c re document b12da05 change met.data b8c7f2a fix bugs cbe50ed fix bugs db507b0 fix bugs 3b0f6ce fix bugs

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/MetCleaning_buildreport_20170322060601.html

jaspershen commented 7 years ago

Dear Marcel, Thank your very much for your review of my package. According to your advices, I have re-write the package today. The detailed information is listed below: Major Functionality & Integration (1) My package is used to do pre-processing of larege-scale metabolomics data. (2) I have changed MetFlowData to S4 class. (3) The MetFlowData has been changed to S4 class. Grammar (1) The 1:n style iteration have been rmoved. (2,3) MetCleaning do several processing and output results for each process, so the folders shouled be created to save results. It make the results more clear. (4) According to your suggestion, I have changed the argument names which are not clear. (5) All the path <- NULL have been changed according to your suggestion. Thanks. (6) I have re-written some uncaler documents of functions. (7) When running MetCleaning and MetStat, some processings should be type the selection, for example, when doing PLS-DA, the best component number should be selected by users. So, when I remove the "donttest" from the examples of MetCleaning and MetStat, there is a error when using Build Check. I find a lot of solutions for it in goodle, but I didn't find the solution methods. I am sorry about it. But the examples of MetCleaning and MetStat can be runable when I paste them manually. Structure (1) The MetFlowData has been changed to S4 class. (2) This has been changed. (3) MetFlowData has been changed to S4 calss, so I think this may be solved. Size (1) All the R scripts are in the "R" folder. and I am sure the packge was built following the gudelines. Vignettes (1) The most important functions in MetCleaning are MetCleaning and MetStat, the runable examples of them have been provided in the vignette. Other exported functions such as SXTpate are not useful for users, so their examples are not provided. (2) The workflow and data organization are significant for user to understant this package and usage. So they are remained, other 6 figures are removed from the vignette. (3) Only MetCleaning and MetStat are useful for users who want to do data processing and statistical analysis. So one vignette is enough to provide the usage of them. (4) I am sorry about it , but I don't understand how to pick htm_document of vignette? I will be grateful if I can get help. Thanks. Unit Tests (1) The unit tests has been added into the package.

Minor Grammar (1) The excitement in error messages (!!!) have been removed. Formatting (1) Most of code which go over 80 column have been changed. (2) I checked the longest functions and removed the space line from it. And all the functions only do one project.

The version has been changed to 0.99.4 and the package has been pushed to github. Thanks again for you review and advicements! Regards Xiaotao

LiNk-NY commented 7 years ago

Hi Xiaotao, @jaspershen

A few of the points I mentioned in the review were not addressed.

The major point about integration with Bioconductor is still lacking and there seems to be no effort on your part to bridge your software with Bioconductor. You may wish to not make any changes in that front but keep in mind you are submitting your software to Bioconductor so it is expected in every submission to inter-operate with, enhance, or use existing Bioconductor software. This is a major factor in the review process.

It's good that you're using S4 but you need to make sure that your MetFlowData class has a constructor function that does some checks before creating the class. You may also want to include some validity checks to make sure that data inside the class are in order needed. You would have to create accessor method functions so to avoid the use of @ throughout your functions. Usually these functions have the same name as the slot in the object.

Reminder: Avoid the use of . in function names.

Refer to: http://bioconductor.org/developers/package-guidelines/#classes

Some points were not entirely addressed:

For the vignette: Select html_document as the only value for output:. Please make the vignette: value as

  %\VignetteIndexEntry{Vignette Title}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}

If find it difficult to see how a user would want to use a function with more than 20+ arguments as in MetFlowData.

Regards, Marcel

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

1746cf0 fix bug in DataIntegration

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/MetCleaning_buildreport_20170503014509.html

LiNk-NY commented 7 years ago

Hi Jasper @jaspershen,

Do you have any updates on making your package compatible with the Bioconductor ecosystem? Otherwise, I'd be forced to close the issue.

Best regards, Marcel

LiNk-NY commented 7 years ago

Your issue has been closed due to inactivity. To submit your revised package to Bioconductor, please create another issue. Thank you. -Marcel