Bioconductor / Contributions

Contribute Packages to Bioconductor
135 stars 33 forks source link

yamss (metabolomics) #150

Closed lmyint closed 8 years ago

lmyint commented 8 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 8 years ago

Hi @lmyint

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: yamss
Version: 0.99.0
Title: Tools for high-throughput metabolomics
Description: Tools to analyze and visualize high-throughput metabolomics data
    aquired using chromatography-mass spectrometry. These tools preprocess
    data in a way that enables reliable and powerful differential analysis.
Authors@R: c(
      person("Leslie", "Myint", role = c("cre", "aut"), email = "lmyint1@jhu.edu"),
      person(c("Kasper", "Daniel"), "Hansen", role = "aut"))
Depends:
    methods,
    BiocGenerics (>= 0.15.3),
    limma
Suggests:
    BiocStyle,
    knitr,
    rmarkdown,
    mtbls2
Imports:
    IRanges,
    stats,
    S4Vectors,
    EBImage,
    Matrix,
    mzR,
    data.table,
    grDevices
VignetteBuilder: knitr
License: Artistic-2.0
URL: https://github.com/hansenlab/yamss
BugReports: https://github.com/hansenlab/yamss/issues
biocViews: MassSpectrometry, Metabolomics, Software
lawremi commented 8 years ago

Looks like all of the functionality is wrapped up in the bakedpi() function (what does that name even mean?), while it would be nice to have each step of the pipeline exposed as its own function. That would mean defining a formal data structure for holding the raw data. The xcms package defines some that might be useful.

I noticed that getFileNames begins with get but the other accessors don't. Probably should drop the get.

lmyint commented 8 years ago

Hi Michael, We should have made clear in the man pages that bakedpi stands for bivariate approximate kernel density estimation for peak identification." We debated exporting the individual functions to avoid users having to learn too much about internal workings, but we'll revisit this. I think you're right about getFileNames. Thanks for the speedy look!

bioc-issue-bot commented 8 years ago

Your package has been approved for building. Your package is now submitted to our queue.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

bioc-issue-bot commented 8 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/yamss_buildreport_20160926172336.html

kasperdanielhansen commented 8 years ago

This post (partly) addresses @lawremi comments on re-using XCMS as well as exposing each step of the pipeline.

We have looked (again) at the internals of XCMS. This package (also) has a design where raw data is mixed up with processed data, so it is not so easy to reuse. For example xcmsSet, which is the first class a user instantiates, does peak detection when it is created, which is a data processing step we don't use at all.

Also, XCMS processes each sample separately which allows them to loop over samples on disk, only accessing one sample at a time. This allows xcmsSet to simply have a pointer to a set of files on disk (because the raw data can be accessed on demand from the files), which is obviously great from a resource point of view. Our method processes all samples simultaneously, we do that by reading all the data into a data.table. This representation allows us to do fast two-dimensional subsetting in the M/Z and RT space, which XCMS doesn't need to do (not in the same way).

Conclusion: the only thing we share with XCMS is starting with a collection of filenames. So there is not really anything to re-use at the start of the process. We are obviously both depending on mzR.

But @lawremi point about bakedpi() being a monolithic entity which takes a set of files and returns a heavily processed object, is a point well taken. Some of the steps inside bakedpi() is uninteresting to the user, but it would be good to chop the process up a little bit. We'll do that, probably by exposing the return object of readRawDataAsDataTable() and wrapping it in a class, still designing the details of this. Hopefully we'll have a redesign tomorrow.

lmyint commented 8 years ago

Note that the error on Windows is due to mzR being unavailable on Windows. We don't have any compiled code, so we're pretty sure that our code should work on all platforms.

bioc-issue-bot commented 8 years ago

Received a valid push; starting a build. Commits are:

a4e7b17 CMSraw and CMSproc class. readRaw() to create CMSr... 3698a26 Update backgroundCorrection and slot names f2f5174 Update rtAlignment 14b307b Update densityEstimation 3e0f1f6 Update remainder of functions 2e5fd91 some further work on raw/proc classes 6c247a4 Add class checks. Fix slot and object names 9f1b729 Fix doc and doc examples 89fab0b Fix exports 9bd43a4 Forgot comma 6ac2816 Need to export DataFrame 2916eaa DataFrame to data.frame c616862 Slots as data.table 195731f setOldClass to get around undefined slot problem 6051b9d massive work 2b208c4 Add .maxScan 344a418 Use .rawDT 9f38a6e Fix getTIC b094272 Need tic global var cbb1b75 Update doc e93c9b8 various small fixes f12833c Fix getTIC and doc c95b849 Add man page for readMSdata 9e4a7b2 updated vignette 9211797 Merge branch 'devel' of github.com:hansenlab/yamss... 6830277 Small edits 959bd06 small fixes 0e1f653 Merge branch 'devel' of github.com:hansenlab/yamss... f04c7d0 version bump

kasperdanielhansen commented 8 years ago

Substantial changes:

bioc-issue-bot commented 8 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/yamss_buildreport_20160927231611.html

bioc-issue-bot commented 8 years ago

Received a valid push; starting a build. Commits are:

656c2c0 Fix bibtex entry fd9481b Version bump

bioc-issue-bot commented 8 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/yamss_buildreport_20160927235221.html

bioc-issue-bot commented 8 years ago

Received a valid push; starting a build. Commits are:

a4aa3bb Explain abbreviations ff5bab7 Make new CMSslice class and update CMSproc d11491c Move functions out of bakedpi to slicepi. Change s... 7a2103a Use one method for EICS+quantify. Make CMSslice ob... 1267777 Remove updatePeaks 1b51920 Add densityQuantiles as metadata 8947ef0 Update accessors ce88361 Update diffrep and doc 4507077 Remove updatePeaks from doc. Clean up args and des... 297d4d7 Add colData for CMSslice class 5dd4a8e Add unassociated accessors. Update descrip and exa... 26f5974 Add doc for CMSslice 20b7186 Update vignette to incorporate slicepi() 020bd47 Fix object names and writing 5c5020c Update bakedpi man and add slicepi man c12c92e Change example 5296a29 Delete comma e702925 Fix exports 5dee56f Fix calling peakBounds() on CMSproc 39e0385 Fix show method for CMSproc. Add one for CMSslice 1ddb53a many small fixes 2d5f27a fixed conflict 02c3aac Define qs for getting cutoff 077e966 Fix show() for CMSslice de0cb60 Fix return of getEICsAndQuantify 7cd7c64 fix typo in slicepi 3682cda Fix show method for CMSslice - accessing peakBound... b27acbe more typos 274557d merged conflicts 6091938 Fix scan.min and scan.max and rowData subsetting 10e5da1 Merge branch 'devel' of github.com:hansenlab/yamss... 75c2871 Need to call densityCutoff on CMSslice e11001b Bump version

bioc-issue-bot commented 8 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/yamss_buildreport_20160929131356.html

kasperdanielhansen commented 8 years ago

In our latest update we have split the analytical pipeline into two: bakedpi() and slicepi(). So in total, based on @lawremi we have split our original monolitic pipeline into

This completes our revisions based on @lawremi comments. We are not planning on any more revisions, until review.

The package fails on Windows because a dependency - mzR - is not available.

bioc-issue-bot commented 8 years ago

Received a valid push; starting a build. Commits are:

cb9080c version bump to trigger rebuild

bioc-issue-bot commented 8 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/yamss_buildreport_20161007211956.html

kasperdanielhansen commented 8 years ago

Well mzR now builds (but errors on CHECK) on Windows, but that has not made it available on the single package builder. Guess we will have to wait longer for this.

bioc-issue-bot commented 8 years ago

Received a valid push; starting a build. Commits are:

3f532ef version bump to trigger rebuild to see if mzR has ...

bioc-issue-bot commented 8 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/yamss_buildreport_20161009221547.html

kasperdanielhansen commented 8 years ago

Ok, so now mzR (a package dependency) is fixed, and there are no problems.

kasperdanielhansen commented 8 years ago

It would be great to get some movement on this, considering the deadline is the 12th, and we have been waiting for 11d. Of course I understand there is a lot of packages to look at.

lmyint commented 8 years ago

Hi @LiNk-NY In your recent reply on Bioc-devel to another package author, you had said that you would get to looking at all packages in good time before the 12th, but we haven't heard from you yet.

LiNk-NY commented 8 years ago

Hi @lmyint and @kasperdanielhansen, Thank you for submitting to Bioconductor!

My apologies for the delay. There are a few packages ahead of you in the queue. I took a brief look at it and here is some feedback.


DESCRIPTION

Unit Tests

Methods

Grammar

Vignettes

Documentation

Thank you for your patience.

Regards, Marcel

bioc-issue-bot commented 8 years ago

Received a valid push; starting a build. Commits are:

a92eeba Put density estimation steps into named functions a3adca4 Chunk scan portion of density computation bb95b7d Increase chunk size b7370a5 version bump to trigger rebuild d51b187 Add more accessor functions and use them dae4fd1 Merge branch 'devel' of github.com:hansenlab/yamss... e7791a3 Move limma into Imports from Depends a1c9f3c Undo testing in bakedpi density estimation 579ffd6 Fix version d337dbc Bump version 475e5ea Bump version

lmyint commented 8 years ago

Thank you for taking a look @LiNk-NY. We have addressed the issues regarding the import of limma and accessing slots with @. We're working on the others as well. Is it ok if we come back with changes by Monday?

bioc-issue-bot commented 8 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/yamss_buildreport_20161012161646.html

kasperdanielhansen commented 8 years ago

I looked closer at the release schedule: clearly it has to be done by Friday at the very latest.

All of the comments involve improving our own code. None of the issues affects the user experience nor the interoperability with other packages. We do agree with most of these suggestions, and will incorporate them.

On Wed, Oct 12, 2016 at 4:05 PM, Leslie Myint notifications@github.com wrote:

Thank you for taking a look @LiNk-NY https://github.com/LiNk-NY. We have addressed the issues regarding the import of limma and accessing slots with @. We're working on the others as well. Is it ok if we come back with changes by Monday?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Bioconductor/Contributions/issues/150#issuecomment-253323007, or mute the thread https://github.com/notifications/unsubscribe-auth/AEuhn_JSFs6fqqFpuaB03TaFBqpYZIcVks5qzT2agaJpZM4KG6jU .

bioc-issue-bot commented 8 years ago

Received a valid push; starting a build. Commits are:

9b96c7d Use if, else instead of ifelse for scalars c2bf3b8 Fix some instances of seq_along and rename by arg ... c738d1e Move utility functions to reading.R. Fix scanmin t... ec847d3 Add small example data and doc 67fc649 Add unit test for readMSdata 0ac67ef Add testthat to Suggests and fix test-read c7f5b0e Change name of example. Fix empty value in man 6b3214f Add minScan accessor 857c73b casting as integer in parsing function 904fb62 Use .minScan and seq_len in slicepi 65c78ca Merge branch 'devel' of github.com:hansenlab/yamss... a7bb84a added digest function 429cb02 adding testing for background correction b6c3db8 Use seq_len and add drop=FALSE d4a9692 Merge branch 'devel' of github.com:hansenlab/yamss... 6b9ae7a Change example 06fc7c5 Script for making example 8328c55 Fix typo in filename be7aef2 fixing digest stuff 7a91b02 Rollback minScan in rtAlign bba7c9b Merge branch 'devel' of github.com:hansenlab/yamss... fdee071 Rollback minScan ba306dd Rollback minScan 5fc2ccd rerun digesting 21c6c27 new message d2fb253 cmsProc object from vignette a598069 Merge branch 'devel' of github.com:hansenlab/yamss... 1ddb465 expanding unit testing 9c02000 removed 1:x style of seq creation 5f29b31 updating digest with 3 decimal points fb03f13 Remove vignette object ecbb166 Merge branch 'devel' of github.com:hansenlab/yamss... d132832 compress = xz for cmsRawExample d6b9d62 importing colData<- from SummarizedExperiment 0aed9b2 updating digest 244ee76 depends on newer R 10c99a9 Version bump c144cba Merge branch 'devel'

bioc-issue-bot commented 8 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/yamss_buildreport_20161013143100.html

bioc-issue-bot commented 8 years ago

Received a valid push; starting a build. Commits are:

9c3e72a Make helper functions for readability of nested ap... 60cbaf7 Merge branch 'devel' of github.com:hansenlab/yamss... abeeb8e changed bakedpi example f962379 cleanup of some examples ea18976 Update some examples to use cmsRawExample 0c4bcfd Fix merge conflicts 3d34a88 using peakBounds 9398fe5 fixes f2e65b7 Change slicepi example - getting cutoff ffcb9e8 Change slicepi example - getting cutoff e79905f updating digesting eff8bb2 Bump version bfd2fe8 Merge branch 'devel' f4a42ac Merge branch 'master' of github.com:hansenlab/yams...

bioc-issue-bot commented 8 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/yamss_buildreport_20161013175703.html

kasperdanielhansen commented 8 years ago

We have now addressed the issues in the review report. Here is our detailed response

First we quote the review with numbers added for reference.

DESCRIPTION

    1. It seems to me like limma should be in the Imports field as you only use a few functions from the package.

Unit Tests

    1. Please add some unit tests at least to your main functions.

Methods

    1. Many functions that you're writing include stopifnot(myclass). Perhaps you intend to write methods that dispatch on your class.

Grammar

    1. Nesting of lapply, sapply - It is not good for readability, and may be avoidable. Perhaps you can pass additional arguments to the first lapply call (i.e., lapply(seq_along(object), function (i, newObj) { ... }, newObj = object))
    1. Try to avoid call sequences such as rbind > lapply > cbind > rbind > lapply > cbind.
    1. Use of 1:length(number) is not robust when the number variable is 0L. Try using seq_len(number) instead.
    1. Minor: Avoid using ifelse for scalar evaluation, just use if ( ) else ( )
    1. You shouldn't be using @ to access slots in your class. Please use accessors and replacers for this.
    1. Avoid using argument names that are functions such as by.

Vignettes

    1. Looking good.

Documentation

    1. Overall, looks helpful.
    1. You include an example with \dontrun. It looks runnable.

Comments

  1. Done (Limma moved to Imports)
  2. We now unit tests all main function in the package, specifically bakedpi() and slicepi().
  3. We have not changed anything here. We don't believe in creating a new generic when we only would write a single method for the generic. Instead we think of our stopifnot() as argument checking.
  4. We have improved readability by naming the functions.
  5. See 4.
  6. Done (no use of 1:length(x) anymore in code).
  7. Done (no use of ifelse() in code outside of data.table functions where we need it.
  8. Done (no use of @ in code outside of accessors).
  9. Done (by is no longer an argument name).
  10. Thanks (no change).
  11. Thanks (no change).
  12. We now have a small dataset included in the package, so we have removed \dontrun{}. Several examples have been reworked.
LiNk-NY commented 8 years ago

Hi @kasperdanielhansen and @lmyint,

Thank you for submitting to Bioconductor!

Regards, Marcel

kasperdanielhansen commented 8 years ago

Any feedback on our response?

kasperdanielhansen commented 8 years ago

Guess we're accepted based on the label. Thanks, that's great.

LiNk-NY commented 8 years ago

Hi @kasperdanielhansen and @lmyint, Thank you addressing the points in the review. I labeled your package as accepted. Thank you for submitting to Bioconductor!

Regards, Marcel