Bioconductor / Contributions

Contribute Packages to Bioconductor
134 stars 33 forks source link

matter #146

Closed kuwisdelu closed 7 years ago

kuwisdelu commented 7 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 7 years ago

Hi @kuwisdelu

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: matter
Type: Package
Title: A framework for rapid prototyping with binary data on disk
Version: 0.99
Date: 2016-9-25
Author: Kylie A. Bemis <k.bemis@northeastern.edu>
Maintainer: Kylie A. Bemis <k.bemis@northeastern.edu>
Description: Memory-efficient reading, writing, and manipulation of
    structured binary data on disk as vectors, matrices, and arrays.
    This package is designed to be used as a back-end for Cardinal
    for working with high-resolution mass spectrometry imaging data.
License: Artistic-2.0
Depends: methods, stats, biglm
Imports: BiocGenerics, utils
Suggests: BiocStyle, Cardinal, irlba, testthat
biocViews: Software, Infrastructure
URL: http://www.cardinalmsi.org
lawremi commented 7 years ago

I guess I'm surprised that e.g. the ff package could not point at arbitrary file offsets and thus interface with custom file formats like the ones mentioned in the vignette.

Btw, many of the generics are going to conflict with those in e.g. S4Vectors. We should probably push many of those up to BiocGenerics. Right now, they're using the default signature, which is not generally desirable.

kuwisdelu commented 7 years ago

Yes, ff and bigmemory are flexible in many ways, but not so much with custom file formats or datasets spanning multiple files, as far as I can tell, without conversion.

I do not currently get any obvious conflicts when loading matter with S4Vectors, but I am happy to switch to relying on whichever existing generics are advisable.

lawremi commented 7 years ago

Well, there's at least the mean() generic with IRanges.

bioc-issue-bot commented 7 years ago

Your package has been approved for building. Your package is now submitted to our queue.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/matter_buildreport_20160926070911.html

lshep commented 7 years ago

Thank you for your submission. Please address the issues raised in the single package builder output. Once you have a clean (i.e no errors) build and check, a full review will begin. Please also try and address any warnings that are occurring as many of these will need to be corrected before acceptance. Lori

bioc-issue-bot commented 7 years ago

We only start builds when the Version field in the DESCRIPTION file is incremented. For example, by changing

Version: 0.99.0

to

Version 0.99.1

If you did not intend to start a build, you don't need to do anything. If you did want to start a build, increment the Version: field and try again.

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

20a090e v0.99.0 -> v0.99.1

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/matter_buildreport_20160926122636.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

1e05354 cleaned up double assignments in C++ code

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/matter_buildreport_20160926125450.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

1baaa7a added S4Vectors imports, cleaned up generics

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/matter_buildreport_20160927133010.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

da1a1b4 added native routine registration

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/matter_buildreport_20160927160504.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

ea1ed02 fixed native routine registration

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/matter_buildreport_20160927171945.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

0a8331e improved NA handling in coercion and summary stats 2887e3d changed R/C_cast to coerce_cast; fixed NAs in matr...

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/matter_buildreport_20161001142904.html

lawremi commented 7 years ago

How easy would it be to generalize this, at least the API, to support URLs, not just file paths? I could imagine use cases where the data are stored remotely, through an API that supports random access, like BAM files and HTTP's support for file offsets. Or is your expectation that remote resources would be mounted via e.g. fuse drivers? I'm not suggesting implementation, just abstraction at the API level. An error would be thrown for anything that is not an ordinary path or file scheme.

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

f7908fa isnan -> std::isnan

kuwisdelu commented 7 years ago

"How easy would it be to generalize this, at least the API, to support URLs, not just file paths?"

Hi Michael,

It shouldn't be a problem on the API side in R, it would just mean the filepaths can be URLs. Files aren't touched at the R level. The implementation details would be on the C/C++ side in terms of portably opening files over HTTP. It's something I'd like to support in the future, but not right now.

I also want to generalize the API a little bit more to support (in the future) pointing at in-memory R matrices, so it would be possible to, say, cbind() an already-loaded R matrix with a matrix on disk.

bioc-issue-bot commented 7 years ago

We only start builds when the Version field in the DESCRIPTION file is incremented. For example, by changing

Version: 0.99.0

to

Version 0.99.1

If you did not intend to start a build, you don't need to do anything. If you did want to start a build, increment the Version: field and try again.

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/matter_buildreport_20161001145002.html

lshep commented 7 years ago

@lawremi I would appreciate any further comments you had regarding generics and Rcode for this package.

lawremi commented 7 years ago

The issue remains that we need more generics in BiocGenerics, in order to avoid conflict. There's no reason this package should depend on S4Vectors. I can work to move the generics. It has to happen gradually though, because if I try to do more than a couple at once, I fall asleep.

lawremi commented 7 years ago

Another thing is that this package should add S3 methods for all external (non-primitive) S3 generics. That way, code that does not see the S4 generics can still use the abstraction.

For example, for t() this would be:

t.matter_vec <- function(x) t(x) # we see the "t" S4 generic
kuwisdelu commented 7 years ago

I am happy to change the generics dependencies whenever BiocGenerics is updated. Just let me know.

lawremi commented 7 years ago

Some more things:

  1. My point about the URLs was that you might want to rethink the name of the filepaths()accessor.
  2. rowSd() (and others) are singular, while rowMeans() is plural. Would be good to make them consistent. But if you make them plural, they will conflict with the functions from the widely used matrixStats package. Ideally, both packages could coexist, but that would require sharing generic definitions. I think @hpages had suggested BiocGenerics depending on matrixStats. That would be one way to go. A lot of matrixStats would be better off in the base package though.
  3. It seems that DelayedArray from HDF5Array would want a similar API for setting the chunk size to use for incremental processing. I don't think it has one, but ideally they would end up with the same API.
  4. The matter data structure would be useful as a backend for DelayedArray. Can it be used as one already?
  5. This could probably use the subscript infrastructure from S4Vectors, e.g., normalizeSingleBracketSubscript().
  6. Please do not use . in function names unless they are S3 methods.
  7. There are some switch() statements on classes that could be replaced with dispatch.
  8. Maybe I am reading this wrong, but it looks like readMatrixElements() always returns an ordinary R matrix. For some reason, it's called by [()except when drop=NA. Is the intent that the matrix is materialized by default by [()? I would have thought only as.matrix() would do that. You could support drop=TRUE with _mattervec, right? The behavior should at least be better documented.
  9. What does drop.matrix() do differently from drop()? I guess it has a more generic (R) implementation, but isn't the data always just an R matrix?
  10. You should not use stop() in validity functions. Rather, all error messages should be concatenated into a returned character vector.

This is not comprehensive. I just stopped arbitrarily at 10 points.

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

6869bd4 in class atoms switched file_id and datamode to in... 361d59a more comprehensive error messages

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/matter_buildreport_20161003184132.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

f4c1302 added delayed scaling and centering 8994d94 added scale method, renamed sd/var methods af40fec added prcomp method, fixed NAMESPACE issues 5915b0e changes to generics and slot names, bug fix for co... c3a34ab updated vignette to use prcomp in PCA example 59e06e4 fixed missing documentation entries a114602 minor edit to prcomp warning msg

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/matter_buildreport_20161005230833.html

lshep commented 7 years ago

Hello, Thank you for making the changes recommended thus far. Some additional comments to address:

General

Vignette

source("https://bioconductor.org/biocLite.R")
biocLite("matter")

Description

Cheers Lori

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

8357e27 saved vignette results to matter_ex.rda to cut bui... cd986a9 added biocLite install to vignette + typos 3f29782 added adata method, updated docs aa98ee9 v.0.99.10 version bump, changed internal function ... 2b949ee reverted some S3 method names 0ba5d7d vignette corrections

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/matter_buildreport_20161011221503.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

f3a5326 updated documentation details

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/matter_buildreport_20161011224050.html

kuwisdelu commented 7 years ago

Thank you for the review.

I have addressed the above items in the last commits.

Some notes from my end:

-I have switched to using saved results to cut down build time. If build time is a concern, I suggest adding this to the package guidelines. Requirements for R CMD build do not currently give time constraints, only size constraints, while time constraints are instead given only for R CMD check. I originally wrote the vignette with these specific requirements in mind.

-I have changed internal function names to avoid dots in non-S3 methods, even though none of these functions are exported to the user. If this is an official requirement, I suggest adding it to the package guidelines, and to BiocCheck, which would be helpful in identifying them.

-The suggested biocViews do not apply (in my opinion), so I did not add them. Although matter is designed to be used by Cardinal downstream for MS imaging analysis, the package itself does not provide functionality specific to mass spectrometry or MS imaging analysis.

Thanks, Kylie

lshep commented 7 years ago

Thank you for making these changes. I will recommend be accepted. You should get svn instructions via email in the next couple of days.

Thank you also for your notes/input. I will discuss with the rest of the team and update accordingly if applicable.

kuwisdelu commented 7 years ago

Thank you!

My svn username is k.bemis.