Bioconductor / Contributions

Contribute Packages to Bioconductor
134 stars 33 forks source link

SegmentedCellExperiment #1370

Closed ellispatrick closed 4 years ago

ellispatrick commented 4 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 4 years ago

Hi @ellispatrick

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: SegmentedCellExperiment
Type: Package
Title: S4 Class for segmented spatial 'omics data
Version: 0.0.1
Author: Nicolas Canete and Ellis Patrick
Maintainer: Ellis Patrick <ellis.patrick@sydney.edu.au>
Description: This package describes a S4 class for storing data from segmented imaging cytometry and spatial omics data.
License: GPL (>=2)
biocViews: 
    SingleCell, CellBasedAssays
Encoding: UTF-8
LazyData: true
VignetteBuilder: knitr
Imports: S4Vectors, BiocGenerics, methods, ggplot2, IRanges
Suggests: BiocStyle, knitr, rmarkdown
RoxygenNote: 7.0.2

Add SSH keys to your GitHub account. SSH keys will are used to control access to accepted Bioconductor packages. See these instructions to add SSH keys to your GitHub account.

bioc-issue-bot commented 4 years ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

mtmorgan commented 4 years ago

Please remove the build product (.R, .html) from the vignettes directory.

I'm very surprised that this does not extend the SummarizedExperiment or SingleCellExperiment class.

Please provide a more extensive Description: summarizing the unique aspects of your contribution.

I'll provide more extensive comments when these issues (and other issues identified by the automatic checks) are addressed.

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

ellispatrick commented 4 years ago

Thanks Martin.

I’m sure there have been many long discussion on the best ways to store scRNA-seq data and multi-omic experiments in general. I’ll give you brain dump of my naïve thought process which hopefully provides a little bit of insight into why I wanted a simple data structure for my personal analysis.

For the imaging datasets I have been working with I typically have 100’s of millions of cells. For this reason alone, the statistician in me really wants to use this as a justification to revert to having the observations (cells) as rows and features on the columns. The structure of a flowSet becomes much more appealing and much more natural when arriving to this data structure via imaging mass cytometry. Most of the people in my institute store their data in fcs files and do basic analysis in flowjo anyway. Also, given that I can have 100’s of images, I really wanted the data to be split by image as much as by cell and so again the flowSet or ncdfFlow seemed more appropriate. So the real question becomes less of why not extend SummarizedExperiment or SingleCellExperiment class and more why not extend flowSet or ncdfFlow. There is not a strong reason but it is primarily because I wanted something with a simpler structure that I would be able to extend to eventually be able to store the raw images themselves, masks and polygons. Conceptually I also wanted it to feel “matrix” like and retain the feel of a cellProfiler or ilastik output. Having everything in a big DataFrame with rows being images and columns being modality just made it much easier for me to work with, and think about, my data. However, I’m sure it is not very efficient.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

e9ed987 Changed version number to 0.99.0

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ABNORMAL". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

36145eb version bump

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

7c85cc9 gitignore .Rproj 02537c1 added Rproj to gitignore

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

e2fc726 version bump

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

300249b version bump

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

d28a9a5 Fixed plotting error 3029898 fixed issue with ggplot2 by adding variables to g...

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

a75ebe8 version bump

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

d756caa Change plot manual efe0c16 Fixed more inconsistencies in plot

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

vjcitn commented 4 years ago

I built the vignette, which does a good job of illustrating the capabilities. I am not heavily invested in flow so my comments should carry less weight than those of an active practitioner. SegmentedCellExperiment extends DataFrame and defines methods that take advantage of DataFrame nesting to represent elements of cell-based experiments that are commonly encountered.

The show method produces

> cellExp
SegmentedCellExperiment with 2 rows and 6 columns
                                                location
                                    <SplitDataFrameList>
1 1:1:0.367243:...,2:2:0.613818:...,3:3:0.335930:...,...
2 6:1:0.831271:...,7:2:0.130342:...,8:3:0.217023:...,...
                                                       intensity
                                            <SplitDataFrameList>
1 0.1281166:0.10817130,0.1503157:0.17161615,0.0796545:0.00631104
2    0.1707524:0.0975236,0.0697529:0.0892760,0.0890765:0.0190400
                                                   morphology
                                         <SplitDataFrameList>
1    0.489959:1.3913000,3.601540:0.3670445,0.243015:0.0149731
2 0.498723:0.03411949,0.818412:2.25542506,0.932012:0.00920492
             phenotype               images                masks
  <SplitDataFrameList> <SplitDataFrameList> <SplitDataFrameList>
1            1:21:dead                                          
2           2:81:alive                                          

which is challenging to read and interpret. A show method should be defined to give users a quick sense of content. This particular instance involves information on 10 cells derived from two images. Those basic facts should be presented in the show method.

A convention seems to be emerging in which classes with name Experiment extend SummarizedExperiment, where the basic interpretation is that G features recorded on N samples are retrievable using the assay() method, and R characteristics recorded on N samples are retrievable using the colData() method. It seems to me that the data handled by this package could at least in part be organized in a way that conforms to this scheme, and the parts that cannot could be provided as extensions to a species of Experiment, probably SingleCellExperiment. We have found that adoption of these conventions fosters interoperability and package reuse. Enhancements to the underlying infrastructure often provide benefits to packages defining inheriting classes "for free".

ellispatrick commented 4 years ago

Thanks Vince. I'll include this information. The wording of the description sounds great and show method sounds very sensible.

Just to clarify, it sounds like the name of the class is misleading as it is not extending SummarisedExperiment? I'll spend some time thinking of a better name, open to suggestions.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

df54454 Customised show

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

ellispatrick commented 4 years ago

Feedback was all great. I am going to change the name and maybe roll into the analysis package which calls it.