hemberg-lab / SC3

A tool for the unsupervised clustering of cells from single cell RNA-Seq experiments
http://bioconductor.org/packages/SC3
GNU General Public License v3.0
119 stars 55 forks source link

[suggestion]maybe it could be better just using the SingleCellExperiment class #43

Closed LuyiTian closed 6 years ago

LuyiTian commented 6 years ago

Hi Vladimir, I have used your SC3 in several projects and it is great. I have a question regarding your SC3class. I look at the code and it just adds a sc3 slot in the object. Have you think about just using the SingleCellExperiment class directly instead of creating a new class. You can put the sc3 slot into the int_metadata and access it through sce@int_metadata@sc3. This is what I did for my scPipe package. I think the SingleCellExperiment also put package information into int_metadata.

Another thing that confuses me is the feature_symbol column in the rowData. I did not have that column at first and it gives me an error when I plot marker genes. The error message is not obvious so it took me some time to figure out I don't have this column. I think it could be better to add a check when you use the feature_symbol and use rowname as default when it didn't exist.

wikiselev commented 6 years ago

Hi Luyi, thanks for your suggestion! I've introduced SingleCellExperiment just recently and I was not sure what the best option would be to store the sc3 slot. In my approach I think I followed Aaron's @LTLA suggestion: inherited SingleCellExperiment and created my own class. I had no idea that int_metadata can be used to store something. Maybe Aaron (@LTLA) can clarify? If it's ok, then I will surely use your suggestion instead of creating a new class.

wikiselev commented 6 years ago

Regarding feature_symbol column - I forgot to document it and add an error handling. But I found this column is very useful and require it now in all our packages (scmap, scfind). The name of it comes from scater - if you rungetBMFeatureAnnos() function on your SCE object it will write gene names to feature_column automatically. If you don't run it all our packages will still use it as a default for gene names.

LTLA commented 6 years ago

The int_metadata slot is used to store metadata that is required to check the integrity of the SingleCellExperiment object. We do not export any public getters/setters for this slot, and we do not guarantee that it will be stable (or that it will even continue to exist) in future package versions.

All the int_* slots are the closest that S4 gets to private members in an OO framework. We have a gentleman's agreement to not access these slots via @, and to respect the (un)availability of getters/setters. Users and dependent functions should use the public-facing metadata instead.

wikiselev commented 6 years ago

Thanks for your clarification @LTLA ! So, what I've done is correct? (created my own class with an additional sc3 slot which is inherited from SingleCellExperiment)

LTLA commented 6 years ago

Yes. Alternatively, you could use a SingleCellExperiment and put the sc3 data into the public metadata, if it's purely for data storage (i.e., no need for special behaviour upon subsetting/combining).

wikiselev commented 6 years ago

Thanks @LTLA ! Ok, then I will use the public metadata slot instead. Hope its name won't change in the future! @LuyiTian I will try to finish with it before the next Bioconductor release, so that it will go to the release branch in November.

LuyiTian commented 6 years ago

@LTLA Ok so I have been using it in a wrong way. I will put all the stuff to metadata and keep the private int_* clean.