drisso / archive-SingleCellExperiment

This is the archived version of SingleCellExperiment with the history before Bioc submission.

columns in rowData and colData to filter matrix #8

Open lpantano opened 7 years ago

lpantano commented 7 years ago

Hi,

As discussed at BioC, it would be cool to have a column in rowData/colData that flags bad samples/features so they can be filtered out.

I can do the pull request, but I wanted to discuss first here.

I'm not sure whether this should live in int_metadata/int_rowdata, if we want to make it invisible to the user and use methods to change those values, or directly in colData/rowData.

We could also modify the counts method proposed in https://github.com/drisso/SingleCellExperiment/issues/7 so that it filters samples/features before returning the matrix.
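The proposal above can be sketched roughly as follows. This is only an illustration, assuming the released Bioconductor version of SingleCellExperiment; the column name `keep` and the helper `filtered_counts` are hypothetical and not part of the package.

```r
library(SingleCellExperiment)

# Toy count matrix: 4 genes x 3 cells (made-up values).
counts <- matrix(c(5, 0, 3,
                   0, 0, 1,
                   8, 2, 7,
                   1, 0, 4),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))

sce <- SingleCellExperiment(assays = list(counts = counts))

# Hypothetical flag columns marking which features/samples pass QC.
# The name "keep" is an assumption made for this sketch.
rowData(sce)$keep <- c(TRUE, FALSE, TRUE, TRUE)
colData(sce)$keep <- c(TRUE, FALSE, TRUE)

# Hypothetical helper: return the counts matrix restricted to the
# flagged rows and columns, leaving the object itself untouched.
filtered_counts <- function(x) {
  counts(x)[rowData(x)$keep, colData(x)$keep, drop = FALSE]
}

dim(counts(sce))          # full matrix
dim(filtered_counts(sce)) # only flagged genes and cells
```

Keeping the flags as ordinary columns leaves the full data in one object, so the kept and discarded sets can still be compared later.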

Let me know your thoughts and I can work on that.

cheers

LTLA commented 7 years ago

I must say that I had a similar idea with #5. I was hoping to make my life easier by allowing users (and functions) to automatically filter out uninteresting features. However, when I considered actually using it in a workflow, I realized that its surprise factor greatly outweighed its usefulness:

So, that's why I closed that particular issue. I guess we could add a QC field, but I'm not sure how useful that would be, as the vast majority of analyses would just filter out low QC things right at the start. So you'd just end up with a vector full of TRUEs, which is not particularly useful.

lpantano commented 7 years ago

Hi Aaron,

Thanks for the thoughts. I get the point, and I agree with you. The main reason I mentioned it is that (in our case at least) we go back and forth many times before settling on how to filter the data. To do so, we make many figures comparing the filtered-out and filtered-in cells/genes in pairs. I thought that keeping everything in one object, and being able to write functions that read this information and plot (or otherwise use) the good/bad sets, would make things easier.

I recognize that this may belong in a package that works with this class and provides methods for these specific purposes. So I am OK with not having it directly here. But given the object as it is, can columns/rows with custom names be created in the internal metadata?

I guess that if not, it's not a big issue, because new methods can be added elsewhere and don't need to live directly in this package.

So, anyway, thanks for the discussion, maybe we need another package only for this purpose :P

Feel free to close the issue!