rformassspectrometry / QFeatures

Quantitative features for mass spectrometry data
https://RforMassSpectrometry.github.io/QFeatures/
25 stars 7 forks source link

accessor for "sets/experiments" for easier piping #197

Open tavareshugo opened 11 months ago

tavareshugo commented 11 months ago

Related to #194

If you decide to use "sets" to refer to experiments within QFeatures, would you then consider creating something like a sets() function (in effect just a synonym for experiments())?

I'm coming from a teaching perspective. I was thinking that if we're explaining that you can have different "sets" within a QFeatures object, then it might be easier to remember that you can get a list of them using the sets() function. It might even be nice to have a set() function (analogous to how there is assays() and assay()) as it would allow the piping fans to do something like:

qf |>
  set("psms_filtered") |> 
  etc...

rather than qf[["psms_filtered"]] |> etc...

Similarly, I guess there could be a setNames() function (again, analogous to assayNames()).

Just an idea, I'm not super familiar with QFeatures (or MultiAssayExperiments for that matter), so maybe this would cause confusion/redundancy with existing accessors.

lgatto commented 11 months ago

Thank you @tavareshugo for these thoughts:

  1. I wouldn't be goo keen to add new functions to address this, as I fear it would make things more complicated. set() would be too general anyway.
  2. I do agree that assay(qf, "psms_filtered") could be confusing, if it returned the set/element of the QFeatures object qf, but it actually returns the assay in the set/element "psms_filtered". So that maybe isn't too bad.
  3. If we chose to call these experiments (rather than set), then we would need to check if the MultiAssayExperiment team would consider a pull request to support experiments(qf, "psms_filtered") (or experiment(qf, "psms_filtered")), which could then be used along the lines you suggest.
  4. setNames() exists, but to actually set element names. Here, simply using names() works.

In the light of this, and especially point 3, given that 2 above works, and that I don't mind using qf[["psms_filtered"]], I am not ready to make any decision at this point. But happy to hear what others think.

lgatto commented 11 months ago

Ping @cvanderaa @samgregoire

tavareshugo commented 11 months ago

agree with your points above. I guess then I'd go with your point 3 and name these "experiments".

(we're just having a debrief from a proteomics course and we seem to agree here @lmsimp @csdaw) 😸

lgatto commented 11 months ago

What I don't like with experiment(s) is that it is misleading or wrong.

The different SummarizedExperiment (or SingleCellExperiment) instances in a QFeatures object are all part of the same experiment - they have all been designed together (think experimental design) and acquired or combined into one QFeatures object as part of one scientific question. They could originate from different experiments (say different labs), but are still part of one new scientific question/experiment.

To me, calling them experiments gives the idea that they don't have much to do with each other, that they aren't part of the same design. And the concept of experimental design is one that isn't highlighted enough.

May be we could call them acquisition? (I'm not suggesting new accessors acquisition() or acquisitions()!)

lgatto commented 11 months ago

(Aquisition is more of a joke, here... )

tavareshugo commented 11 months ago

Yes, I see your point. And when you aggregate features, they are in fact literally the same experiment, just a different level of aggregation.

In any case, renaming from "assays" to something else will be great. Feel free to close this issue in the meanwhile.

lgatto commented 11 months ago

My suggestion would be to stick with set, given that it's what we used here. This is something that will need to be fixed in the documentation.

We could consider having a qfset(qf, i) function that does the same thing as qf[[i]].

Note that this works with %>% (but not with |>), but I think we can agree that this is not our preferred approach

> feat3 %>% `[[`("psms2")
class: SummarizedExperiment 
dim: 8 2 
metadata(0):
assays(1): ''
rownames(8): PSM3 PSM4 ... PSM9 PSM10
rowData names(5): Sequence Protein Var location pval
colnames(2): Sample3 Sample4
colData names(0):