bioFAM / MOFA2

Multi-Omics Factor Analysis
https://biofam.github.io/MOFA2/
GNU Lesser General Public License v3.0

Extended support for MultiAssayExperiment #133

Open RiboRings opened 9 months ago

RiboRings commented 9 months ago

Hi!

The current method for MultiAssayExperiments requires some manual work which the end user may not be familiar with. As in this example, after transforming assays you also need to:

  1. remove unnecessary assays (like assay(mae[[1]], "counts") <- NULL)
  2. run create_mofa_from_MultiAssayExperiment
  3. get the default options and modify them if necessary (like model_opts$num_factors <- 5)
  4. run prepare_mofa

As an alternative to performing those 4 steps separately, I propose to include them in a wrapper function, whose result can be directly passed to run_mofa.
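A minimal sketch of what such a wrapper might look like follows. It is an illustration only, not the function referred to in this thread: the name prepare_mofa_from_mae and its arguments drop_assay and num_factors are hypothetical, and it assumes the unwanted assay lives in the first experiment, as in the example above. It relies on the existing MOFA2 functions create_mofa_from_MultiAssayExperiment, get_default_model_options and prepare_mofa.

```r
# Hypothetical wrapper; `prepare_mofa_from_mae` is NOT part of MOFA2.
# Calls are namespaced so the definition can be sourced without attaching
# MOFA2 or SummarizedExperiment; both must be installed to actually run it.
prepare_mofa_from_mae <- function(mae, drop_assay = "counts", num_factors = 5) {
  # 1. Remove the unnecessary assay from the first experiment
  #    (assumption: only the first experiment carries it).
  SummarizedExperiment::assay(mae[[1]], drop_assay) <- NULL

  # 2. Build an untrained MOFA object from the MultiAssayExperiment.
  model <- MOFA2::create_mofa_from_MultiAssayExperiment(mae)

  # 3. Fetch the default model options and adjust them.
  model_opts <- MOFA2::get_default_model_options(model)
  model_opts$num_factors <- num_factors

  # 4. Attach the options; the returned object can go to run_mofa().
  MOFA2::prepare_mofa(model, model_options = model_opts)
}
```

With a suitable mae object, prepare_mofa_from_mae(mae) would return a prepared model that can be handed to run_mofa.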

I have already prepared a working function for this purpose, and if you see value in this, I could open a pull request to MOFA2.

Thank you and cheers!

antagomir commented 9 months ago

I'm just a MOFA user, but I can see that the create_mofa function facilitates the conversion of many different input formats into the MOFA format. In principle, it would be cool to just call run_mofa on any suitable input object. Perhaps the problem is that this would lead to too many function arguments in order to accommodate all possible input formats? And if such a wrapper were created, it should probably support all accepted input formats. I don't immediately see how feasible that might be.

However, I am thinking about the following:

  1. MultiAssayExperiment (MAE) has become a key Bioconductor class for multi-source data integration; could MOFA use this as the standard input, instead of its own custom mofa class? Then prepare_mofa (or just run_mofa?) would work for any data object that has a converter to MAE (or can otherwise be converted into one). The change might be relatively limited, since the package functions could still rely on the mofa class internally and only the function API would be updated to MAE.

  2. Indeed, it would be useful if the user did not have to remove assays in order to run MOFA, if this is currently the case. The functions could instead allow users to choose the assays that they would like to include in the MOFA analysis?

  3. It could be handy to be able to define the necessary parameters in prepare_mofa or run_mofa directly, instead of modifying the model options.

rargelaguet commented 6 months ago

Hi @RiboRings and @antagomir, thanks a lot for your feedback. I agree that there is a benefit in creating a wrapper that takes all necessary parameters in data_options, model_options, etc. so that one could run MOFA with potentially one line of code. As @antagomir pointed out, there are so many different input formats for multi-omics data (MAE, Seurat, list of data matrices, etc.) and so many parameters that, for clarity, we decided to split them into different functions. But I am happy to incorporate a simple one-liner (mofa2(...)?) that accommodates all possible input formats and uses default values for data, model and training settings. @RiboRings could you share the code or create a pull request? Thank you!
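As a rough sketch of the one-liner idea, under stated assumptions: the names mofa2_sketch and override_options are hypothetical and not part of MOFA2. The sketch picks a constructor from the input class, fills in the default data, model and training options, and lets the caller override individual settings through plain lists. It assumes create_mofa handles the non-MAE inputs, and run_mofa additionally needs a working Python mofapy2 installation.

```r
# Hypothetical helper: merge user overrides into a list of defaults and
# refuse option names that the defaults do not contain.
override_options <- function(defaults, overrides = list()) {
  unknown <- setdiff(names(overrides), names(defaults))
  if (length(unknown) > 0) {
    stop("Unknown option(s): ", paste(unknown, collapse = ", "))
  }
  utils::modifyList(defaults, overrides)
}

# Hypothetical one-liner; `mofa2_sketch` and its arguments are NOT part of MOFA2.
mofa2_sketch <- function(data, data_options = list(), model_options = list(),
                         training_options = list(), outfile = NULL) {
  # Choose the constructor from the class of the input object.
  model <- if (methods::is(data, "MultiAssayExperiment")) {
    MOFA2::create_mofa_from_MultiAssayExperiment(data)
  } else {
    MOFA2::create_mofa(data)
  }

  # Start from the package defaults and apply any user overrides.
  model <- MOFA2::prepare_mofa(
    model,
    data_options = override_options(
      MOFA2::get_default_data_options(model), data_options),
    model_options = override_options(
      MOFA2::get_default_model_options(model), model_options),
    training_options = override_options(
      MOFA2::get_default_training_options(model), training_options)
  )

  # Train the model and return the fitted object.
  MOFA2::run_mofa(model, outfile = outfile)
}
```

A call such as mofa2_sketch(mae, model_options = list(num_factors = 5)) would then replace the separate create, configure, prepare and run steps, while a misspelled option name fails early instead of being silently ignored.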

antagomir commented 6 months ago

Great! Indeed, @RiboRings could you have a look at this?

RiboRings commented 6 months ago

@rargelaguet and @antagomir, thank you for your inputs. I will open a pull request to share and discuss this feature.

antagomir commented 6 months ago

Perhaps first describe how you plan to implement it, so commenting is possible prior to implementation efforts?

RiboRings commented 5 months ago

I opened a pull request related to this issue (https://github.com/bioFAM/MOFA2/pull/144). The code was nearly ready and it didn't require much implementation effort. I'd be glad to hear your comments and adjust the code accordingly.

antagomir commented 3 months ago

The PR is open - I am curious to see how this proceeds.