jaybee84 / ml-in-rd

Manuscript for perspective on machine learning in rare disease

Figure: MultiPLIER / DeepProfile - Rare Disease Putting It Together #108

Closed cgreene closed 3 years ago

cgreene commented 4 years ago

It'd be grand to have a figure that covers both of these topics, showing how they incorporate regularization (PLIER's constraints, the VAE loss function's KL-divergence term) and prior data (MultiPLIER/DeepProfile) or prior knowledge (MultiPLIER).
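For reference, the KL-divergence term is the regularization piece of a VAE's loss: it pulls the learned latent codes toward a standard-normal prior, constraining the representation. A minimal numpy sketch, assuming a Gaussian encoder and a squared-error reconstruction term (the `beta` weight is an illustrative assumption, not necessarily what DeepProfile uses):

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """KL divergence between N(mu, sigma^2) and a standard-normal prior.

    This is the regularizer in a VAE's loss: it penalizes latent codes
    that drift away from the prior.
    """
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)

def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """VAE objective: reconstruction error plus the weighted KL penalty."""
    reconstruction = np.sum((x - x_hat) ** 2, axis=-1)
    return np.mean(reconstruction + beta * gaussian_kl(mu, log_var))
```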

jaclyn-taroni commented 4 years ago

I'm going to assign myself to this. I'll plan to put together at least an initial sketch in tandem with #111.

jaclyn-taroni commented 3 years ago

My plan for this is going to be to make a hand-drawn sketch to post here and also post the notes about the main takeaways we hope to show with the figure.

jaclyn-taroni commented 3 years ago

The text related to this figure is in the open pull request (#120). It can be viewed here: https://github.com/jaybee84/ml-in-rd/blob/jaclyn-taroni/111-all-together-now/content/06.multiple-approaches-required.md. I would recommend reading that as background.

Concepts covered in this section:

In addition to reducing model complexity as discussed earlier, regularization can aid in learning useful representations by placing constraints on what a model learns (illustrated in the sketch below).
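As a loose illustration of that idea (not PLIER's actual constraints, which tie latent variables to prior-knowledge gene sets), here is a scikit-learn sketch contrasting ordinary PCA with an L1-penalized variant; the toy matrix and penalty weight are assumptions:

```python
import numpy as np
from sklearn.decomposition import PCA, SparsePCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))  # toy expression matrix: samples x genes

# Unconstrained PCA: every gene loads on every component.
dense = PCA(n_components=5).fit(X)

# L1-penalized PCA: the sparsity constraint zeroes out most loadings,
# so each learned feature involves only a handful of genes.
sparse = SparsePCA(n_components=5, alpha=2.0, random_state=0).fit(X)

print("nonzero loadings, PCA:      ", np.count_nonzero(dense.components_))
print("nonzero loadings, SparsePCA:", np.count_nonzero(sparse.components_))
```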

Representation learning is the process of learning features from raw data, where a feature is an individual variable.

Representation learning tends to be data-intensive (i.e., many samples are required) and thus may seem to aggravate the curse of dimensionality. However, learning low-dimensional patterns from large datasets and then applying those patterns to smaller but related datasets can be a powerful tool for dimensionality reduction. In later sections of this perspective, we will discuss this method of leveraging large datasets to reduce dimensionality in smaller datasets, also known as feature-representation-transfer (see the sketch below).
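To make feature-representation-transfer concrete: MultiPLIER trains PLIER (an R package) on a large compendium and then projects new data into that latent space. A rough Python analogue using scikit-learn's NMF as a stand-in for the factorization (the matrices and dimensions here are made up):

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
# Hypothetical non-negative expression matrices (samples x genes);
# the large compendium stands in for something like recount2.
large_compendium = rng.random((1000, 200))
small_rare_disease = rng.random((20, 200))

# Learn a low-dimensional representation from the large dataset only.
model = NMF(n_components=10, init="nndsvda", random_state=0, max_iter=500)
model.fit(large_compendium)

# Reuse (transfer) the learned gene-level patterns to embed the small
# dataset, rather than trying to learn them from only 20 samples.
small_latent = model.transform(small_rare_disease)
print(small_latent.shape)  # (20, 10)
```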

jaclyn-taroni commented 3 years ago

As with #107, I'm adding a sketch intended to communicate the main takeaways, but not necessarily to guide the generation of the final figure.

I think the big takeaway for this figure should be that the concepts/approaches discussed throughout don't get used in isolation; for things to work in this domain, you need multiple techniques!

*[attached sketch: multiple-approaches]*

I'm again wary about getting into the details of either method in this figure, and I think showing a bunch of `->` arrows is also not entirely accurate. I think there's room for collapsing this to a single panel where we highlight differences between the approaches (e.g., what the model is, what the training data are, what we do with the output), perhaps through the color scheme? We could even go more generic than that and not highlight the individual model differences, in service of the main takeaway from above ☝️

> the concepts/approaches discussed throughout don't get used in isolation; for things to work in this domain, you need multiple techniques!

allaway commented 3 years ago

I like this sketch a lot. Maybe it's because I'm more familiar with MultiPLIER and similar techniques than our target audience is, but I think this is an appropriate level of simplicity. I really like that it highlights how different inputs/methods (priors, regularization, transfer learning, etc.) are used in practice in a couple of examples. I do not think this needs to be made more generic, but that's just me.

dvenprasad commented 3 years ago

DeepProfile

*[attached mockup: deep-profile]*

MultiPLIER

*[attached mockup: multiplier]*

dvenprasad commented 3 years ago

Updated MultiPLIER and DeepProfile based on comments from Monday's call.

*[attached mockups: multiplier, deep-profile]*

jaybee84 commented 3 years ago

@dvenprasad this looks good! Can we replace "Cell Line Data" with "In-vitro data"?