jaybee84 / ml-in-rd

Manuscript for perspective on machine learning in rare disease

Figure: Prior Knowledge #107

Closed cgreene closed 3 years ago

cgreene commented 4 years ago

It'd be grand to get a figure on how prior knowledge/data can be useful, especially for rare diseases.

jaclyn-taroni commented 3 years ago

@dvenprasad I am including here the background info for this figure, including any appropriate links I came across. I am also going to quote what I think are the most relevant passages from the manuscript here for your reference.

The relevant section of the manuscript covers the following topics:

Knowledge graphs integrate related-but-different data types, creating a rich data source. Examples of public biomedical knowledge graphs and frameworks that could be useful in rare disease include the Monarch Graph Database[doi:10.1093/nar/gkw1128], hetionet[doi:10.7554/eLife.26726], PheKnowLator[doi:10.1101/2020.04.30.071407], and the Global Network of Biomedical Relationships[doi:10.1093/bioinformatics/bty114]. These graphs connect information like genetic, functional, chemical, clinical, and ontological data to enable the exploration of relationships between data and disease phenotypes...
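To make the "connect different data types and traverse the relationships" idea concrete, here is a minimal, self-contained sketch (not from the manuscript — every entity name and relationship below is hypothetical): typed entities are linked by labeled edges, and a breadth-first traversal surfaces indirect connections, e.g. from a compound to a phenotype via a shared gene/disease association.

```python
# Toy sketch of a knowledge graph connecting different biomedical data
# types, in the spirit of resources like hetionet or the Monarch Graph
# Database. All entities and relationships below are illustrative only.
from collections import deque

# Each edge links two typed entities with a labeled relationship.
edges = [
    ("gene:TP53", "associated_with", "disease:rare-disease-X"),
    ("disease:rare-disease-X", "presents", "phenotype:HP:0000118"),
    ("compound:drug-Y", "targets", "gene:TP53"),
]

# Build an undirected adjacency map for traversal.
adjacency = {}
for source, _, target in edges:
    adjacency.setdefault(source, set()).add(target)
    adjacency.setdefault(target, set()).add(source)

def connecting_path(start, goal):
    """Breadth-first search for a path linking two entities."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in adjacency.get(path[-1], ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

# Traversal surfaces indirect relationships: compound -> gene -> disease -> phenotype.
print(connecting_path("compound:drug-Y", "phenotype:HP:0000118"))
```

In a real knowledge graph the node and edge types would come from ontologies and curated databases; the point of the sketch is only that heterogeneous data become one queryable structure.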

Transfer learning is an approach where a model trained for one task or domain (source domain) is applied to another, typically related task or domain (target domain). Transfer learning can be supervised (one or both of the source and target domains have labels), or unsupervised (both domains are unlabeled). Though there are multiple types of transfer learning, in a later section we will focus in-depth on feature-representation-transfer. Feature-representation-transfer approaches learn representations from the source domain and apply them to a target domain [doi:10.1109/TKDE.2009.191].
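A hedged numpy sketch of feature-representation-transfer: a low-dimensional representation is learned (via SVD, as in PCA) from the large source domain only, then reused to embed a small target domain. The dataset shapes, the synthetic data, and the choice of SVD are illustrative assumptions, not anything specified in the manuscript.

```python
# Minimal sketch of feature-representation-transfer: learn a
# low-dimensional representation on a label-rich source domain, then
# apply it to a small target domain measured on the same features.
import numpy as np

rng = np.random.default_rng(0)

# Source domain: many samples (e.g., common-disease expression data).
source = rng.normal(size=(500, 20))
# Target domain: few samples (e.g., a rare-disease cohort), same features.
target = rng.normal(size=(8, 20))

# Learn the representation on the *source* domain only.
source_mean = source.mean(axis=0)
_, _, vt = np.linalg.svd(source - source_mean, full_matrices=False)
components = vt[:5]  # top 5 latent features learned from the source

# Transfer: project the target domain into the source-learned space.
target_embedded = (target - source_mean) @ components.T
print(target_embedded.shape)  # (8, 5)
```

The transferred representation can then serve as input features for a downstream model on the target domain, where there are too few samples to learn a representation from scratch.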

Multitask learning is an approach where classifiers are learned for related individual predictions (tasks) at the same time using a shared representation [doi:10.1023/A:1007379606734].

Multitask neural networks (which predict multiple tasks simultaneously) are thought to improve performance over single task models by learning a shared representation, effectively being exposed to more training data than single task models [doi:10.1023/A:1007379606734; arxiv:1606.08793].
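A rough numpy sketch of the shared-representation idea: one shared linear layer receives gradient signal from two related synthetic tasks, each with its own small task-specific head, so the representation is shaped by both tasks' data at once. The tasks, architecture, and plain gradient-descent loop are illustrative assumptions.

```python
# Minimal multitask learning sketch: two related tasks trained
# simultaneously through one shared representation.
import numpy as np

rng = np.random.default_rng(1)

X = rng.normal(size=(200, 10))
y1 = X[:, 0] + 0.5 * X[:, 1]   # task 1
y2 = X[:, 0] - 0.5 * X[:, 2]   # task 2 (related: shares feature 0)

W = rng.normal(size=(10, 4)) * 0.1   # shared layer, used by both tasks
h1 = np.zeros(4)                     # head for task 1
h2 = np.zeros(4)                     # head for task 2
lr, n = 0.01, len(X)

for _ in range(2000):
    Z = X @ W                        # shared representation
    e1 = Z @ h1 - y1                 # task 1 residuals
    e2 = Z @ h2 - y2                 # task 2 residuals
    # Each head sees only its own task's error...
    h1 -= lr * Z.T @ e1 / n
    h2 -= lr * Z.T @ e2 / n
    # ...but the shared layer accumulates signal from *both* tasks.
    W -= lr * X.T @ (np.outer(e1, h1) + np.outer(e2, h2)) / n

mse1 = np.mean((X @ W @ h1 - y1) ** 2)
mse2 = np.mean((X @ W @ h2 - y2) ** 2)
print(f"{mse1:.3f} {mse2:.3f}")
```

The update to `W` is the key line: the shared layer is effectively exposed to the training data of both tasks, which is the intuition behind the performance gains claimed for multitask networks.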

Few-shot learning is the generalization of a model trained on related tasks to a new task with limited labeled data (e.g., the detection of a patient with a rare disease from a low number of examples of that rare disease).

One-shot or few-shot learning relies on using prior knowledge to generalize to new prediction tasks where there are a low number of examples [@arxiv:1904.05046v3]; typically, a distance metric is learned from input data and used to compare new examples for prediction [@doi:10.1021/acscentsci.6b00367].
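As a toy illustration of the distance-metric idea (everything here is synthetic and hypothetical, not from the manuscript): a class "prototype" is built from the few labeled support examples available for each class, and a new query is assigned to the class whose prototype is nearest under the chosen metric. In practice the embedding/metric would itself be learned from related tasks.

```python
# Minimal nearest-prototype sketch of few-shot classification.
import numpy as np

rng = np.random.default_rng(2)

# Two classes, each with only 3 labeled examples ("shots") in some
# embedding space; the embeddings here are just synthetic clusters.
support = {
    "disease-A": rng.normal(loc=0.0, size=(3, 5)),
    "disease-B": rng.normal(loc=3.0, size=(3, 5)),
}

# One prototype per class: the mean of its few support examples.
prototypes = {label: x.mean(axis=0) for label, x in support.items()}

def classify(sample):
    """Assign a sample to the class with the nearest prototype."""
    return min(prototypes,
               key=lambda label: np.linalg.norm(sample - prototypes[label]))

# A new query drawn near the disease-B cluster.
query = rng.normal(loc=3.0, size=5)
print(classify(query))
```

Note that the model never fits per-class parameters from large data; the few support examples plus a distance metric are enough, which is exactly what makes this family of approaches interesting for rare disease.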

jaclyn-taroni commented 3 years ago

This section of the manuscript is divided into two headings: Knowledge graphs and Transfer, multitask, and few-shot learning.

I think the main takeaways are somewhat covered above, but I also drew something up for each of those sections, included below. These drawings are not necessarily intended to guide development of this figure, but rather to communicate the takeaways. I'll post over on #108 shortly, but I did want to note that I'm not sure we need both this figure and the "putting it all together" figure tracked on #108, which would cover two specific studies using transfer learning in rare diseases (DeepProfile and MultiPLIER) that unify some of the other concepts introduced in the manuscript. As a result, I'm not totally sure what the main takeaway message is yet.

For context, I'm using a "database" as my representation of a model here because that's consistent with the tentative sketch of the statistical technique figure https://github.com/jaybee84/ml-in-rd/issues/106#issuecomment-707393718.

Knowledge graphs

[Sketch: knowledge graphs]

I'm worried that this is putting the cart before the horse (the model before what the model is supposed to be doing) ☝️ in its current form.

Transfer, multitask, and few-shot learning

[Sketch: transfer, multitask, and few-shot learning]

Hopefully this figure makes it somewhat clear why we would put these approaches under the same header! What I didn't include was information about supervised vs. unsupervised tasks, but I think that might muddy things a bit 💭 I'm also very wary of including anything in this figure that implicitly references a specific neural network architecture.

dvenprasad commented 3 years ago

Transfer Learning

[Sketch: transfer learning]

Few-shot Learning

[Sketch: few-shot learning]

allaway commented 3 years ago

I think these figures are looking great. One comment about similarity metrics: they are often (maybe always?) on a 0 to 1 scale, and are either representative of similarity or distance, not both (e.g. I don't think that the center should be the "origin" for the similarity bars). Could also use a stylized heatmap representation of distance.
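The similarity-vs-distance point can be shown in a couple of lines (the vectors and the particular conversion are illustrative): a distance is 0 for identical items and unbounded above, while a similarity derived from it lives on a (0, 1] scale — one axis, one interpretation, not both at once.

```python
# Distance vs. similarity: two views of the same comparison.
import numpy as np

a = np.array([1.0, 0.0, 1.0])
b = np.array([1.0, 1.0, 0.0])

distance = np.linalg.norm(a - b)     # 0 = identical, unbounded above
similarity = 1.0 / (1.0 + distance)  # maps distance onto a (0, 1] scale
print(distance, round(similarity, 3))
```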

Also, thinking about these two concepts (transfer learning and few-shot learning), I think a key distinction we should highlight in this figure is: transfer learning leverages knowledge from large, complex datasets (which may not contain the disease you're studying at all) to perform a prediction (or related) task, while few-shot or one-shot learning uses the one to few examples of the disease(s) you are studying to perform a prediction (or related) task.

jaclyn-taroni commented 3 years ago

I think some of the imagery for few shot learning that I've found helpful usually gives you some kind of intuition about why this is a possible strategy. (Granted it helps that they are from the natural image domain!) Here are some examples:

This one is tricky because we don't want to talk about architectures, etc. – I'm wondering if we even need to get into the part about similarity? I'm not sure it's necessary.

I also know we're trying to keep things at a high-level of abstraction in many cases, but is there a place for having these figures be a little more specific (e.g., we use some kind of representation of medical images)?

dvenprasad commented 3 years ago

Transfer Learning

[Sketch: transfer learning]

Few-shot Learning

Used @jaclyn-taroni's sketch and modeled it after that layout.

[Sketch: few-shot learning]

dvenprasad commented 3 years ago

Updated figures based on Monday's call.

Transfer Learning

[Sketch: transfer learning]

Few Shot Learning

[Sketch: few-shot learning]