Benjamin-Lee / deep-rules

Ten Quick Tips for Deep Learning in Biology
Overall discussion for Tip 8 #249

Open SiminaB opened 3 years ago

SiminaB commented 3 years ago

This is to discuss outstanding issues for Tip 8: Your DL models can be more transparent.

SiminaB commented 3 years ago

Model interpretation is an open, active area of research. It is becoming more feasible to interpret models with many parameters and non-linear relationships, but in many cases simpler models remain substantially easier to interpret than more complex ones. When deciding on a machine learning approach and model architecture, consider the tradeoff between interpretability and accuracy. A challenge in weighing this tradeoff is that the extent to which one trades interpretability for accuracy depends on the problem itself. When the features provided to the model are already highly relevant to the task at hand, a simpler, interpretable model that gives up only a little performance compared to a very complex one is more useful in many settings. On the other hand, if features must be combined in complex ways to be meaningful for the task, the performance gain of a model capable of capturing that structure may outweigh the interpretability costs. An appropriate choice can only be made after careful consideration, which often includes estimating the performance of a simple, linear model that serves as a baseline.

When models are learned from high-throughput datasets, a small subset of features in the dataset may be strongly correlated with the complex combination of the larger feature set learned by the deep learning model. In this case, this smaller set of features can itself be used in a subsequent simplified model to further enhance interpretability. This feature reduction can be essential to defining biomarker panels that enable clinical applications.
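
To make the feature-reduction idea concrete, here is a minimal sketch that ranks input features by how strongly they correlate with a complex model's output and keeps the top few. It is an illustration under stated assumptions, not the method proposed in this tip. The `scores` list is a synthetic stand-in for the predictions of a trained deep learning model, the helper names `pearson` and `top_correlated_features` are hypothetical, and absolute Pearson correlation is only one simple ranking criterion among many.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def top_correlated_features(X, scores, k):
    """Return the indices of the k features most correlated (in absolute
    value) with the model scores, plus the absolute correlation per feature."""
    n_features = len(X[0])
    corr = [
        abs(pearson([row[j] for row in X], scores))
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: corr[j], reverse=True)
    return ranked[:k], corr


# Toy data: 500 samples, 20 features drawn from a standard normal.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(500)]

# ASSUMPTION: stand-in for a complex model's output. In practice this would
# be the prediction of the trained deep learning model for each sample.
# Here it depends non-linearly on feature 3 and linearly on feature 7.
scores = [
    math.tanh(2 * row[3]) + 0.5 * row[7] + 0.1 * rng.gauss(0, 1)
    for row in X
]

selected, corr = top_correlated_features(X, scores, k=2)
print("selected feature indices:", sorted(selected))
```

The selected features could then be passed to a simple model, such as a linear or logistic regression, whose held-out performance is compared with that of the full model before the reduced panel is trusted. Correlation-based ranking will miss features that matter only through interactions, so it is best treated as a first screen.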