greenelab / deep-review

A collaboratively written review paper on deep learning, genomics, and precision medicine
https://greenelab.github.io/deep-review/

An exact transformation of convolutional kernels applied directly to DNA/RNA sequences #578

Open agitter opened 7 years ago

agitter commented 7 years ago

https://doi.org/10.1101/163220

Motivation: The powerful learning ability of a convolutional neural network (CNN) to perform functional classification of DNA/RNA sequences could provide valuable clues for the discovery of underlying biological mechanisms. Currently, however, the only way to interpret the direct application of a convolutional kernel to DNA/RNA sequences is the heuristic construction of a position weight matrix (PWM) from fragments scored highly by that kernel; whether the resulting PWM still performs the sequence classification well is unclear.

Results: We developed a novel kernel-to-PWM transformation whose result is theoretically provable. Specifically, we proved that the log-likelihood of the resulting PWM of any DNA/RNA sequence is exactly the sum of a constant and the convolution of the original kernel on the same sequence. Importantly, we further proved that the resulting PWM demonstrates the same performance, in theory, as the original kernel under popular CNN frameworks. Surprisingly, our PWMs almost always outperformed heuristic ones at sequence classification, whether the discriminative motif was sequence- or structure-conserved. These results compelled us to further develop a maximum likelihood estimation of the optimal PWM for each kernel and a back-transformation of predefined PWMs into kernels. These tools can benefit the biological interpretation of kernel signals.

Availability: Python scripts for the transformation from kernel to PWM, the inverted transformation from PWM to kernel, and the maximum likelihood estimation of optimal PWM are available through ftp://ftp.cbi.pku.edu.cn/pub/software/CBI/k2p.
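For intuition about the abstract's central claim (PWM log-likelihood equals a constant plus the kernel's convolution score), here is a minimal sketch of one transformation with that property: a softmax over the four bases at each kernel position. This is an assumption chosen for illustration, not necessarily the construction used in the paper or in the k2p scripts. The function names and the `(length, 4)` kernel layout are hypothetical.

```python
import numpy as np

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a (len(seq), 4) one-hot matrix."""
    x = np.zeros((len(seq), 4))
    x[np.arange(len(seq)), [BASES.index(b) for b in seq]] = 1.0
    return x


def kernel_to_pwm(kernel):
    """Softmax each position of a (length, 4) kernel into probabilities.

    Returns the PWM and the sequence-independent constant such that
    log-likelihood(seq) == conv_score(seq) + constant.
    Illustrative assumption, not the paper's verified method.
    """
    # Numerically stable log-sum-exp over the four bases at each position.
    m = kernel.max(axis=1, keepdims=True)
    lse = m + np.log(np.exp(kernel - m).sum(axis=1, keepdims=True))
    pwm = np.exp(kernel - lse)
    return pwm, float(-lse.sum())


def conv_score(kernel, seq):
    """Score of the kernel applied to one window of matching length."""
    return float((kernel * one_hot(seq)).sum())


def pwm_log_likelihood(pwm, seq):
    """Log-probability of the window under the PWM (positions independent)."""
    return float((np.log(pwm) * one_hot(seq)).sum())


rng = np.random.default_rng(0)
kernel = rng.normal(size=(6, 4))
pwm, constant = kernel_to_pwm(kernel)
for seq in ["ACGTAC", "TTTTTT", "GATTAC"]:
    assert np.isclose(pwm_log_likelihood(pwm, seq),
                      conv_score(kernel, seq) + constant)
```

Because the constant does not depend on the sequence, ranking windows by PWM log-likelihood and ranking them by kernel score give the same order under this construction. It says nothing about whether a single kernel is a meaningful motif on its own.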

Opening this for discussion. I didn't read it, but the amount of related work they're missing is an immediate red flag. And this line in the abstract isn't correct:

Currently, however, the only way to interpret the direct application of a convolutional kernel to DNA/RNA sequences is the heuristic construction of a position weight matrix (PWM) from fragments scored highly by that kernel

akundaje commented 7 years ago

The most fundamental flaw here is interpreting conv. filters as PWMs. No single filter is ever going to capture a complete representation of the binding motif it might be partially capturing. DNNs learn distributed representations, so there are and will be multiple partially redundant filters that collectively model a binding site. Moreover, for deeper networks the conv. filters in the first layer often don't even remotely resemble known PWMs/motifs; the higher layers are often the ones learning these. The transformation may actually be more useful in the other direction: initializing CNNs with known PWMs by transforming each PWM into a more appropriately scaled but equivalent (in terms of performance) conv. filter.
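A minimal sketch of the initialization idea, assuming the conventional log-odds scoring of a PWM against a background distribution. This is a standard PWM convention swapped in for illustration, not the paper's back-transformation. The function name, the pseudocount value, and the `(length, 4)` layout are hypothetical, and copying the result into a particular framework's conv layer is left out because weight layouts differ between frameworks.

```python
import numpy as np


def pwm_to_kernel(pwm, background=None, pseudocount=1e-3):
    """Turn a (length, 4) probability matrix into log-odds filter weights.

    Assumptions for illustration: uniform background unless one is given,
    and a small pseudocount so zero-probability bases stay finite.
    """
    pwm = np.asarray(pwm, dtype=float)
    if background is None:
        background = np.full(pwm.shape[1], 1.0 / pwm.shape[1])
    # Smooth and renormalize each position so every row sums to 1.
    smoothed = pwm + pseudocount
    smoothed /= smoothed.sum(axis=1, keepdims=True)
    # Log-odds: positive where a base is enriched over background.
    return np.log(smoothed / background)


# Toy two-position motif: strong preference for A, then no preference.
toy_pwm = np.array([[0.7, 0.1, 0.1, 0.1],
                    [0.25, 0.25, 0.25, 0.25]])
init_kernel = pwm_to_kernel(toy_pwm)

# Softmaxing the weights per position recovers the smoothed PWM.
recovered = np.exp(init_kernel)
recovered /= recovered.sum(axis=1, keepdims=True)
assert np.allclose(recovered, toy_pwm, atol=1e-2)
```

An uninformative position maps to all-zero weights here, so it contributes nothing to the filter's score until training changes it.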