DeepLIFT contribution scores for one-hot encoded nucleotides

berkuva commented 1 year ago

Given that A, C, G, and T are one-hot encoded as [1,0,0,0], [0,1,0,0], [0,0,1,0], and [0,0,0,1] respectively, DeepLIFT assigns a score to every element of the encoding, so each position gets 4 scores. For example, for the input ACGT, encoded as [1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1], with hypothetical contribution scores [0.1, 0.07, -0.3, 0.01, 0.2, 0.09, -0.01, 0.8, 0.07, 0.02, 0.4, -0.1, -0.025, -1.0, 0.4, 0.35], how does BPNet assign a single contribution score to each nucleotide?

Are the four scores at each position summed and divided by 4, so that the DeepLIFT contribution scores for ACGT are [0.1 + 0.07 - 0.3 + 0.01]/4, [0.2 + 0.09 - 0.01 + 0.8]/4, [0.07 + 0.02 + 0.4 - 0.1]/4, and [-0.025 - 1.0 + 0.4 + 0.35]/4? Judging from https://github.com/kundajelab/deeplift/issues/106#issuecomment-635192286, the sum method seems to be used in the DeepLIFT paper, but it is not clear to me whether the same holds for BPNet. Thanks.

akundaje commented 1 year ago

You simply gate on the observed nucleotide at each position. That gives you the DeepLIFT score for that nucleotide and that position.
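For readers following along, here is a minimal sketch of what "gating on the observed nucleotide" could look like for the example in the question. It assumes the flat vectors are regrouped into one row per position with columns ordered A, C, G, T, as in the question. The function name `gate_on_observed` is hypothetical and is not taken from the BPNet or DeepLIFT code; this only illustrates the idea and is not the library's implementation.

```python
# One-hot input for the sequence ACGT: one row per position,
# columns ordered A, C, G, T (the ordering used in the question).
one_hot = [
    [1, 0, 0, 0],  # A
    [0, 1, 0, 0],  # C
    [0, 0, 1, 0],  # G
    [0, 0, 0, 1],  # T
]

# Hypothetical contribution scores from the question, regrouped per position.
scores = [
    [0.1, 0.07, -0.3, 0.01],
    [0.2, 0.09, -0.01, 0.8],
    [0.07, 0.02, 0.4, -0.1],
    [-0.025, -1.0, 0.4, 0.35],
]


def gate_on_observed(one_hot, scores):
    """Keep, at each position, only the score of the nucleotide actually present.

    Hypothetical helper: multiplies the scores elementwise by the one-hot
    input and sums over the four channels, so the three unobserved
    channels contribute nothing.
    """
    return [
        sum(x * s for x, s in zip(row_x, row_s))
        for row_x, row_s in zip(one_hot, scores)
    ]


gated = gate_on_observed(one_hot, scores)
print(gated)  # [0.1, 0.09, 0.4, 0.35]

# For contrast, the per-position average proposed in the question.
averaged = [sum(row) / len(row) for row in scores]
print(averaged)
```

Under this reading, the scores for A, C, G, and T in the example are 0.1, 0.09, 0.4, and 0.35, and the scores for the unobserved bases are discarded rather than summed or averaged. With NumPy arrays of shape (length, 4), the same gating can be written as `(one_hot * scores).sum(axis=-1)`.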

(In reply to Hyun Jae Cho's message of Fri, Jan 6, 2023, 2:04 PM on https://github.com/kundajelab/bpnet/issues/47)

berkuva commented 1 year ago

Thank you.
