omer11a / bounded-attention

MIT License
77 stars 8 forks

Get the M of self-attn #9

Closed: KoTion closed this issue 4 months ago

KoTion commented 4 months ago

The mask M returned by the function _hide_other_subjects_from_subjects is all 0. This means the self-attention output is still the plain Q*K result, not the bounded-attention result.

The subject_masks and background_masks computed inside _hide_other_subjects_from_subjects are as follows: [image]. These are correct.

However, after the following code runs, the returned sim_masks are all 0.
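To illustrate why an all-zero M matters, here is a minimal sketch in plain Python. It assumes the mask is added to the Q*K logits before the softmax, with 0 meaning "allowed" and -inf meaning "blocked"; that convention is an assumption for illustration, not taken from the repository code. The helper names, the toy logits, and the pixel-to-subject layout are all hypothetical.

```python
import math

NEG_INF = float("-inf")


def softmax(row):
    # Numerically stable softmax over one row of logits.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def masked_attention_probs(sim, mask):
    # Additive masking (assumed convention): add the mask to the logits,
    # then apply softmax row by row.
    return [
        softmax([s + m for s, m in zip(sim_row, mask_row)])
        for sim_row, mask_row in zip(sim, mask)
    ]


# Toy Q*K logits for 3 query pixels x 3 key pixels (made-up numbers).
sim = [
    [1.0, 2.0, 0.5],
    [0.3, 0.1, 1.2],
    [2.0, 0.0, 1.0],
]

# The situation reported above: every mask entry is 0.
zero_mask = [[0.0] * 3 for _ in range(3)]

# Hypothetical intended mask: pixels 0 and 1 belong to different subjects
# and must not attend to each other; pixel 2 is background.
bounded_mask = [
    [0.0, NEG_INF, 0.0],
    [NEG_INF, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

plain = [softmax(row) for row in sim]
with_zero = masked_attention_probs(sim, zero_mask)
with_bounded = masked_attention_probs(sim, bounded_mask)

print(with_zero == plain)    # True: an all-zero mask changes nothing
print(with_bounded[0][1])    # 0.0: pixel 0 no longer attends to pixel 1
```

Under this convention, an all-zero mask is a no-op, which matches the symptom described: the attention probabilities are identical to unmasked attention.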

omer11a commented 4 months ago

Great find. Thanks! I fixed it now.