Open atlas-sky opened 1 week ago
Looking at the code, it seems that there are no weights for key, query, and value in the self-attention implementation. Is this the correct implementation?

We actually implemented both a direct and a transformed version of self-attention. The version you are looking at was intended as a quick initial validation of the core attention mechanism, where the key, query, and value are used directly without any transformation. However, as described in our paper, the main implementation uses learnable parameters to transform K, Q, and V. You can check the latest code update to select the version that best fits your needs.
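
For anyone else reading this thread, here is a minimal sketch of the difference between the two variants, assuming a PyTorch implementation; the class and parameter names below are illustrative and not taken from the repository's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DirectSelfAttention(nn.Module):
    """Quick-validation variant: K, Q, and V are the input itself (no learned weights)."""
    def forward(self, x):
        # x: (batch, seq_len, dim); Q = K = V = x
        d = x.size(-1)
        scores = x @ x.transpose(-2, -1) / d ** 0.5
        return F.softmax(scores, dim=-1) @ x

class ProjectedSelfAttention(nn.Module):
    """Paper variant: learnable linear projections produce K, Q, and V."""
    def __init__(self, dim):
        super().__init__()
        self.w_q = nn.Linear(dim, dim, bias=False)
        self.w_k = nn.Linear(dim, dim, bias=False)
        self.w_v = nn.Linear(dim, dim, bias=False)

    def forward(self, x):
        q, k, v = self.w_q(x), self.w_k(x), self.w_v(x)
        scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
        return F.softmax(scores, dim=-1) @ v

# Example usage (hypothetical shapes):
x = torch.randn(2, 16, 64)          # (batch, seq_len, dim)
out_direct = DirectSelfAttention()(x)
out_proj = ProjectedSelfAttention(64)(x)
```

The first class has no trainable parameters, which matches what the issue describes; the second adds the learnable K/Q/V transforms referred to in the paper.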