I am definitely not an expert on this topic, but it seems to me that the stored keys, $K = W^K X$, should actually be $K = X (W^K)^\top$, where $^\top$ denotes transposition (assuming each row of $X$ is one input vector $\boldsymbol x$). The same would apply to the other matrices, such as $Q$ and $V$.
Also, the dimension of $\boldsymbol x$ should be $d$, not $d_k$. The dimension $d_k$ should instead be associated with the keys $K$.
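To make the shape argument concrete, here is a rough NumPy check. It is only a sketch under my own assumptions, which are not taken from the original text: each row of $X$ is one input $\boldsymbol x \in \mathbb{R}^d$, $W^K$ has shape $d_k \times d$, and the sizes `n`, `d`, `d_k` are arbitrary illustrative values.

```python
import numpy as np

# Hypothetical sizes: n inputs, input dimension d, key dimension d_k.
n, d, d_k = 5, 8, 4
rng = np.random.default_rng(0)

X = rng.normal(size=(n, d))      # assumption: each row of X is one input x in R^d
W_K = rng.normal(size=(d_k, d))  # assumption: W^K maps R^d to R^{d_k}

# K = X (W^K)^T has shape (n, d_k): one d_k-dimensional key per input.
K = X @ W_K.T

# Row i of K equals W^K applied to the i-th input vector.
row_wise = np.stack([W_K @ x for x in X])
keys_match = np.allclose(K, row_wise)

# Under these shapes, W^K X is not defined: (d_k, d) @ (n, d) does not conform.
try:
    W_K @ X
    naive_product_defined = True
except ValueError:
    naive_product_defined = False

print(K.shape, keys_match, naive_product_defined)
```

If the text instead stores the inputs as columns of $X$ (shape $d \times n$), then $K = W^K X$ is consistent as written, so the point really depends on which convention is intended.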