Hi author, I have a question. The so-called cosine similarity appears to be just a vector dot product, with no actual cosine involved. When the gradient is computed there is only multiplication and no cos term, so I don't see where a saturation region with vanishing gradients would come from. Could you explain?
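To make the question concrete, here is a minimal NumPy sketch comparing the two gradients. It is only an illustration under my own assumptions: the function names (`dot_grad`, `cosine_grad`, `numeric_grad`) and the example vectors are hypothetical and are not taken from the author's code.

```python
import numpy as np

def dot_grad(a, b):
    """Gradient of s = a . b with respect to a, which is simply b."""
    return b

def cosine_similarity(a, b):
    """cos(a, b) = (a . b) / (|a| |b|)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def cosine_grad(a, b):
    """Analytic gradient of cos(a, b) with respect to a."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return b / (na * nb) - (a @ b) * a / (na**3 * nb)

def numeric_grad(f, a, eps=1e-6):
    """Central finite-difference gradient, used only as a sanity check."""
    g = np.zeros_like(a)
    for i in range(a.size):
        step = np.zeros_like(a)
        step[i] = eps
        g[i] = (f(a + step) - f(a - step)) / (2 * eps)
    return g

# Hypothetical example vectors
a = np.array([3.0, 4.0])
b = np.array([1.0, 2.0])

print("dot gradient:   ", dot_grad(a, b))
print("cosine gradient:", cosine_grad(a, b))
print("numeric check:  ", numeric_grad(lambda x: cosine_similarity(x, b), a))
# When a is parallel to b, the cosine gradient goes to zero, the dot gradient does not
print("aligned cosine gradient:", cosine_grad(2.0 * b, b))
```

If the similarity is a raw dot product, the gradient with respect to `a` is just `b`, as I described. If the vectors are normalized inside the computation graph, the gradient takes the second form, which shrinks as the two vectors align. Which of these two cases applies to the implementation is what I would like to understand.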