Torchhd is a Python library for Hyperdimensional Computing and Vector Symbolic Architectures
https://torchhd.readthedocs.io

Separate the similarity functions into angle and magnitude functions #80

Closed rgayler closed 2 years ago

rgayler commented 2 years ago

I find it useful to think of VSAs as analog computers for computation on discrete structures. Hypervectors in the VSA correspond to wires in the electronic analog computer. The direction (angle) of the vector corresponds to the label on the wire (i.e. the variable or what it represents) and the magnitude of the vector corresponds to the voltage on the wire (the value of the variable or the degree of support for the represented thing). In this view the direction and magnitude of a vector are conceptually distinct attributes and have different uses.

The usual similarity functions (between two vectors) take the angle and magnitude information and combine it into a single concept of similarity, but I think it is helpful to keep them more separated because they answer different questions.

The angle between two vectors indicates the degree to which they represent the same thing - so it answers questions about what is represented, independently of how much support there is for what is being represented. Note that when two vectors are anything other than quasi-orthogonal, this is evidence that with very high probability the vectors are in a bundling relationship (because of concentration of measure in hyperdimensional vector spaces, two independently generated hypervectors are quasi-orthogonal almost surely). Angular similarity therefore answers questions about the existence of bundling.

Rather than using the actual angle between vectors, it is more convenient to use the cosine of the angle between them, because this is equivalent to the dot product of the arguments after each has been normalised to unit magnitude. Call this function something like similarity.cos.
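A minimal sketch of that function in plain PyTorch (not the torchhd API; `cos_similarity` is just a stand-in name for the proposed `similarity.cos`), which also shows the quasi-orthogonality point from above:

```python
import torch

def cos_similarity(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Cosine of the angle between a and b: the dot product of the
    # arguments after each is normalised to unit magnitude.
    return torch.dot(a / a.norm(), b / b.norm())

d = 10_000
x = torch.randint(0, 2, (d,)).float() * 2 - 1  # random bipolar hypervector
y = torch.randint(0, 2, (d,)).float() * 2 - 1

print(cos_similarity(x, y))      # ~0: unrelated vectors are quasi-orthogonal
print(cos_similarity(x, x + y))  # ~0.7: x is detectably bundled into x + y
```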

Long story short: the natural magnitude similarity function is the dot product, which is interpreted as the magnitude of the projection of one argument vector onto the other. This depends on the magnitudes of the two argument vectors and the angle between them. If you have a dictionary of unit magnitude atomic hypervectors and a weighted bundle of them (a multiset), the dot product of each dictionary item with the bundle tells you the degree to which that item is present in the bundle (i.e. approximately the scalar weight of that item in the bundle, up to noise from the atoms being only quasi-orthogonal rather than exactly orthogonal). So the dot product is essential for telling you the degree to which vectors are present. Call this function something like similarity.dot.
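A sketch of that weight-recovery property, under the stated assumptions (random unit-magnitude vectors standing in for a real dictionary; all names illustrative):

```python
import torch

torch.manual_seed(0)
d = 10_000

# A hypothetical dictionary of unit-magnitude atomic hypervectors.
atoms = torch.randn(5, d)
atoms = atoms / atoms.norm(dim=1, keepdim=True)

# A weighted bundle (multiset) built from the first three atoms.
weights = torch.tensor([2.0, 1.0, 0.5])
bundle = (weights[:, None] * atoms[:3]).sum(dim=0)

# The dot product of each dictionary item with the bundle recovers that
# item's scalar weight, because the cross-terms between quasi-orthogonal
# atoms are close to zero in high dimensions.
print(atoms @ bundle)  # ≈ tensor([2.0, 1.0, 0.5, 0.0, 0.0])
```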

In a recurrent VSA circuit, a typical approach is to have the solution represented as a weighted bundle (multiset) with the set of items in the bundle fixed over iterations, and the weights associated with each item evolving to indicate the degree of support for that item being in the solution (e.g. see https://www.researchgate.net/profile/Ross-Gayler/publication/310752006_A_distributed_basis_for_analogical_mapping/links/583c01a708aed5c6148cbe23/A-distributed-basis-for-analogical-mapping.pdf).
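The linked paper gives the actual circuit; the loop below is only a toy sketch of the iteration pattern just described, with a made-up evidence step, to show the weights over a fixed item set evolving and being read out with similarity.dot:

```python
import torch

torch.manual_seed(0)
d = 10_000
atoms = torch.randn(4, d)                    # fixed set of candidate items
atoms = atoms / atoms.norm(dim=1, keepdim=True)

weights = torch.ones(4) / 4                  # equal initial support

for _ in range(10):
    state = weights @ atoms                  # weighted bundle, item set fixed
    # Hypothetical evidence step: a real circuit would derive this from the
    # problem; here we simply nudge the state toward the first item.
    state = state + 0.5 * atoms[0]
    weights = atoms @ state                  # similarity.dot reads out support
    weights = weights.clamp(min=0)           # keep the weights non-negative
    weights = weights / weights.sum()        # renormalise the support

print(weights)  # support concentrates on item 0 over the iterations
```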

Hamming distance is isomorphic to dot product similarity after rescaling the boolean values {0, 1} to bipolar values {-1, +1}: for d-dimensional vectors, dot(x, y) = d - 2 * hamming(a, b).
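A quick numeric check of that identity, assuming the usual rescaling b ↦ 2b - 1 from {0, 1} to {-1, +1}:

```python
import torch

torch.manual_seed(0)
d = 10_000
a = torch.randint(0, 2, (d,))                # boolean hypervectors in {0, 1}
b = torch.randint(0, 2, (d,))
hamming = int((a != b).sum())                # Hamming distance

x = (2 * a - 1).float()                      # rescale to bipolar {-1, +1}
y = (2 * b - 1).float()
dot = int(torch.dot(x, y))

# dot(x, y) = d - 2 * hamming(a, b): the two measures carry the same
# information, related by an affine rescaling.
assert dot == d - 2 * hamming
```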