hyperdimensional-computing / torchhd

Torchhd is a Python library for Hyperdimensional Computing and Vector Symbolic Architectures
https://torchhd.readthedocs.io
MIT License

Question about the models. #141

Closed. TimmieTudor closed this issue 1 year ago.

TimmieTudor commented 1 year ago

I decided to run the MNIST example from examples, and after running it I got a testing accuracy of 83.120%, which is surprisingly low compared to the accuracy of neural networks. My actual question is: is there a way to get better accuracy when training and testing models? I asked the question here because I didn't know where else to ask it. By the way, I am very new to HD computing, so I also don't know why the accuracy is low compared to neural networks. From what I've seen, most neural networks trained on the MNIST dataset reach 90% accuracy or more.

TimmieTudor commented 1 year ago

Oh, wait... The state-of-the-art HDC model on MNIST has an accuracy of 89% [Chuang et al., 2020], while a two-layer NN can easily achieve 95% [LeCun et al., 1998]. By the way, I found this information in a research paper titled "Understanding Hyperdimensional Computing for Parallel Single-Pass Learning".

mikeheddes commented 1 year ago

Hi, thank you for opening this issue. Your observation is right: the current state of the art in HDC cannot compete with neural networks on image classification tasks. There is still ongoing research trying to close this gap.

That said, there are some things you can do to improve the accuracy of the pure HDC learning example. The mnist_nonlinear example uses the simplest form of training: it just bundles all the vectors of the same class together using Centroid.add. You can expect higher accuracy from a more advanced training method such as Centroid.add_online, and you can make multiple passes over the training data to improve the accuracy further. This should likely get you above 90%.
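To sketch what that could look like (this is not the full example; it assumes a standard MNIST DataLoader named `train_loader` and the nonlinear Sinusoid encoder, and exact signatures may differ slightly between versions):

```python
# Minimal sketch: several passes of online training with Centroid.add_online
# instead of a single Centroid.add pass. Assumes `train_loader` is a standard
# MNIST DataLoader yielding (images, labels) batches.
import torch
from torchhd import embeddings
from torchhd.models import Centroid

DIMENSIONS = 10000
IMG_SIZE = 28
NUM_CLASSES = 10
EPOCHS = 3  # multiple passes over the data instead of one

encode = embeddings.Sinusoid(IMG_SIZE * IMG_SIZE, DIMENSIONS)  # nonlinear encoder
model = Centroid(DIMENSIONS, NUM_CLASSES)

with torch.no_grad():
    for _ in range(EPOCHS):
        for samples, labels in train_loader:
            samples_hv = encode(samples.flatten(start_dim=1))
            # add_online updates a class prototype in proportion to how wrong
            # the current prediction is, unlike the plain add method.
            model.add_online(samples_hv, labels)
    model.normalize()  # normalize the class prototypes before inference
```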

I hope this is useful.

mikeheddes commented 1 year ago

I also noticed that the images are not normalized in the mnist_nonlinear example; normalizing them might improve the accuracy slightly as well.
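As a sketch, using torchvision with the commonly quoted MNIST statistics (the mean/std values below are not taken from the example itself):

```python
# Hypothetical tweak: normalize the MNIST images before encoding them.
import torchvision
from torchvision import transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),  # commonly used MNIST mean and std
])

train_ds = torchvision.datasets.MNIST("data", train=True, download=True, transform=transform)
test_ds = torchvision.datasets.MNIST("data", train=False, download=True, transform=transform)
```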

If you are going to implement these changes feel free to open a pull request so that we can update the examples to have better accuracy.

rgayler commented 1 year ago

@ComradeCat1 please allow me to make a general/philosophical observation comparing HDC/VSA with NN.

In standard NN the synaptic weight matrices are modifiable and the focus is on training the weights to transform the inputs to a good set of features that allow your task to be completed with high accuracy (or whatever).

In VSA/HDC the algebraic operators (bind, bundle, permute) take the role of the synapses and can be implemented by vector-matrix multiplications. These operators (and the matrices implementing them) are fixed. The consequence is that for the most obvious VSA/HDC implementations the features you use are the raw inputs or features you have explicitly constructed.
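To make that concrete, here is a small hedged sketch using torchhd's functional operators (assuming the default MAP hypervectors); nothing in these operations is trained:

```python
# The fixed VSA/HDC operators: bind, bundle, and permute are the same
# operations for every problem; there are no trainable weights in them.
import torch
import torchhd

a, b = torchhd.random(2, 10000)  # two random 10,000-dimensional hypervectors

bound = torchhd.bind(a, b)      # associates a with b; dissimilar to both
bundled = torchhd.bundle(a, b)  # superimposes a and b; similar to both
shifted = torchhd.permute(a)    # shifts a (e.g. for sequencing); dissimilar to a

# Similarity checks illustrate the behavior:
cos = torch.nn.functional.cosine_similarity
print(cos(a, bound, dim=0))    # close to 0
print(cos(a, bundled, dim=0))  # noticeably above 0
print(cos(a, shifted, dim=0))  # close to 0
```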

So if you want to compete with NN on some performance metric like accuracy, you either have to know which features are predictive and explicitly construct them, or set up some VSA circuit that calculates lots of features (hopefully including the good ones) and use that as your input.

TimmieTudor commented 1 year ago

@rgayler Thank you for explaining the main difference between NN and HDC/VSA. By the way, I discovered HDC/VSA through a news article that introduced it while also discussing NNs. I was also curious to see what alternatives there are to NNs, because I've already seen lots of NNs. I think I'll worry less about performance and keep experimenting with HDC.

rgayler commented 1 year ago

@ComradeCat1 I don't know that I'd call it the main difference, more something to keep in mind when comparing.

VSA/HDC and NN are (probably) equally expressive when considered as statistical models of data, so in the limit I expect them to be equally accurate (or whatever other metric). That is, the ultimate limit on accuracy arises from how predictable the data is.

In HDC/VSA the extent to which your model under-performs the data limit is down to you not having engineered the right features.

In NN the extent to which your model under-performs the data limit is down to your choice of architecture/loss-function/etc. not allowing the learning algorithm to extract maximum value from the data.

IMO the more interesting comparisons are around metrics other than accuracy. For a given level of accuracy (or whatever) VSA tends to have far fewer weights, require less training data, and be computationally much cheaper than NN. So the interesting questions about HDC/VSA are around what you can do with VSA that you can't reasonably do with NN (which is a very open research question).

There are lots of recorded VSA webinars here, if you're interested: https://sites.google.com/ltu.se/vsaonline
This might also be helpful: https://www.hd-computing.com/home

Would you mind sending me a copy of or link to that news article you saw?

mikeheddes commented 1 year ago

I will close this due to inactivity; feel free to reopen to continue the discussion.