the-aerospace-corporation / brainblocks

Practical Tool for Building ML Applications with HTM-Like Algorithms
GNU Affero General Public License v3.0

Is it ok that the BBClassifier take more time than the others? #11

Closed AlexSteveChungAlvarez closed 1 year ago

AlexSteveChungAlvarez commented 3 years ago

I've run your comparison script between the scikit-learn classifiers and BBClassifier and got the following:

Train BBClassifier Time 0.507291s with size 400 Score: 0.95 Time 0.095903s with size 100 Decision Space Time 9.145416s with size 10000

Train Nearest Neighbors Time 0.000000s with size 400 Score: 0.98 Time 0.004031s with size 100 Decision Space Time 0.016021s with size 10000

Train Decision Tree Time 0.003995s with size 400 Score: 0.96 Time 0.000000s with size 100 Decision Space Time 0.000000s with size 10000

Train Neural Net Time 2.704887s with size 400 Score: 0.89 Time 0.000000s with size 100 Decision Space Time 0.027994s with size 10000

Train Naive Bayes Time 0.000000s with size 400 Score: 0.88 Time 0.000000s with size 100 Decision Space Time 0.003962s with size 10000


Data Set 1 Train BBClassifier Time 0.499331s with size 400 Score: 0.95 Time 0.091877s with size 100 Decision Space Time 8.824216s with size 10000

Train Nearest Neighbors Time 0.000000s with size 400 Score: 0.97 Time 0.003994s with size 100 Decision Space Time 0.020008s with size 10000

Train Decision Tree Time 0.000000s with size 400 Score: 0.94 Time 0.000000s with size 100 Decision Space Time 0.000000s with size 10000

Train Neural Net Time 4.597880s with size 400 Score: 0.93 Time 0.000000s with size 100 Decision Space Time 0.051976s with size 10000

Train Naive Bayes Time 0.000000s with size 400 Score: 0.86 Time 0.000000s with size 100 Decision Space Time 0.000000s with size 10000


Data Set 2 Train BBClassifier Time 0.463379s with size 400 Score: 0.88 Time 0.087886s with size 100 Decision Space Time 8.981609s with size 10000

Train Nearest Neighbors Time 0.000000s with size 400 Score: 0.93 Time 0.003958s with size 100 Decision Space Time 0.019975s with size 10000

Train Decision Tree Time 0.000000s with size 400 Score: 0.82 Time 0.000000s with size 100 Decision Space Time 0.000000s with size 10000

Train Neural Net Time 5.996015s with size 400 Score: 0.9 Time 0.003991s with size 100 Decision Space Time 0.047926s with size 10000

Train Naive Bayes Time 0.000000s with size 400 Score: 0.79 Time 0.000000s with size 100 Decision Space Time 0.000000s with size 10000


Data Set 3 Train BBClassifier Time 0.475371s with size 400 Score: 0.9 Time 0.087878s with size 100 Decision Space Time 8.768315s with size 10000

Train Nearest Neighbors Time 0.000000s with size 400 Score: 0.86 Time 0.003963s with size 100 Decision Space Time 0.016013s with size 10000

Train Decision Tree Time 0.000000s with size 400 Score: 0.9 Time 0.000000s with size 100 Decision Space Time 0.004001s with size 10000

Train Neural Net Time 1.586036s with size 400 Score: 0.85 Time 0.003994s with size 100 Decision Space Time 0.019975s with size 10000

Train Naive Bayes Time 0.000000s with size 400 Score: 0.85 Time 0.000000s with size 100 Decision Space Time 0.000000s with size 10000


Data Set 4 Train BBClassifier Time 0.487316s with size 400 Score: 0.85 Time 0.091909s with size 100 Decision Space Time 8.940060s with size 10000

Train Nearest Neighbors Time 0.000000s with size 400 Score: 0.89 Time 0.004026s with size 100 Decision Space Time 0.015980s with size 10000

Train Decision Tree Time 0.000000s with size 400 Score: 0.86 Time 0.000000s with size 100 Decision Space Time 0.003963s with size 10000

Train Neural Net Time 1.725927s with size 400 Score: 0.88 Time 0.004011s with size 100 Decision Space Time 0.023973s with size 10000

Train Naive Bayes Time 0.000000s with size 400 Score: 0.88 Time 0.000000s with size 100 Decision Space Time 0.003995s with size 10000


Data Set 5 Train BBClassifier Time 0.535281s with size 400 Score: 0.91 Time 0.087851s with size 100 Decision Space Time 8.884135s with size 10000

Train Nearest Neighbors Time 0.000000s with size 400 Score: 0.96 Time 0.003996s with size 100 Decision Space Time 0.015978s with size 10000

Train Decision Tree Time 0.000000s with size 400 Score: 0.85 Time 0.000000s with size 100 Decision Space Time 0.000000s with size 10000

Train Neural Net Time 2.696733s with size 400 Score: 0.86 Time 0.003995s with size 100 Decision Space Time 0.023904s with size 10000

Train Naive Bayes Time 0.003962s with size 400 Score: 0.75 Time 0.000000s with size 100 Decision Space Time 0.000000s with size 10000


Data Set 6 Train BBClassifier Time 1.018634s with size 819 Score: 0.9902439024390244 Time 0.179763s with size 205 Decision Space Time 8.868124s with size 10000

Train Nearest Neighbors Time 0.000000s with size 819 Score: 0.9951219512195122 Time 0.007991s with size 205 Decision Space Time 0.015984s with size 10000

Train Decision Tree Time 0.000000s with size 819 Score: 0.926829268292683 Time 0.003961s with size 205 Decision Space Time 0.003937s with size 10000

Train Neural Net Time 1.594129s with size 819 Score: 0.926829268292683 Time 0.004012s with size 205 Decision Space Time 0.023987s with size 10000

Train Naive Bayes Time 0.003994s with size 819 Score: 0.926829268292683 Time 0.000000s with size 205 Decision Space Time 0.004026s with size 10000


Data Set 7 Train BBClassifier Time 0.479328s with size 400 Score: 0.99 Time 0.099867s with size 100 Decision Space Time 8.840194s with size 10000

Train Nearest Neighbors Time 0.003958s with size 400 Score: 1.0 Time 0.004027s with size 100 Decision Space Time 0.015981s with size 10000

Train Decision Tree Time 0.000000s with size 400 Score: 1.0 Time 0.003961s with size 100 Decision Space Time 0.003921s with size 10000

Train Neural Net Time 1.318275s with size 400 Score: 1.0 Time 0.004008s with size 100 Decision Space Time 0.024009s with size 10000

Train Naive Bayes Time 0.000000s with size 400 Score: 1.0 Time 0.000000s with size 100 Decision Space Time 0.000000s with size 10000


Data Set 8 Train BBClassifier Time 0.583251s with size 400 Score: 1.0 Time 0.095873s with size 100 Decision Space Time 8.852146s with size 10000

Train Nearest Neighbors Time 0.000000s with size 400 Score: 1.0 Time 0.003905s with size 100 Decision Space Time 0.020006s with size 10000

Train Decision Tree Time 0.000000s with size 400 Score: 1.0 Time 0.000000s with size 100 Decision Space Time 0.003994s with size 10000

Train Neural Net Time 2.580572s with size 400 Score: 1.0 Time 0.000000s with size 100 Decision Space Time 0.047916s with size 10000

Train Naive Bayes Time 0.003994s with size 400 Score: 1.0 Time 0.000000s with size 100 Decision Space Time 0.000000s with size 10000


Data Set 9 Train BBClassifier Time 0.479390s with size 400 Score: 1.0 Time 0.091846s with size 100 Decision Space Time 8.888057s with size 10000

Train Nearest Neighbors Time 0.004019s with size 400 Score: 1.0 Time 0.007938s with size 100 Decision Space Time 0.039988s with size 10000

Train Decision Tree Time 0.000000s with size 400 Score: 1.0 Time 0.003985s with size 100 Decision Space Time 0.003894s with size 10000

Train Neural Net Time 3.119840s with size 400 Score: 1.0 Time 0.003958s with size 100 Decision Space Time 0.047935s with size 10000

Train Naive Bayes Time 0.000000s with size 400 Score: 1.0 Time 0.000000s with size 100 Decision Space Time 0.003994s with size 10000


It takes more time than the scikit-learn classifiers on all the datasets; only its training time is lower than the Neural Net's. But on this page https://discourse.numenta.org/t/releasing-brainblocks-0-6-building-ml-applications-with-htm-like-algorithms/7675/6 you said it achieved one-shot learning. Am I doing something wrong?

jacobeverist commented 3 years ago

Well, in that script, BrainBlocks trains with the exact same datasets as all of the other classifiers. So it trains on 100, 400, 10000 inputs and that understandably takes time.

To demonstrate the near-one-shot learning, I would modify the script to take an even smaller subset of the training data for BrainBlocks to train on and see how that performs. You should then see BrainBlocks' ability to learn with very few examples while the performance of the other classifiers should degrade.

Please let me know how it turns out. This is probably an experiment we should have posted long ago.
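On the timing side of the question: many entries in the output above read `Time 0.000000s`, which may mean those calls finished faster than the clock used for the measurement could resolve. The following is a minimal sketch of a finer-grained measurement. It assumes nothing about how the comparison script itself measures time, and `time_call` is a hypothetical helper, not part of BrainBlocks or scikit-learn.

```python
import time


def time_call(fn, *args, repeats=5):
    """Run fn(*args) `repeats` times; return (last result, best elapsed seconds)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return result, best


# Stand-in workload; in the experiment this would be a fit or predict call.
total, seconds = time_call(sum, range(100_000))
print(f"Time {seconds:.6f}s  result {total}")
```

Repeating the call is only appropriate for steps that do not change state between runs. For a classifier that keeps learning on every `fit` call, `repeats=1` would be the safer choice.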

AlexSteveChungAlvarez commented 3 years ago

I wish I had asked you before, haha. I am actually researching HTM's advantages over ML for a university class called Thesis Seminar 1, and I found BrainBlocks on the page, so yesterday was my first interaction with the code. I am going to figure out exactly how it works and give it a try! Thank you for answering so quickly.

jacobeverist commented 3 years ago

Can you show me what page this is? I'm interested.

AlexSteveChungAlvarez commented 3 years ago

Oh, I was talking about Numenta's forum, https://discourse.numenta.org/, haha. I found it after watching the Numenta School playlist, and while looking for some code to help me I found BrainBlocks in the first link I attached.

AlexSteveChungAlvarez commented 3 years ago

Hi Jacob, I modified the script, but it didn't turn out how I expected... Here is the output:

Data Set 0 Train BBClassifier Time 0.041895s with size 40 Score: 0.9 Time 0.010964s with size 10 Decision Space Time 7.844548s with size 10000

Train Nearest Neighbors Time 0.002992s with size 40 Score: 1.0 Time 0.000997s with size 10 Decision Space Time 0.004987s with size 10000

Train Decision Tree Time 0.002994s with size 40 Score: 1.0 Time 0.001006s with size 10 Decision Space Time 0.000988s with size 10000

Train Neural Net Time 0.271275s with size 40 Score: 1.0 Time 0.000000s with size 10 Decision Space Time 0.033911s with size 10000

Train Naive Bayes Time 0.000000s with size 40 Score: 1.0 Time 0.000998s with size 10 Decision Space Time 0.001994s with size 10000


Data Set 1 Train BBClassifier Time 0.046907s with size 40 Score: 0.7 Time 0.007978s with size 10 Decision Space Time 7.197762s with size 10000

Train Nearest Neighbors Time 0.000000s with size 40 Score: 0.9 Time 0.000998s with size 10 Decision Space Time 0.005983s with size 10000

Train Decision Tree Time 0.000993s with size 40 Score: 0.9 Time 0.000000s with size 10 Decision Space Time 0.000997s with size 10000

Train Neural Net Time 0.331198s with size 40 Score: 0.9 Time 0.000000s with size 10 Decision Space Time 0.032945s with size 10000

Train Naive Bayes Time 0.000998s with size 40 Score: 0.9 Time 0.000997s with size 10 Decision Space Time 0.000997s with size 10000


Data Set 2 Train BBClassifier Time 0.045877s with size 40 Score: 0.6 Time 0.009974s with size 10 Decision Space Time 7.116877s with size 10000

Train Nearest Neighbors Time 0.000000s with size 40 Score: 1.0 Time 0.000998s with size 10 Decision Space Time 0.008982s with size 10000

Train Decision Tree Time 0.000995s with size 40 Score: 0.9 Time 0.000000s with size 10 Decision Space Time 0.000998s with size 10000

Train Neural Net Time 0.358043s with size 40 Score: 0.9 Time 0.000999s with size 10 Decision Space Time 0.037928s with size 10000

Train Naive Bayes Time 0.000998s with size 40 Score: 0.9 Time 0.000998s with size 10 Decision Space Time 0.002028s with size 10000


Data Set 3 Train BBClassifier Time 0.040892s with size 40 Score: 0.8 Time 0.011968s with size 10 Decision Space Time 7.517782s with size 10000

Train Nearest Neighbors Time 0.000998s with size 40 Score: 0.8 Time 0.002013s with size 10 Decision Space Time 0.007959s with size 10000

Train Decision Tree Time 0.000995s with size 40 Score: 0.8 Time 0.000973s with size 10 Decision Space Time 0.000996s with size 10000

Train Neural Net Time 0.276260s with size 40 Score: 0.8 Time 0.000998s with size 10 Decision Space Time 0.023938s with size 10000

Train Naive Bayes Time 0.000937s with size 40 Score: 0.8 Time 0.000000s with size 10 Decision Space Time 0.000994s with size 10000


Data Set 4 Train BBClassifier Time 0.044880s with size 40 Score: 0.6 Time 0.007978s with size 10 Decision Space Time 7.712677s with size 10000

Train Nearest Neighbors Time 0.000000s with size 40 Score: 0.8 Time 0.000998s with size 10 Decision Space Time 0.006981s with size 10000

Train Decision Tree Time 0.000000s with size 40 Score: 0.6 Time 0.000000s with size 10 Decision Space Time 0.000000s with size 10000

Train Neural Net Time 0.261299s with size 40 Score: 0.4 Time 0.000000s with size 10 Decision Space Time 0.033915s with size 10000

Train Naive Bayes Time 0.000995s with size 40 Score: 0.7 Time 0.000000s with size 10 Decision Space Time 0.001998s with size 10000


Data Set 5 Train BBClassifier Time 0.049865s with size 40 Score: 0.7 Time 0.006982s with size 10 Decision Space Time 7.270467s with size 10000

Train Nearest Neighbors Time 0.000996s with size 40 Score: 0.9 Time 0.000998s with size 10 Decision Space Time 0.008973s with size 10000

Train Decision Tree Time 0.000998s with size 40 Score: 0.8 Time 0.000000s with size 10 Decision Space Time 0.000000s with size 10000

Train Neural Net Time 0.241446s with size 40 Score: 0.3 Time 0.000000s with size 10 Decision Space Time 0.024933s with size 10000

Train Naive Bayes Time 0.000000s with size 40 Score: 0.7 Time 0.000000s with size 10 Decision Space Time 0.000998s with size 10000


Data Set 6 Train BBClassifier Time 0.041887s with size 39 Score: 0.0 Time 0.008977s with size 10 Decision Space Time 7.629043s with size 10000

Train Nearest Neighbors Time 0.000000s with size 39 Score: 1.0 Time 0.002026s with size 10 Decision Space Time 0.006981s with size 10000

Train Decision Tree Time 0.000999s with size 39 Score: 0.9 Time 0.000000s with size 10 Decision Space Time 0.000971s with size 10000

Train Neural Net Time 0.344298s with size 39 Score: 1.0 Time 0.000000s with size 10 Decision Space Time 0.024966s with size 10000

Train Naive Bayes Time 0.000000s with size 39 Score: 1.0 Time 0.000970s with size 10 Decision Space Time 0.000997s with size 10000


Data Set 7 Train BBClassifier Time 0.041857s with size 40 Score: 0.8 Time 0.007978s with size 10 Decision Space Time 7.199733s with size 10000

Train Nearest Neighbors Time 0.000997s with size 40 Score: 1.0 Time 0.002018s with size 10 Decision Space Time 0.007011s with size 10000

Train Decision Tree Time 0.000000s with size 40 Score: 1.0 Time 0.000000s with size 10 Decision Space Time 0.000998s with size 10000

Train Neural Net Time 0.296723s with size 40 Score: 1.0 Time 0.000000s with size 10 Decision Space Time 0.057845s with size 10000

Train Naive Bayes Time 0.000993s with size 40 Score: 1.0 Time 0.000000s with size 10 Decision Space Time 0.001998s with size 10000


Data Set 8 Train BBClassifier Time 0.042876s with size 40 Score: 0.9 Time 0.006984s with size 10 Decision Space Time 7.615486s with size 10000

Train Nearest Neighbors Time 0.000996s with size 40 Score: 1.0 Time 0.001029s with size 10 Decision Space Time 0.007949s with size 10000

Train Decision Tree Time 0.000000s with size 40 Score: 1.0 Time 0.000998s with size 10 Decision Space Time 0.001000s with size 10000

Train Neural Net Time 0.352059s with size 40 Score: 1.0 Time 0.000997s with size 10 Decision Space Time 0.031914s with size 10000

Train Naive Bayes Time 0.000998s with size 40 Score: 1.0 Time 0.000000s with size 10 Decision Space Time 0.001995s with size 10000


Data Set 9 Train BBClassifier Time 0.051861s with size 40 Score: 0.9 Time 0.009987s with size 10 Decision Space Time 7.890846s with size 10000

Train Nearest Neighbors Time 0.000000s with size 40 Score: 1.0 Time 0.000000s with size 10 Decision Space Time 0.005968s with size 10000

Train Decision Tree Time 0.000997s with size 40 Score: 1.0 Time 0.000998s with size 10 Decision Space Time 0.000000s with size 10000

Train Neural Net Time 0.427484s with size 40 Score: 1.0 Time 0.000000s with size 10 Decision Space Time 0.023936s with size 10000

Train Naive Bayes Time 0.000997s with size 40 Score: 1.0 Time 0.000000s with size 10 Decision Space Time 0.001997s with size 10000

[image: classifier_comparison]

The size of 10000 comes from the decision-space phase. I don't know how to change it, but I think it just depends on the training and test phases, whose sizes I changed to 40 and 10. BBClassifier doesn't seem to perform as it should with smaller subsets, or maybe I did the experiment wrong. Would you help me?

jacobeverist commented 3 years ago

I think you reduced the overall data, but should have reduced the training set only.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2, random_state=rand_seed)

Try increasing the test_size to something like 0.5 to 0.9 and see how that changes things. But keep the num_samples at 500.
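A minimal sketch of that sweep, assuming scikit-learn is installed. The `make_moons` dataset, the seed value, and `KNeighborsClassifier` are stand-ins chosen for illustration; the comparison script's own datasets and BBClassifier are not reproduced here.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

num_samples = 500  # total data stays fixed, as suggested above
rand_seed = 42     # assumed seed; the script's actual value is not shown here

# Stand-in two-class dataset with num_samples points.
X, y = make_moons(n_samples=num_samples, noise=0.3, random_state=rand_seed)

train_sizes = []
for test_size in (0.2, 0.5, 0.7, 0.9):
    # A larger test_size shrinks only the training portion of the same data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=rand_seed)
    clf = KNeighborsClassifier(n_neighbors=3)  # stand-in classifier
    clf.fit(X_train, y_train)
    score = clf.score(X_test, y_test)
    train_sizes.append(len(X_train))
    print(f"test_size={test_size}: train {len(X_train)}, "
          f"test {len(X_test)}, score {score:.2f}")
```

Because the total stays at 500, each step leaves fewer training points while the held-out set grows, so the scores reflect how well each classifier copes with less training data rather than with less data overall.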

AlexSteveChungAlvarez commented 3 years ago

I did, and these are the results:

For 0.5: [image: classifier_comparison]
For 0.7: [image: classifier_comparison70]
For 0.9: [image: classifier_comparison90]

Still can't see the results we want...

jacobeverist commented 3 years ago

Thanks for the results. This is very interesting.

It should be noted that this is a very low-dimensional dataset, there are only a handful of classes, and the ML algorithms it's being compared to are also very shallow learners. This experiment was set up to provide a visual explanation of how BrainBlocks learns the decision space compared to other ML algorithms. A more rigorous analysis would involve using high-dimensional datasets and state-of-the-art deep learning architectures.

However, the key takeaway from this set of results is that BrainBlocks leaves a large "region of ignorance", whereas many of the other classifiers report high confidence in regions of the data space where they have no experience. BrainBlocks gives you a built-in novelty detector, so you know when your inputs are out-of-family relative to your trained inputs. Other classifiers will not consistently give you the same courtesy.
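To make the "region of ignorance" idea concrete, here is a toy sketch. It is not how BrainBlocks works internally; it only contrasts a classifier that always answers with one that abstains far from its training data. The function names, the sample points, and the `max_distance` threshold are all hypothetical.

```python
import math


def nearest(train, query):
    """Return (label, distance) of the training point closest to query."""
    return min(
        ((label, math.dist(point, query)) for point, label in train),
        key=lambda pair: pair[1],
    )


def predict_plain(train, query):
    """Always answer with the nearest label, however far away it is."""
    return nearest(train, query)[0]


def predict_or_abstain(train, query, max_distance=1.0):
    """Answer only when the query lies near trained data; otherwise None."""
    label, distance = nearest(train, query)
    return label if distance <= max_distance else None


train = [((0.0, 0.0), "a"), ((0.2, 0.1), "a"),
         ((3.0, 3.0), "b"), ((3.1, 2.8), "b")]
in_family = (0.1, 0.1)        # close to the trained "a" points
out_of_family = (50.0, -40.0)  # far from everything seen in training

print(predict_plain(train, out_of_family))       # still returns a label
print(predict_or_abstain(train, out_of_family))  # None: "I don't know"
print(predict_or_abstain(train, in_family))
```

The abstaining variant trades coverage for honesty: it gives no answer for the far-away point, which is the behaviour described above as a built-in novelty detector.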

With regard to one-shot learning, we need to carefully define and scope what would count as a success. The term is usually used in the context of image classification with a defined scope of possible image variance, although I would need to investigate the specifics.

When I've used the term "one-shot learning", I have usually meant it in terms of learning time-series sequences. With our current SequenceLearner, we can learn a sequence in one pass and recognize it when it comes around again. In the past, we had to train on the same sequence multiple times with HTM and other time-series algorithms to learn it successfully. We stumbled on a way of learning this in one shot.
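As a rough illustration of what "learn in one pass, recognize on the second" means for sequences, here is a toy first-order transition memory. It is a deliberately simple stand-in, not the SequenceLearner algorithm, and the class and method names are hypothetical.

```python
class TransitionMemory:
    """Toy sequence memory that learns each transition the first time it sees it."""

    def __init__(self):
        self.seen = set()  # (previous, current) pairs observed so far
        self.prev = None

    def reset(self):
        """Forget the current position but keep what was learned."""
        self.prev = None

    def step(self, symbol, learn=True):
        """Return 1 if the transition into `symbol` is unfamiliar, else 0."""
        pair = (self.prev, symbol)
        surprised = self.prev is not None and pair not in self.seen
        if learn and self.prev is not None:
            self.seen.add(pair)
        self.prev = symbol
        return int(surprised)


memory = TransitionMemory()
first_pass = [memory.step(s) for s in "ABCD"]   # transitions are new
memory.reset()
second_pass = [memory.step(s) for s in "ABCD"]  # same sequence again
memory.reset()
altered = [memory.step(s, learn=False) for s in "ABXD"]  # one symbol changed

print(first_pass, second_pass, altered)
```

After a single pass the repeated sequence raises no surprise, while the altered one is flagged from the changed symbol onward. A first-order memory like this cannot tell apart sequences that share transitions, which is one reason real sequence learners are more involved.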

So maybe I was a little bit hopeful in saying that BrainBlocks can do "one-shot" learning without defining and scoping it and setting up experiments to prove it.

jacobeverist commented 1 year ago

Not an issue but an interesting experiment