cair / TsetlinMachine

Code and datasets for the Tsetlin Machine
https://arxiv.org/abs/1804.01508
MIT License

Accuracy of TsetlinMachine vs. MultiClassTsetlinMachine #3

Closed. WojciechMigda closed this issue 6 years ago

WojciechMigda commented 6 years ago

Hi,

I took the NoisyXORDemo.py example and replaced the multi-class model with its basic two-class version. The hyperparameters stayed the same; the only other change I had to make was removing the number-of-classes argument from the model constructor. I expected similar, if not identical, results, but the multi-class version seems to have an edge over the basic two-class implementation. For instance, when I repeatedly run the multi-class version I get:

wojtek@wojtek-dell:[(HEAD detached at 5678644)]:/repo/Tsetlin/TsetlinMachine-cair-wmigda$ python NoisyXORDemo.py 
Accuracy on test data (no noise): 1.0
Accuracy on training data (40% noise): 0.603

Prediction: x1 = 1, x2 = 0, ... -> y =  1
Prediction: x1 = 0, x2 = 1, ... -> y =  1
Prediction: x1 = 0, x2 = 0, ... -> y =  0
Prediction: x1 = 1, x2 = 1, ... -> y =  0

wojtek@wojtek-dell:[(HEAD detached at 5678644)]:/repo/Tsetlin/TsetlinMachine-cair-wmigda$ python NoisyXORDemo.py 
Accuracy on test data (no noise): 1.0
Accuracy on training data (40% noise): 0.603

Prediction: x1 = 1, x2 = 0, ... -> y =  1
Prediction: x1 = 0, x2 = 1, ... -> y =  1
Prediction: x1 = 0, x2 = 0, ... -> y =  0
Prediction: x1 = 1, x2 = 1, ... -> y =  0

wojtek@wojtek-dell:[(HEAD detached at 5678644)]:/repo/Tsetlin/TsetlinMachine-cair-wmigda$ python NoisyXORDemo.py 
Accuracy on test data (no noise): 1.0
Accuracy on training data (40% noise): 0.603

Prediction: x1 = 1, x2 = 0, ... -> y =  1
Prediction: x1 = 0, x2 = 1, ... -> y =  1
Prediction: x1 = 0, x2 = 0, ... -> y =  0
Prediction: x1 = 1, x2 = 1, ... -> y =  0

The accuracy on test data is almost always 1.0, and the accuracy on training data varies very little from 0.603.

For the basic two-class model I get:

wojtek@wojtek-dell:[(HEAD detached at 5678644)]:/repo/Tsetlin/TsetlinMachine-cair-wmigda$ python NoisyXORDemoBasic.py 
Accuracy on test data (no noise): 0.965
Accuracy on training data (40% noise): 0.595

Prediction: x1 = 1, x2 = 0, ... -> y =  1
Prediction: x1 = 0, x2 = 1, ... -> y =  1
Prediction: x1 = 0, x2 = 0, ... -> y =  0
Prediction: x1 = 1, x2 = 1, ... -> y =  0

wojtek@wojtek-dell:[(HEAD detached at 5678644)]:/repo/Tsetlin/TsetlinMachine-cair-wmigda$ python NoisyXORDemoBasic.py 
Accuracy on test data (no noise): 0.9004
Accuracy on training data (40% noise): 0.5832

Prediction: x1 = 1, x2 = 0, ... -> y =  1
Prediction: x1 = 0, x2 = 1, ... -> y =  1
Prediction: x1 = 0, x2 = 0, ... -> y =  0
Prediction: x1 = 1, x2 = 1, ... -> y =  0

wojtek@wojtek-dell:[(HEAD detached at 5678644)]:/repo/Tsetlin/TsetlinMachine-cair-wmigda$ python NoisyXORDemoBasic.py 
Accuracy on test data (no noise): 0.9476
Accuracy on training data (40% noise): 0.599

Prediction: x1 = 1, x2 = 0, ... -> y =  1
Prediction: x1 = 0, x2 = 1, ... -> y =  1
Prediction: x1 = 0, x2 = 0, ... -> y =  0
Prediction: x1 = 1, x2 = 1, ... -> y =  0

Both accuracy values are lower and vary significantly more from run to run. Do you have any comments on these results? Thanks.
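To put numbers on the difference, the accuracies pasted above can be summarized per model. The following is only a rough sketch: the `runs` dictionary and the `summarize` helper are names made up for illustration, the values are copied from the six runs shown in this comment, and three runs per model is far too few for firm conclusions.

```python
from statistics import mean, stdev

# Accuracies copied from the terminal output above (three runs per model).
runs = {
    "multi-class": {"test": [1.0, 1.0, 1.0], "train": [0.603, 0.603, 0.603]},
    "basic": {"test": [0.965, 0.9004, 0.9476], "train": [0.595, 0.5832, 0.599]},
}


def summarize(values):
    """Return (mean, sample standard deviation) for a list of accuracies."""
    return mean(values), stdev(values)


for model, splits in runs.items():
    for split, values in splits.items():
        avg, spread = summarize(values)
        print(f"{model:12s} {split:5s} mean={avg:.4f} stdev={spread:.4f}")
```

Running the demo many more times and feeding the results into the same helper would give a more trustworthy picture of the spread.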

olegranmo commented 6 years ago

Thanks for your accurate observation @WojciechMigda! The Multi-Class Tsetlin Machine, described in Section 6.1 of the paper, actually provides a more flexible representation of patterns than the basic Tsetlin Machine. This is because it tracks the learning of outputs '0' and '1' in two separate Tsetlin Automata structures (see Figure 9), which in turn helps it handle the large amount of noise in the dataset. Having two structures also makes it possible to compare their outputs by means of the argmax operation (again, see Figure 9). That being said, try setting the number of clauses to 10, 's' to 3.75, and the threshold 'T' to 25, and you should get performance similar to that of the Multi-Class Tsetlin Machine. Hope this answers your question!
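The difference between the two decision rules can be sketched in a few lines of plain Python. This is an illustration only, not the repository's code: the function names and the vote sums are hypothetical, and it assumes that the basic machine thresholds a single summed clause vote at zero while the multi-class machine keeps one vote sum per class and picks the largest.

```python
def basic_decision(vote_sum):
    """Basic two-class rule (assumed): threshold one summed clause vote at zero."""
    return 1 if vote_sum >= 0 else 0


def multiclass_decision(class_sums):
    """Multi-class rule (assumed): return the index of the largest per-class vote sum."""
    return max(range(len(class_sums)), key=lambda c: class_sums[c])


# Hypothetical vote sums, chosen only to illustrate the two rules.
single_sum = -2        # one structure votes slightly against output 1
per_class = [-3, 5]    # separate structures for output 0 and output 1

print("basic:", basic_decision(single_sum))
print("multi-class:", multiclass_decision(per_class))
```

With two separate structures, each output gets its own evidence, so a weak or noisy vote for one output can still be outweighed by a clear vote for the other.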