Closed WojciechMigda closed 6 years ago
Thanks for your accurate observation, @WojciechMigda! The Multi-Class Tsetlin Machine, described in Section 6.1 of the paper, actually provides a more flexible representation of patterns than the basic Tsetlin Machine. This is because it tracks the learning of outputs '0' and '1' in two separate Tsetlin Automata structures (see Figure 9), which in turn helps it handle the large amount of noise in the dataset. It also makes it possible to compare the outputs of the two structures by means of the argmax operation (again, see Figure 9). That being said, try setting the number of clauses to 10 and 's' to 3.75, and increasing the threshold 'T' to 25; you should then get performance similar to that of the Multi-Class Tsetlin Machine. Hope this answers your question!
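To make the argmax comparison a bit more concrete, here is a minimal sketch in plain Python. It does not use the repository's code. The function names (`class_sum`, `predict_basic`, `predict_multiclass`) are hypothetical, and the sketch assumes the usual convention that clauses alternate between positive and negative polarity and that the vote sum is clamped to [-T, T].

```python
from typing import Sequence


def class_sum(clause_outputs: Sequence[int], T: int) -> int:
    """Sum clause votes and clamp the result to [-T, T].

    Assumption: even-indexed clauses vote for the class (+1),
    odd-indexed clauses vote against it (-1).
    """
    votes = sum(out if j % 2 == 0 else -out
                for j, out in enumerate(clause_outputs))
    return max(-T, min(T, votes))


def predict_basic(clause_outputs: Sequence[int], T: int) -> int:
    """Basic two-class sketch: one clause bank, thresholded at zero."""
    return 1 if class_sum(clause_outputs, T) >= 0 else 0


def predict_multiclass(banks: Sequence[Sequence[int]], T: int) -> int:
    """Multi-class sketch: one clause bank per class, argmax over the sums."""
    sums = [class_sum(bank, T) for bank in banks]
    # Ties resolve to the lowest class index.
    return max(range(len(sums)), key=sums.__getitem__)


if __name__ == "__main__":
    T = 25  # threshold suggested above
    bank_for_0 = [1, 0, 1, 1]  # illustrative clause outputs, not real data
    bank_for_1 = [1, 0, 1, 0]
    print(predict_multiclass([bank_for_0, bank_for_1], T))
    print(predict_basic(bank_for_1, T))
```

The basic version has only one vote sum to threshold, while the multi-class version learns a separate sum for each output and picks the larger one, which is the extra flexibility described above.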
Hi,
I took the NoisyXORDemo.py example and simply replaced the multi-class model with its basic two-class version. The hyperparameters remained the same; the only extra tweak I needed was removing the number-of-classes argument from the model construction. I expected to see similar, if not identical, results, but the multi-class version seems to have an edge over the simple two-class implementation. For instance, when I repeatedly run the multi-class version, I get: the accuracy on test data is almost always 1.0, and the accuracy on training data varies very little from 0.603.
For the two-class basic model I get
Both accuracy values vary significantly more and are worse. Do you have any comments on these results? Thanks.