Open ricpruss opened 4 years ago
Hi @ricpruss! I have started writing documentation on Read the Docs, where I will add some guidelines for setting the hyperparameters.

In the meantime: T basically decides how many clauses are involved in recognizing each particular pattern. If T=5, only five clauses are needed to represent each sub-pattern. A larger T involves more clauses per pattern, which improves how well the patterns are represented. A rule of thumb seems to be that doubling T and the number of clauses together tends to increase accuracy.

The s parameter decides the precision of the rules (the frequency of the patterns they match). A larger s makes more detailed clauses that match lower-frequency patterns.

For s, I typically explore values in the range 1.0 to 20.0. For T, exploring the range 1 to 200 may be worthwhile. However, with weighted clauses, much larger T values must be used. With a max_weight of 255, multiplying T by 100 gives good results for classification, while multiplying by 10 seems to work well for regression.
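As a rough sketch of the rules of thumb above, the search ranges and the T scaling for weighted clauses could be written down like this. The helper names (`scale_T`, `double_capacity`, `candidate_grid`) and the particular grid points are made up for illustration and are not part of the library; only the ranges and the factors 100 and 10 come from this comment.

```python
from itertools import product

# Scaling factors quoted in this thread for max_weight=255.
# Nothing is stated for other max_weight values, so none are assumed here.
WEIGHTED_T_FACTOR = {"classification": 100, "regression": 10}


def scale_T(base_T, max_weight=1, task="classification"):
    """Hypothetical helper: adjust T when weighted clauses are used."""
    if max_weight == 1:
        return base_T  # unweighted clauses: use T as-is
    if max_weight == 255:
        return base_T * WEIGHTED_T_FACTOR[task]
    raise ValueError("no rule of thumb given for max_weight=%r" % max_weight)


def double_capacity(number_of_clauses, T):
    """Hypothetical helper: double T and the clause count together."""
    return 2 * number_of_clauses, 2 * T


def candidate_grid(s_values=(1.0, 5.0, 10.0, 20.0),
                   T_values=(1, 10, 50, 200),
                   max_weight=1, task="classification"):
    """Build (s, T) candidates inside the suggested ranges.

    The specific grid points are arbitrary picks within
    s in [1.0, 20.0] and T in [1, 200].
    """
    return [
        {"s": s, "T": scale_T(T, max_weight, task), "max_weight": max_weight}
        for s, T in product(s_values, T_values)
    ]


if __name__ == "__main__":
    for params in candidate_grid(max_weight=255, task="regression"):
        print(params)
```

Each dictionary could then be passed as keyword arguments to the constructor, together with a clause count, and scored on a validation set.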
Thanks for that. The intuition behind that really works very well. You have created something amazing, I cannot believe everyone is not talking about this.
s and T are defined in the paper, but neither the code nor the paper really explains them. For applying these models, a little guidance on the effect of each value would be super interesting.
class MultiClassTsetlinMachine(CommonTsetlinMachine):
    def __init__(self, number_of_clauses, T, s, boost_true_positive_feedback=1, number_of_state_bits=8, append_negated=True, max_weight=1, number_of_classes=None):