connollyst opened this issue 9 years ago
Hi Sean,
Thanks for the comments! In general I think you can put any classifier on top of the ensemble, not just a linear system. Many people use SVMs or decision trees, for example. You could probably also use an HTM on top, though I haven't really thought through how you could do that. It's definitely an interesting idea!
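As a very rough sketch (not the code from this experiment), you could stack each HTM's prediction into a feature matrix and fit any standard regressor on top. Here scikit-learn's SVR stands in for the combiner, and the data is just random placeholder values:

```python
import numpy as np
from sklearn.svm import SVR  # any regressor or classifier could go here

# Placeholder data: each column is one HTM's prediction for the target,
# each row is one time step.
ensemble_predictions = np.random.rand(1000, 5)   # shape (n_steps, n_models)
actual_values = np.random.rand(1000)             # shape (n_steps,)

# Train the combiner on the ensemble's outputs rather than on the raw input.
combiner = SVR(kernel="rbf")
combiner.fit(ensemble_predictions, actual_values)

# At run time, gather the current prediction from every HTM and let the
# combiner produce the ensemble's final estimate.
current_predictions = np.random.rand(1, 5)
final_prediction = combiner.predict(current_predictions)[0]
```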
I also think you naturally get some of the benefits of ensembles with hierarchical systems. In some sense, both decompose the input space into subsystems, and the higher level combines those subsystems to make the final decision.
Cheers,
--Subutai
Hi Subutai,
I was just watching your discussion in the Jan 2015 office hour regarding this experiment. Very cool stuff; it seems a sensible step forward to me. A question comes to mind, though; I'd like to get your feedback & hope this is the correct venue. I'm just coming up to speed with CLA/HTM/Numenta, so please correct me if I'm mistaken about anything here.
In my understanding of HTM, the hierarchy refers to the layers of neurons, where the excitation (input) to the neurons in one layer is the activity (output) of the neurons in the previous layer, with the base layer's neurons receiving raw sensor stimuli. Within each layer, the neurons excite & inhibit each other dynamically through horizontal connections.
The advantage of an ensemble of HTMs, then, is that 1) there are no horizontal connections between the neurons or layers of different HTMs, and 2) each HTM gets to learn its own patterns using its own parameters.
The main issue you explored in this project was how to interpret the different predictions from each HTM. Would it be sensible to just slap another HTM on top of the ensemble?
This top HTM would differ from the normal hierarchical concept in HTM, as there would be no connection between its neurons and those of the ensemble members. It would receive scalar input, namely the ensemble's predictions, and output a prediction itself.
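To make the idea concrete, here is a rough sketch of the wiring I'm picturing. StubHTM is a hypothetical stand-in I made up so the example runs; it is not NuPIC API, and a real HTM model would go in its place:

```python
class StubHTM(object):
    """Hypothetical placeholder for a real HTM model (not NuPIC API);
    it only exists so the ensemble wiring can be shown end to end."""

    def __init__(self, params):
        self.params = params

    def predict(self, values):
        # A real HTM would learn temporal patterns; the stub just returns
        # a trivial function of its input so the example runs.
        if isinstance(values, (list, tuple)):
            return sum(values) / float(len(values))
        return values

# 1) Independent ensemble members, each with its own parameters and no
#    horizontal connections between them.
member_params = [
    {"columnCount": 1024, "cellsPerColumn": 8},
    {"columnCount": 2048, "cellsPerColumn": 16},
    {"columnCount": 2048, "cellsPerColumn": 32},
]
members = [StubHTM(p) for p in member_params]

# 2) The top HTM never sees the raw sensor data; its input is only the
#    scalar predictions coming out of the ensemble.
top_htm = StubHTM({"columnCount": 2048, "cellsPerColumn": 16})

def ensemble_step(raw_value):
    member_predictions = [m.predict(raw_value) for m in members]
    return top_htm.predict(member_predictions)

print(ensemble_step(42.0))
```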
Perhaps another approach would be to use the top HTM not to make a prediction itself, but to predict appropriate weights for the ensemble members, as you did with least squares.
Perhaps this would be a bit weird, as the initial predictions are for a future time step (+1, +5, whatever) while the top HTM would assume they describe the current time step - I think that's fine as long as we interpret the output correctly.
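For comparison, my understanding of the fixed least-squares weighting is roughly the following (plain numpy on placeholder data, not the code actually used in the experiment):

```python
import numpy as np

# Placeholder data: each column is one HTM's prediction at some horizon,
# each row is one time step; actuals holds the values later observed.
predictions = np.random.rand(500, 5)   # shape (n_steps, n_models)
actuals = np.random.rand(500)          # shape (n_steps,)

# Solve for the single set of weights that best combines the members'
# predictions in the least-squares sense.
weights = np.linalg.lstsq(predictions, actuals, rcond=None)[0]

# The ensemble's combined prediction is then just the weighted sum.
combined = predictions.dot(weights)
```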
I'll leave it at that. Is there any sense behind this? I'd love to get your input!
Cheers, Sean