Open traversc opened 7 years ago
Honestly, I'm not impressed at all, and I feel it should not be covered favorably. I like the layer structure and appreciate the difficulty of getting this to work well on images. However, even if we grant that this approach performs well, it is not nearly as adaptable as neural networks. For example, could this work for learning Go, poker, quantum chemistry, or Word2Vec-style embeddings? I'm doubtful. Optimistically, it works on standard supervised tasks, but it is also very algorithm-heavy (compared with the ease of doing deep learning with modern libraries) and cannot even make use of simple tricks. The final output can approximate a probability distribution (good), but it is a discretized approximation that is not easily differentiable (bad).
Perhaps more importantly, with modern activation functions like ReLU, it is possible to represent neural networks as a set of rules too. I'm not sure what the motivation is to look for "an alternative to deep learning". The reasoning they give is specious.
I think that forests are easier to analyze than SVMs, but given the rapidly growing body of literature on the theoretical analysis of deep learning, it is hard to see how "difficulty in analyzing" DL is a valid problem to overcome. If anything, the discrete nature of trees makes theoretical analysis more difficult than it is for DL.
Likewise, the complaint about small-scale training is specious. Often this is resolved by reducing the complexity of the network (fewer layers), by better structures, by autoencoding (not done in this study), by multitask learning, or by pretraining (or some combination of these). This is not (currently) automated in DL, so special care needs to be taken to match the number of parameters to the data in comparisons like these (which the authors did not do). In contrast (a genuinely good feature), decision trees automatically scale to the size of the data, so this is embedded in the framework. However, this is by no means a fixed attribute of DL: in domains where people are data-poor, it is easy to envision automated approaches to scaling DL to the problem size.
Sorry to be so negative about this. The one redeeming point of the article is the multilayered tree architecture, which is interesting. But it is not at all clear how useful it will be.
Thanks for sharing your expertise! However, when you say that neural networks are adaptable and can learn things like Go, isn't that an oversimplification? It's not that a neural network alone can solve Go; AlphaGo used a combination of deep learning and tree search, as in a chess engine.
Yes, but part of deep learning (as we have defined it) is a collection of approaches, which includes training strategies.
For example, policy learning and reinforcement learning have been critical in applying DL to games like Go. However, it is not clear that DeepTrees could be adapted to use them. Usually, you need a well-calibrated probability output to use these more advanced training strategies, and it is not clear whether trees could provide one without the dramatic loss of computational efficiency that comes from increasing the number of tree samples. And I would emphasize again that policy learning and reinforcement learning are DL techniques.
The tree search algorithm is actually part of the policy learning, by the way, so it is not correct to separate the two. It is more accurate to say that the network is learning how to tree-search efficiently.
The bigger problem with this line of inquiry, though, is that we need to see some foreseeable advantage of trees over DL beyond minor performance differences.
The only potential improvements I can see are that (1) forests are probably easier to parallelize (embarrassingly so), (2) it might be possible to use trees as a type of layer composed with other strategies, which could be useful in certain domains, especially dense embeddings, and (3) forests automatically scale to problem size, which is a very nice and useful feature.
Of these, only #3 is demonstrated here (indirectly); the rest is speculative.
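On point (1), the embarrassing parallelism is already exposed in standard libraries, since each tree in a forest trains independently. A quick scikit-learn illustration (my own toy example, not tied to gcForest):

```python
from time import perf_counter

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy classification data; any tabular dataset would behave the same way.
X, y = make_classification(n_samples=2000, n_features=50, random_state=0)

# Each tree is grown independently, so training parallelizes across all
# cores trivially via n_jobs (n_jobs=-1 uses every available core).
for n_jobs in (1, -1):
    clf = RandomForestClassifier(n_estimators=100, n_jobs=n_jobs, random_state=0)
    start = perf_counter()
    clf.fit(X, y)
    print(f"n_jobs={n_jobs}: {perf_counter() - start:.2f}s")
```

No gradient synchronization across workers is needed, which is what makes this "embarrassing" compared to data-parallel DL training.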
Hello, I would like to apply gcForest to my logo dataset, but I don't know where to start. First of all, I need an implementation. Is there any implementation of gcForest other than the Python version? For example, C++ or R would be great. Can anybody help or guide me? I am seriously interested in this method.
Any kind of help will be appreciated.
@asfix wrong place. This repository is a review paper on deep learning methods, not a help channel for those methods. Have you tried getting in touch with the paper authors?
However, feel free to report back anything you learn. If there is no open source implementation of gcForest, that is something we should take into account when discussing (or not discussing) the paper.
@dhimmel oh sorry. By the way, I searched but had not found anything as of yesterday. I will search again and report back if I find something. Thanks anyway.
Deep Forest is a stacking fest, nothing special about it (the technique has existed for a while under the name "stacking", but no one made a name out of it).
See here for a more detailed discussion of the issues with Deep Forest and irregularities found in the paper and about the authors: https://github.com/Microsoft/LightGBM/issues/331 - with independent implementation tests.
Hello, I thought it was an interesting paper, and I like the idea of keeping some of the neural-net trickery (cascades, multi-grained scanning) while changing the engine (to forests). However, it wasn't at all clear to me how gcForest is actually trained, since the authors seem to muddle up training and testing and address both at the same time. E.g., does the input feature vector in Fig. 1 contain only one sample, or can it be a matrix (with positive and negative examples for training)? Do you then concatenate the class vectors with the input matrix for the next level? And stop when your class vectors are good enough? This was not at all clear to me. Did anyone work out the training part? Thanks a lot.
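My reading of the cascade (which may not match the authors' code exactly) is: each level's forests produce class-probability vectors via cross-validation on the training matrix, those vectors are concatenated with the original features, and the result feeds the next level, stopping when validation accuracy plateaus. A rough sketch of that interpretation with scikit-learn:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_predict

# Toy stand-in for the training matrix (rows = samples, 2 classes).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

features = X  # level input: original features (+ class vectors after level 1)
for level in range(3):
    class_vectors = []
    # Each cascade level holds several forests; their out-of-fold
    # class-probability vectors become extra features for the next level.
    for Forest in (RandomForestClassifier, ExtraTreesClassifier):
        clf = Forest(n_estimators=30, random_state=level)
        proba = cross_val_predict(clf, features, y, cv=3, method="predict_proba")
        class_vectors.append(proba)
    # Concatenate the class vectors with the *original* input, per Fig. 1.
    features = np.hstack([X] + class_vectors)

print(features.shape)  # (200, 24): 20 original features + 2 forests x 2 classes
```

In the paper the stopping rule is based on held-out performance; the fixed three levels above are just to keep the sketch short.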
https://arxiv.org/pdf/1702.08835.pdf
A really quite interesting approach. They combine a strength of random forests (no need for parameter tuning) with a strength of deep learning (convolution-like multi-grained scanning). It could turn out to be extremely good. However, I would say that they haven't proven their point: on the test datasets, the performance is not much better than that of classic algorithms like SVMs. I think they will need to use harder image datasets, where classic algorithms do not stand a chance against deep learning.