Open traversc opened 7 years ago
Honestly, I'm not impressed at all, and I feel it should not be covered favorably. I like the layer structure and appreciate the difficulty of getting this to work well on images. However, even if we grant that this approach performs well, it is not nearly as adaptable as neural networks. For example, could this work for learning Go, poker, quantum chemistry, or Word2Vec-style embeddings? I'm doubtful. Optimistically, it works on standard supervised tasks, but it is also very algorithm-heavy (compared with the ease of doing deep learning with modern libraries) and cannot even make use of simple tricks. The final output can approximate a probability distribution (good), but it is a discretized approximation that is not easily differentiable (bad).
Perhaps more importantly, with modern activation functions like ReLU, it is possible to represent neural networks as a set of rules too. I'm not sure what the motivation is to look for "an alternative to deep learning". The reasoning they give is specious.
I think that forests are easier to analyze than SVMs, but given the rapidly growing body of literature on the theoretical analysis of deep learning, it is hard to see how "difficulty in analyzing" DL is a valid problem to overcome. If anything, the discrete nature of trees makes theoretical analysis more difficult than it is for DL.
Likewise, the complaint about small-scale training is specious. Often this is resolved by reducing the complexity of the network (fewer layers), by better structures, by autoencoding (not done in this study), by multitask learning, or by pretraining (or some combination of these). This is not (currently) automated in DL, so special care needs to be taken to match the number of parameters to the data in comparisons like these (which the authors did not do). In contrast (a genuinely good feature), decision trees automatically scale to the size of the data, so this is embedded in the framework. However, this is by no means a fixed attribute of DL: in domains where people are data-poor, it is easy to envision automated approaches to scaling DL to the problem size.
Sorry to be so negative about this. The one redeeming point of the article is the multilayered tree architecture, which is interesting. But it is not at all clear how useful it will be.
Thanks for sharing your expertise! However, when you say that neural networks are adaptable and can learn things like Go, isn't that an oversimplification? It's not that a neural network alone can solve Go; AlphaGo used a combination of deep learning and tree search, as in a chess engine.
Yes, but part of deep learning (as we have defined it) is a collection of approaches, which includes training strategies.
For example, policy learning and reinforcement learning have been critical in applying DL to games like Go. However, it is not clear that DeepTrees could be adapted to use them. Usually, you need a well-calibrated probability output to use these more advanced training strategies, and it is not clear whether trees could provide one without the dramatic loss of computational efficiency that comes from increasing the number of tree samples. And I would emphasize again that policy learning and reinforcement learning are DL techniques.
The tree search algorithm is actually part of the policy learning, by the way, so it is not correct to separate the two. It is more accurate to say that the network is learning how to tree-search efficiently.
The bigger problem with this line of inquiry, though, is that we need to see some foreseeable advantage of trees over DL beyond minor performance differences.
The only potential improvements I can see are that (1) forests are probably easier to parallelize (embarrassingly so), (2) it might be possible to use trees as a type of layer composed with other strategies, which could be useful in certain domains, especially dense embeddings, and (3) forests automatically scale to problem size, which is a very nice and useful feature.
Of these, only #3 is demonstrated here (indirectly); the rest is speculative.
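On point (1), the embarrassing parallelism is already exposed in standard libraries, since each tree in a forest trains independently. A quick scikit-learn illustration (my own toy example, not tied to gcForest):

```python
from time import perf_counter

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy classification data; any tabular dataset would behave the same way.
X, y = make_classification(n_samples=2000, n_features=50, random_state=0)

# Each tree is grown independently, so training parallelizes across all
# cores trivially via n_jobs (n_jobs=-1 uses every available core).
for n_jobs in (1, -1):
    clf = RandomForestClassifier(n_estimators=100, n_jobs=n_jobs, random_state=0)
    start = perf_counter()
    clf.fit(X, y)
    print(f"n_jobs={n_jobs}: {perf_counter() - start:.2f}s")
```

No gradient synchronization across workers is needed, which is what makes this "embarrassing" compared to data-parallel DL training.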
Hello, I would like to apply gcForest to my logo dataset, but I don't know where to start. First of all, I need an implementation. Is there any implementation of gcForest other than the Python version? For example, C++ or R would be great. Can anybody help or guide me? I am seriously interested in this method.
Any kind of help will be appreciated.
@asfix wrong place. This repository is a review paper on deep learning methods, not a help channel for those methods. Have you tried getting in touch with the paper authors?
However, feel free to report back anything you learn. If there is no open source implementation of gcForest, that is something we should take into account when discussing (or not discussing) the paper.
@dhimmel oh sorry. By the way, I searched but had not found anything as of yesterday. I will search again and report back if I find something. Thanks anyway.
Deep Forest is a stacking fest, nothing special about it (the technique has existed for a while under the name "stacking", but no one made a name out of it).
See here for a more detailed discussion of the issues with Deep Forest and irregularities found in the paper and about the authors: https://github.com/Microsoft/LightGBM/issues/331 - with independent implementation tests.
Hello, I thought it was an interesting paper, and I like the idea of keeping some of the neural-net trickery (cascades, multi-grained scanning) while changing the engine (to forests). However, it wasn't at all clear to me how gcForest is actually trained, since the authors seem to muddle up training and testing and address both at the same time. E.g., does the input feature vector in Fig. 1 contain only one sample, or can it be a matrix (with positive and negative examples for training)? Do you then concatenate the class vectors with the input matrix for the next level? And stop when your class vectors are good enough? This was not at all clear to me. Did anyone work out the training part? Thanks a lot.
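My reading of the cascade (which may not match the authors' code exactly) is: each level's forests produce class-probability vectors via cross-validation on the training matrix, those vectors are concatenated with the original features, and the result feeds the next level, stopping when validation accuracy plateaus. A rough sketch of that interpretation with scikit-learn:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_predict

# Toy stand-in for the training matrix (rows = samples, 2 classes).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

features = X  # level input: original features (+ class vectors after level 1)
for level in range(3):
    class_vectors = []
    # Each cascade level holds several forests; their out-of-fold
    # class-probability vectors become extra features for the next level.
    for Forest in (RandomForestClassifier, ExtraTreesClassifier):
        clf = Forest(n_estimators=30, random_state=level)
        proba = cross_val_predict(clf, features, y, cv=3, method="predict_proba")
        class_vectors.append(proba)
    # Concatenate the class vectors with the *original* input, per Fig. 1.
    features = np.hstack([X] + class_vectors)

print(features.shape)  # (200, 24): 20 original features + 2 forests x 2 classes
```

In the paper the stopping rule is based on held-out performance; the fixed three levels above are just to keep the sketch short.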
https://arxiv.org/pdf/1702.08835.pdf
A really quite interesting approach. They combine a strength of random forests (no need for parameter tuning) with a strength of deep learning (convolution-like multi-grained scanning). It could turn out to be extremely good. However, I would say that they haven't proven their point: on the test datasets, the performance is not much better than that of classic algorithms like SVMs. I think they will need to use harder image datasets, where classic algorithms do not stand a chance against deep learning.