pylablanche / gcForest

Python implementation of deep forest method : gcForest
MIT License

I tested the code on MNIST and other structured data #5

Closed Song-xx closed 6 years ago

Song-xx commented 7 years ago

But my results show that gcForest is not so powerful compared with a CNN and XGBoost. Am I doing something wrong?

pylablanche commented 7 years ago

Probably not. Some people have already pointed out that gcForest does not always perform better, or significantly better, than CNNs and XGBoost (see the discussion here: https://github.com/Microsoft/LightGBM/issues/331#issuecomment-307401782 )

How much accuracy do you get on the MNIST dataset?

Song-xx commented 7 years ago

Huh, so... I replied to the email and it posted on GitHub. Interesting...

Song-xx commented 7 years ago

I tried three times, meaning I used three groups of parameters. The best accuracy of gcForest on MNIST was 83%, which took 3 hours to train. For the CNN, 99% in 1 hour.

pylablanche commented 7 years ago

@fresh2tensorflow I am not so surprised about the accuracy if you used gcForest out of the box. Have you tried to replicate the architecture from the paper? (i.e. 1000 trees per forest, 2 forests of each type, etc.) Actually, if you could share the parameters you used, that would be valuable information for future users.

Did you also build your own ConvNet, or was it existing code?
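For reference, a paper-style configuration might look like the sketch below. The parameter names follow this repo's `gcForest` class as I understand it, but they are assumptions here; please check `GCForest.py` and its docstrings before relying on them.

```python
from GCForest import gcForest  # class provided by this repo's GCForest.py

# Sketch of the architecture described in the paper:
# 2 random forests of each type per cascade layer, 1000 trees per cascade
# forest, and multi-grain scanning windows for 28x28 MNIST images.
# Parameter names are assumptions; verify against the repo before use.
gcf = gcForest(
    shape_1X=[28, 28],     # shape of a single MNIST sample
    window=[7, 10, 14],    # sliding-window sizes for multi-grain scanning
    n_cascadeRF=2,         # forests of each type per cascade layer
    n_cascadeRFtree=1000,  # trees per cascade forest
    tolerance=0.0,         # accuracy gain required to add a cascade layer
)
# gcf.fit(X_train, y_train)
# y_pred = gcf.predict(X_test)
```

With 1000 trees per forest this will be much slower than the defaults, which is why training on a subsample first (as suggested below in this thread) is a good idea.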

Song-xx commented 7 years ago

thank you for your reply :) (1)the first question: I didn't do it..I will try it then (2)the second question:I use the convnet of the demo in the official tensorflow tutorial.It is a very simple convnet with just 2 conv-layer and 1 dense layer..I think 99.2% is not a bad result,so I didn't
tune the parameter to get a better result..


Laurae2 commented 7 years ago

@fresh2tensorflow Try using only a subsample so you can train faster and check performance differences more quickly.

Example on MNIST 1k samples:

[image: MNIST benchmark results]

Song-xx commented 7 years ago

@Laurae2 I get it, thank you. :)

pylablanche commented 7 years ago

Thanks @Laurae2! By the way, have you seen this: https://arxiv.org/pdf/1705.07366.pdf

@fresh2tensorflow Would you mind sharing your results on the (mini) MNIST dataset? I'm curious to know what you get!!!

Song-xx commented 7 years ago

@pylablanche Oh, I'm sorry, I didn't try the algorithm on a mini MNIST. I think that's a good idea when exploring the data. I then tried gcForest with the parameters from the paper (1000 trees per forest, 2 forests of each type), but its accuracy didn't improve much.