lacava / few

a feature engineering wrapper for sklearn
https://lacava.github.io/few
GNU General Public License v3.0

Issues with current ML validation score #40

Open GinoWoz1 opened 5 years ago

GinoWoz1 commented 5 years ago

Hello,

Thanks for the help so far. I was able to get the tool up and running on Windows.

However, I am observing two odd things.

1) When I use Gradient Boosting Regressor, my score gets worse with each generation, even after I switched the sign of the scoring function. The first score is nearly the best score I have gotten on my own (no feature engineering done on the data set).

https://github.com/GinoWoz1/AdvancedHousePrices/blob/master/FEW_GB.ipynb

2) When I use Random Forest with the same scorer, the current ML validation score returns 0 and the run finishes really fast.

https://github.com/GinoWoz1/AdvancedHousePrices/blob/master/FEW_RF.ipynb

I think I am missing something about how to use this tool, but I have no idea what. I am trying to use it in tandem with TPOT, as I am exploring GA/GP-based feature creation tools. I sincerely appreciate any advice or guidance you can provide.

Sincerely, G

GinoWoz1 commented 5 years ago

Hello @lacava

Sorry for the bother, but have you had a chance to look at this? I have been working with TPOT for the last 4 months and have talked to Randy Olson a few times; he referred me to Few, and I am hoping to run some tests with Few and TPOT over the winter. My name is Justin Joyce, and I am currently exploring multiple genetic algorithm and genetic programming methods as a master's student.

Sincerely, Justin

lacava commented 5 years ago

Hi Justin, I did look at it and ran it a couple of times. It looks like there is a small bug in Few: it prints that the current ML validation score is 0 when it is not, as shown by the internal CV score that is also printed.
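Until that's patched, one way to sanity-check the printed value is to score a held-out split yourself. A minimal sketch, assuming the fitted FEW model exposes the usual sklearn `predict` interface (the constructor settings here are illustrative, not tuned):

```python
from few import FEW
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# hold out a validation split so the score can be verified independently
X_tr, X_val, y_tr, y_val = train_test_split(X_train.values, y_train,
                                            random_state=0)

model = FEW(generations=50, population_size=50)  # illustrative settings
model.fit(X_tr, y_tr)

# compute the validation score directly rather than trusting the log line
print('validation R^2:', r2_score(y_val, model.predict(X_val)))
```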

Otherwise, this just seems to be a dataset that is not amenable to feature learning. I have found that when Few is paired with gradient boosting or other high-capacity methods, it is quite difficult to find a transformation of the data that improves the underlying ML model. Using Lasso, I was occasionally able to find a reduced feature space, but not one that dramatically improved the score.
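For what it's worth, here is a minimal sketch of that Lasso pairing, assuming Few's `ml` parameter accepts an arbitrary sklearn regressor (the settings are illustrative, not the exact ones I ran):

```python
from few import FEW
from sklearn.linear_model import LassoLarsCV
from sklearn.metrics import r2_score

# drive the feature search with a linear learner; a lower-capacity model
# leaves more room for engineered features to improve the fit
model = FEW(ml=LassoLarsCV(), generations=100, population_size=100)
model.fit(X_train.values, y_train)

print('train R^2:', r2_score(y_train, model.predict(X_train.values)))
```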

You may also be interested in trying Feat, a more powerful version of Few that I have been working on for the past year. It has a similar sklearn interface, uses a GA to drive the search, and includes neural network activation functions and backprop for learning weights. Here's the result of running it:

```python
from feat import Feat
from sklearn.metrics import r2_score

# evolve for up to 1000 generations, stopping after 100 without improvement;
# backprop learns the weights of the candidate features
learner = Feat(gens=1000, max_stall=100, pop_size=100, backprop=True,
               verbosity=2,
               max_dim=50,
               feature_names=','.join(X_train.columns))
X = X_train.values
y = y_train
learner.fit(X, y)

print('final score: {}'.format(r2_score(y_train, learner.predict(X))))
print('model:\n', learner.get_model())
```

```
final score: 0.9004700679745707
model:
Feature                          Weight
relu(2ndFlrSF)                   4575965.203086
(2ndFlrSF^2)                    -2993884.266219
(2ndFlrSF^3)                     2704420.880597
2ndFlrSF                        -2653911.531777
relu(2ndFlrSF)                  -2328605.023701
(GrLivArea^3)                    -972888.992427
LotArea                           829523.137047
YearBuilt                         741179.594699
relu(GrLivArea)                   689061.927277
relu(OverallQual)                 630256.458467
relu(LotArea)                    -593285.483499
TotalBsmtSF                       536721.385827
float(OverallCond)                408904.496879
sqrt(|YearBuilt|)                 341173.116787
1stFlrSF                          340342.422143
float(GarageCars)                 286630.673027
OverallQual                       230831.115685
(OverallQual*GarageArea)          214190.124423
BsmtFinSF1                        192567.902371
(TotalBsmtSF+BsmtUnfSF)          -192281.015785
relu(OverallQual)                 189096.541817
float(Fireplaces)                 183822.299462
GrLivArea                         155897.078241
float(Condition2_Norm)            116659.048371
ScreenPorch                       102781.956623
float(Neighborhood_OldTown)      -100922.111480
float(HalfBath)                    97458.382917
```

The downside is that you can't specify your own scoring_function at the moment.
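In the meantime, you can always apply your own metric to Feat's predictions after fitting. A small sketch using only sklearn metrics on the `learner` fit above:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

y_pred = learner.predict(X)  # X = X_train.values from the snippet above

# score with whatever metric you actually care about, post hoc
print('RMSE:', np.sqrt(mean_squared_error(y_train, y_pred)))
print('MAE :', mean_absolute_error(y_train, y_pred))
```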

lacava commented 5 years ago

> When I use Gradient Boosting Regressor, my score gets worse with each generation, even after I switched the sign of the scoring function. The first score is nearly the best score I have gotten on my own (no feature engineering done on the data set).

This I did not observe. I did observe that Few did not find better features, but the internal CV score stayed constant, as it should.