yinlou / mltk

Machine Learning Tool Kit
BSD 3-Clause "New" or "Revised" License

How to run GA2M with FAST? #22

Open dfrankow opened 5 years ago

dfrankow commented 5 years ago

I've been reading the docs trying to create an example of classification with GA2MLearner using command-line tools.

I checked in an example script, train_ga2m.sh, at https://github.com/dfrankow/mltk/tree/master/examples. You should be able to check it out and run it. If I can get it to work, I'm happy to contribute it back as an example, as requested in #17.

Several questions:

dfrankow commented 5 years ago

@sds-dubois - any suggestions? It looks like you used GA2Ms in #7.

yinlou commented 5 years ago

Here is a script that runs GA2M end to end. You might find mltk.predictor.evaluation.Predictor and mltk.predictor.gam.interaction.FAST useful. I will update the wiki soon.

# Path to the MLTK jar; adjust for your machine.
MLTK=/Users/yin_lou/repos/mltk-github/mltk/target/mltk-0.1.0-SNAPSHOT.jar

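# Compute bins from the full training set, write the binned attribute file,
# and discretize the full training set.
java -Xmx4g -cp $MLTK mltk.core.processor.Discretizer \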
-r cal_housing.attr \
-t cal_housing.train.all \
-m cal_housing_binned.attr \
-i cal_housing.train.all \
-o cal_housing_binned.train.all

# Apply the same bins (cal_housing_binned.attr) to the train, valid, and test splits.
java -Xmx4g -cp $MLTK mltk.core.processor.Discretizer \
-r cal_housing.attr \
-d cal_housing_binned.attr \
-i cal_housing.train \
-o cal_housing_binned.train

java -Xmx4g -cp $MLTK mltk.core.processor.Discretizer \
-r cal_housing.attr \
-d cal_housing_binned.attr \
-i cal_housing.valid \
-o cal_housing_binned.valid

java -Xmx4g -cp $MLTK mltk.core.processor.Discretizer \
-r cal_housing.attr \
-d cal_housing_binned.attr \
-i cal_housing.test \
-o cal_housing_binned.test

# Train the main-effects GAM (no pairwise interactions yet).
java -Xmx4g -cp $MLTK mltk.predictor.gam.GAMLearner \
-r cal_housing_binned.attr \
-t cal_housing_binned.train \
-v cal_housing_binned.valid \
-m 1000 \
-l 1 \
-o gam.model

# Score the full training set with the GAM and write its residuals.
java -Xmx4g -cp $MLTK mltk.predictor.evaluation.Predictor \
-r cal_housing_binned.attr \
-d cal_housing_binned.train.all \
-m gam.model \
-g r \
-R cal_housing_residual.txt

# Run FAST on the residuals to produce the list of feature pairs.
java -Xmx4g -cp $MLTK mltk.predictor.gam.interaction.FAST \
-r cal_housing_binned.attr \
-d cal_housing_binned.train.all \
-R cal_housing_residual.txt \
-o pairs.txt

# Train GA2M, starting from gam.model and using the pairs in pairs.txt.
java -Xmx4g -cp $MLTK mltk.predictor.gam.GA2MLearner \
-r cal_housing_binned.attr \
-t cal_housing_binned.train \
-v cal_housing_binned.valid \
-I pairs.txt \
-m 100 \
-i gam.model \
-o ga2m.model
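
The three Discretizer calls that reuse cal_housing_binned.attr differ only in the split name, so they can be folded into a loop. The following is a minimal sketch, not part of MLTK: the helpers run and discretize_split, the DRY_RUN switch, and the default jar path are all assumptions made for illustration. The Discretizer flags are copied unchanged from the script above. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: location of the MLTK jar; override via the environment.
MLTK="${MLTK:-mltk-0.1.0-SNAPSHOT.jar}"

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: discretize one split with the bins already stored in
# cal_housing_binned.attr (same flags as the script above).
discretize_split() {
  local split="$1"
  run java -Xmx4g -cp "$MLTK" mltk.core.processor.Discretizer \
    -r cal_housing.attr \
    -d cal_housing_binned.attr \
    -i "cal_housing.$split" \
    -o "cal_housing_binned.$split"
}

for split in train valid test; do
  discretize_split "$split"
done
```

Because set -e is on, a real run (DRY_RUN=0) stops at the first failing step instead of continuing with missing input files.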