ai-se / magic101


Comparing DE/GA/NSGA-II/MOEAD #8

Open arennax opened 6 years ago

arennax commented 6 years ago

9 datasets, 3-fold cross validation, pop=50, gen=100, repeats=20

Experiments

For DE: (NP = 50, F = 1, CR = 0.5, life = 5)

  1. Separate the data into train-part and test-part.

  2. (Gen 0) Randomly generate 50 configs (after a constraints check); for each config[i] (i = 1~50), calculate its mMRE (median MRE) on the train-part.

  3. (Gen 1~N) Use DE to generate 50 new configs from the previous Gen, and calculate their mMRE on the train-part. For each config[i], if the new config[i]'s mMRE is less than the old config[i]'s, replace the old config[i] with the new one.

    Stop rules:

    1. reach Gen 100;
    2. reach Count = 5 (life); (initially Count = 0; each time a later gen's median mMRE is >= the former gen's least mMRE, Count += 1).
  4. Use the config with the least mMRE in Gen N and calculate its mMRE on the test-part.

  5. With 20 repeats and 3-fold CV, we get 20*3 = 60 mMRE values for each dataset.
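The DE loop above (NP = 50, F = 1, CR = 0.5, life = 5) can be sketched as below. The bounds and fitness function are illustrative placeholders, and the median-vs-least stop rule is implemented exactly as written in step 3:

```python
import random

def de_minimize(fitness, dim, bounds, np_=50, f=1.0, cr=0.5,
                max_gen=100, life=5, seed=0):
    """DE/rand/1/bin with the 'life' early-stop rule described above."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Gen 0: random population (a real run would also check constraints here)
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(np_)]
    fit = [fitness(x) for x in pop]
    count, prev_least = 0, min(fit)
    for _ in range(max_gen):                       # stop rule 1: Gen 100
        for i in range(np_):
            a, b, c = rng.sample([j for j in range(np_) if j != i], 3)
            jrand = rng.randrange(dim)
            trial = [pop[a][k] + f * (pop[b][k] - pop[c][k])
                     if (rng.random() < cr or k == jrand) else pop[i][k]
                     for k in range(dim)]
            trial = [min(hi, max(lo, v)) for v in trial]  # clip to bounds
            t = fitness(trial)
            if t < fit[i]:            # replace old config[i] when better
                pop[i], fit[i] = trial, t
        # stop rule 2: this gen's median vs the former gen's least value
        median = sorted(fit)[np_ // 2]
        if median >= prev_least:
            count += 1
            if count >= life:
                break
        prev_least = min(fit)
    best = min(range(np_), key=fit.__getitem__)
    return pop[best], fit[best]
```

Note that under this literal reading, a generation's median is usually >= the former generation's least value, so the life counter tends to fire quickly; if the intended comparison is best-to-best, `median` would be replaced by `min(fit)`.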

For GA: (NP = 50, CX = 0.6, MUT = 0.1, life = 5)

  1. Separate the data into train-part and test-part.

  2. (Gen 0) Randomly generate 50 configs (after a constraints check); for each config[i] (i = 1~50), calculate its mMRE (median MRE) on the train-part.

  3. (Gen 1~N) Use GA to generate 50 new configs from the previous Gen, and calculate their mMRE on the train-part.

    Stop rules:

    1. reach Gen 100;
    2. reach Count = 5 (life); (initially Count = 0; each time a later gen's median mMRE is >= the former gen's least mMRE, Count += 1).
  4. Use the config with the least mMRE in Gen N and calculate its mMRE on the test-part.

  5. With 20 repeats and 3-fold CV, we get 20*3 = 60 mMRE values for each dataset.
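One GA generation under the listed rates (CX = 0.6, MUT = 0.1) might look like the sketch below. The issue does not name the selection, crossover, or mutation operators, so tournament-2 selection, one-point crossover, and per-gene reset mutation are assumptions; only the rates come from the setup above:

```python
import random

def ga_generation(pop, fitness, bounds, cx=0.6, mut=0.1, rng=None):
    """One GA generation: tournament-2 selection, one-point crossover
    (rate CX), per-gene uniform-reset mutation (rate MUT). Operators
    are assumed; only CX and MUT come from the experiment description."""
    rng = rng or random.Random(0)
    lo, hi = bounds
    fit = [fitness(x) for x in pop]

    def select():  # size-2 tournament on mMRE (lower is better)
        i, j = rng.randrange(len(pop)), rng.randrange(len(pop))
        return list(pop[i] if fit[i] < fit[j] else pop[j])

    nxt = []
    while len(nxt) < len(pop):
        p1, p2 = select(), select()
        if rng.random() < cx and len(p1) > 1:   # one-point crossover
            cut = rng.randrange(1, len(p1))
            p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
        for child in (p1, p2):                  # per-gene mutation
            nxt.append([rng.uniform(lo, hi) if rng.random() < mut else g
                        for g in child])
    return nxt[:len(pop)]
```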

Current Results (between ATLM, DE and GA):

[image: samre]

A sorted graph between DE250 and GA250 in isbg10 dataset:

[image: samre]

Runtime GA vs DE:

[image: run]

Number of Gen Comparison (between DE and GA):

[image: ngen]

Next Task

  1. Add MOEA/D

  2. Try NSGA-II with adjusted modification

  3. Use DE/GA to tune CART

  4. More literature review for potential paths

  5. Update current OIL with uniform frameworks (DEAP/PyGMO..)

To Do

  1. Re-construct OIL architecture (sklearn/utils/model/optimizer)

  2. pip install package

  3. Tutorial Materials (workshop to REU students)

  4. Reverse negative results (Negative Results for Software Effort Estimation, 2016)

timm commented 6 years ago

looking sane

questions:

  1. is china not here cause of long runtime?
  2. why NP=50? engineering judgement since NP=100 was too slow?

Todo (in suggested order of priority, first to last):

Ideas:

arennax commented 6 years ago

Yes, I decided to use NP=50 to get the initial results sooner since NP=100 was too slow. Will add china and runtimes.

timm commented 6 years ago

when will you add china and runtimes?

timm commented 6 years ago

for our own GA, NP=50 is arguable, but it could be said that at NP=100 we would beat DE much more often

in any case, when you do nsga-II and moea/D make sure you use their defaults. and if that is NP=100, then so be it.

but keep lives=5

arennax commented 6 years ago

I am running china now so I can get results today, same for runtimes. Roger for the defaults.

arennax commented 6 years ago

The first batch of our comparison (Default ABE0, ATLM, CART and Sarro's CoGEE):


The second batch of our comparison (ABEN tuned by GA, DE, MOEA/D):


For DE tuning, we use 2 variants: DE10 and DE30. DE30 follows the rule that #np = #decisions * 5; DE10 uses a fixed population size of 10, following Wei's work.

For bi-objective methods, the two objectives are: 1. minimize MRE; 2. minimize the Confidence Interval associated with MRE.
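As a concrete reading of those two objectives (the thread does not say how the confidence interval is built, so the normal-approximation 95% CI below is an assumption):

```python
import statistics

def bi_objectives(actual, predicted):
    """Return (median MRE, width of a 95% CI on the mean MRE).
    The CI uses a normal approximation; how CoGEE-style CIs are
    computed is not specified in the thread, so this is illustrative."""
    mre = [abs(a - p) / a for a, p in zip(actual, predicted)]
    median_mre = statistics.median(mre)
    se = statistics.stdev(mre) / len(mre) ** 0.5   # standard error
    ci_width = 2 * 1.96 * se                       # 95% CI width
    return median_mre, ci_width
```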

[image: untitled diagram]