ai-se / storm

MOO

Joe's Chart on GALE2 #25

Open vivekaxl opened 8 years ago

vivekaxl commented 8 years ago

My comment on these charts is: I can't really say anything about how good or bad GALE2 is. @timm @Ginfung What do you think?

vivekaxl commented 8 years ago

Another idea is to tune how many points should be mutated or generated. For these results 25% of the population is mutated. We can try working with different ratios.
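A minimal sketch of how the mutated fraction could be exposed as a tunable ratio. This is not GALE's actual mutation step: `mutate_fraction`, the `nudge` operator, and the toy population are all hypothetical and only illustrate sweeping the ratio.

```python
import random

def mutate_fraction(population, ratio, mutate, rng):
    """Return a copy of `population` with round(ratio * len) members mutated."""
    k = round(ratio * len(population))
    # Pick k distinct indices to mutate; everyone else is kept unchanged.
    chosen = set(rng.sample(range(len(population)), k))
    return [mutate(p) if i in chosen else p for i, p in enumerate(population)]

# Toy data (assumption): 8 candidates with 2 decisions each.
rng = random.Random(1)
pop = [[0.0, 0.0] for _ in range(8)]
nudge = lambda p: [x + 0.1 for x in p]  # placeholder mutation operator

# 25% is the ratio used for the results in this thread.
out = mutate_fraction(pop, 0.25, nudge, rng)
changed = sum(1 for a, b in zip(pop, out) if a != b)
```

Re-running the experiment with, say, `ratio` in `(0.1, 0.25, 0.5)` would show how sensitive the results are to this setting.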

WeiFoo commented 8 years ago

can you make your legend consistent with algorithms?

vivekaxl commented 8 years ago

Old Results: #27

vivekaxl commented 8 years ago

@WeiFoo As requested. Great suggestion! Thanks

WeiFoo commented 8 years ago

It's better and much clearer now.

My read of these pics is that there is no significant difference between GALE and GALE2, except that GALE seems better than GALE2 on EIS, cellphone, web_portal, and eshop in terms of spread.

Suggestions for your next experiment, if you have one (for now it's totally fine):

keep the order of the x-axis, legend, and algorithms consistent for each data set, for example,

timm commented 8 years ago

loss numbers? runtimes?

evals?

timm commented 8 years ago

for the spread and hypervolume stuff, are these means or medians (want medians).

and what is the IQR?

and for the product line stuff, is this with 5 goals?

timm commented 8 years ago

And the loss values?

timm commented 8 years ago

hey @vivekaxl, is @Ginfung smiling at the nasty runtimes in the feature models? NOW he has something to try his constraint propagation tricks on.

hey @Ginfung:

timm commented 8 years ago

hey @vivekaxl please confirm: LESS spread is better and MORE hypervolume is worse?

timm commented 8 years ago

hey @vivekaxl looks like we're going to have to defend GALE/GALE2 on the basis of "darn fast, results not too bad".

for each model:
   for each objective:
         let pop1 be all the objective scores in the baseline across all N repeats, with sd of s
         let a small effect be s*0.4 (see https://goo.gl/9t7wYH)
         let pop2 be the other optimizer's objective scores in the last generation across all N repeats, with mean of n1
         let pop3 be GALE2's objective scores in the last generation across all N repeats, with mean of n2
         generate a table showing the difference n1-n2:
              if n1-n2 is less than a small effect, show n1-n2 in gray
              otherwise, show it in black

add a summary table counting up the percent gray (where GALE2 differed from another optimizer by no more than a small effect)
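The procedure above could be sketched in Python roughly as follows. This is an illustration under assumptions, not project code: the function names and toy numbers are hypothetical, `s` is taken as the sample standard deviation, and the magnitude `|n1-n2|` is compared against the threshold, which is only one reading of "less than a small effect".

```python
from statistics import mean, stdev

def small_effect(baseline, factor=0.4):
    """Small-effect threshold: factor times the sd of the baseline scores."""
    return factor * stdev(baseline)

def compare(baseline, other_last, gale2_last, factor=0.4):
    """Return (n1 - n2, is_gray) for one model/objective pair."""
    n1 = mean(other_last)   # other optimizer, last generation, all repeats
    n2 = mean(gale2_last)   # GALE2, last generation, all repeats
    delta = n1 - n2
    # Assumption: "gray" means the magnitude of the difference is small.
    return delta, abs(delta) < small_effect(baseline, factor)

def percent_gray(rows):
    """Percentage of (delta, is_gray) rows that are gray."""
    rows = list(rows)
    return 100.0 * sum(1 for _, gray in rows if gray) / len(rows)

# Toy scores (made up for illustration only).
baseline = [10.0, 12.0, 14.0, 16.0, 18.0]   # sd is about 3.16, threshold about 1.26
other = [5.0, 6.0, 7.0]
rows = [
    compare(baseline, other, [5.5, 6.5, 7.5]),    # small difference: gray
    compare(baseline, other, [9.0, 10.0, 11.0]),  # large difference: black
]
summary = percent_gray(rows)
```

In a full run, `rows` would hold one entry per model and objective, and `summary` would be the number reported in the summary table.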

ginfung commented 8 years ago

@vivekaxl runtime for EIS/eshop, GALE>>DE? This confused me. The other trends are the same as what I got.

ginfung commented 8 years ago

@timm The Linux kernel stuff might be a little bit tricky, since all the information for it is in the constraints: there is no explicit tree structure, i.e. no direct feature pruning.

timm commented 8 years ago

@Ginfung good comment. how did abdel handle that in his ase'13 paper?

t

ginfung commented 8 years ago

@timm abdel handled that by: 1) setting constraint violations as an objective, 2) setting up all features as decisions, 3) reducing the decision space by deleting the fixed features, 4) adding one correct "feature-rich" candidate to the initial population. In a nutshell, avoid the tree structure.
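A rough sketch of those four steps, for illustration only. It is not the code from that paper: the CNF clause encoding (signed, 1-indexed feature numbers), the helper names, and the use of unit clauses to detect "fixed" features are all assumptions made to keep the example small.

```python
import random

def violations(bits, clauses):
    """Step 1: count CNF clauses not satisfied by the 0/1 vector `bits`."""
    def holds(lit):
        return bits[abs(lit) - 1] == (1 if lit > 0 else 0)
    return sum(1 for clause in clauses if not any(holds(l) for l in clause))

def fixed_features(clauses):
    """Step 3 (simplified): features forced by unit clauses, as {index: value}."""
    return {abs(c[0]) - 1: int(c[0] > 0) for c in clauses if len(c) == 1}

def expand(free_bits, fixed, n):
    """Rebuild a full feature vector from free decisions plus fixed values."""
    it = iter(free_bits)
    return [fixed[i] if i in fixed else next(it) for i in range(n)]

def initial_population(size, n, fixed, seed_candidate, rng):
    """Steps 2 and 4: every free feature is a decision; add one valid seed."""
    free = n - len(fixed)
    pop = [expand([rng.randint(0, 1) for _ in range(free)], fixed, n)
           for _ in range(size - 1)]
    pop.append(list(seed_candidate))
    return pop

# Toy model (assumption): feature 1 is mandatory, 1 implies 2, not both 3 and 4.
clauses = [[1], [-1, 2], [-3, -4]]
fixed = fixed_features(clauses)
seed = [1, 1, 1, 0]   # a known-valid configuration
pop = initial_population(5, 4, fixed, seed, random.Random(0))
```

Here `violations` would be minimized alongside the other objectives, so invalid candidates are allowed in the population but penalized.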

vivekaxl commented 8 years ago

_DONT LOOK AT THE RESULTS! BUG FOUND!_

> for the spread and hypervolume stuff, are these means or medians (want medians).

In the hypervolume graphs,

hv_gen_1 = median(hv_gen_1_repeat_1, hv_gen_1_repeat_2, ..., hv_gen_1_repeat_m)
hv_overall = median(hv_gen_1, hv_gen_2, hv_gen_3, ..., hv_gen_n)
where m = number of repeats, n = number of generations
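The two-level median above can be written as a short Python sketch. The function name `hv_summary`, the nested-list layout `hv[g][r]` (generation `g`, repeat `r`), and the numbers are assumptions for illustration.

```python
from statistics import median

def hv_summary(hv):
    """hv[g][r] is the hypervolume of generation g in repeat r.

    Returns the per-generation medians (over repeats) and their median
    (over generations).
    """
    per_generation = [median(repeats) for repeats in hv]
    return per_generation, median(per_generation)

# Toy data (made up): n = 3 generations, m = 3 repeats.
hv = [
    [0.50, 0.40, 0.60],
    [0.70, 0.65, 0.60],
    [0.80, 0.90, 0.85],
]
per_gen, overall = hv_summary(hv)
```

Note that a median of medians is not in general the same as the median of all `n * m` values pooled together.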

New Results: #28

> and what is the IQR?

I don't have those results yet. I will update this once I do.
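For reference, a minimal sketch of computing the IQR for one set of repeats. The `iqr` helper and the sample values are hypothetical, and the "inclusive" quartile method is an assumption: other quartile conventions give slightly different values.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range: 75th percentile minus 25th percentile."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Toy data (made up): one metric across five repeats.
spread_over_repeats = [1, 2, 3, 4, 5]
result = iqr(spread_over_repeats)
```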

> and for the product line stuff, is this with 5 goals?

I am running it with 3 goals:

  • number of features (minimization)
  • constraints violated (minimization)
  • cost (minimization)

> please confirm: LESS spread is better and MORE hypervolume is worse?

For spread, less is better; for hypervolume, more is better.