ai-se / ResourcesDataDrivenSBSE


advice for dd-sbse #20

Closed timm closed 6 years ago

timm commented 6 years ago

Guys?

what other advice to offer than the following?

t


Learning:

Read, a lot

Debugging:

Selecting comparison algorithms:

for that baseline we recommend either

don't publish one result, instead repeat your analysis 20 to 30 times using different random number seeds each time.

take great care with random number seeds
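One way to act on the two points above is sketched here: derive every per-run seed from a single recorded master seed, and give each run its own generator. The `run_once` function is a hypothetical stand-in for a real optimizer, and the master seed value and the count of 30 runs are illustrative assumptions.

```python
import random
import statistics


def run_once(seed):
    """Hypothetical stand-in for one run of a stochastic optimizer."""
    rng = random.Random(seed)  # private generator, so no hidden global state
    # Toy "search": the best (lowest) of 100 random candidate scores.
    return min(rng.random() for _ in range(100))


def repeat(n_runs=30, master_seed=2018):
    """Run the experiment n_runs times, each with its own recorded seed."""
    master = random.Random(master_seed)
    seeds = [master.randrange(2**32) for _ in range(n_runs)]
    results = [run_once(s) for s in seeds]
    return seeds, results


seeds, results = repeat()
print("seeds used:", seeds[:3], "...")
print("median of", len(results), "runs:", statistics.median(results))
```

Publishing the `seeds` list alongside the results lets someone else rerun any single run in isolation.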

as to statistical methods, our results are often heavily skewed so don't use anything that assumes symmetrical gaussians (so no t-tests).
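As an illustration of a test that makes no Gaussian assumption, here is a minimal sketch of a permutation test on the difference of medians. This is only one non-parametric option, chosen here because it fits in a few lines of standard-library Python; it is not necessarily the test the authors have in mind, and the function name, `n_perm` and `seed` are assumptions.

```python
import random
import statistics


def permutation_test(xs, ys, n_perm=2000, seed=1):
    """Two-sided permutation test on the difference of medians."""
    rng = random.Random(seed)
    observed = abs(statistics.median(xs) - statistics.median(ys))
    pooled = list(xs) + list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled results
        a, b = pooled[:len(xs)], pooled[len(xs):]
        if abs(statistics.median(a) - statistics.median(b)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)


old = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
new = [101, 102, 103, 104, 105, 106, 107, 108, 109, 110]
print("p-value:", permutation_test(old, new))
```

A significance test like this says whether a difference is unlikely to be chance, not how large it is, so it is usually paired with an effect size measure.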

Do publish reproduction packages

Whatever you do

vivekaxl commented 6 years ago

I am not sure if you are aware of this resource: https://github.com/dspinellis/awesome-msr

timm commented 6 years ago

so should we be an "awesome" repo?

the following github organizations are available:

which means we could move the repo to awesome-ddsbse/resources or awesome-data/sbse

vivek- please read the awesome list naming guidelines. would any of the above satisfy their criteria? if not, what?

minkull commented 6 years ago

I normally recommend at least 30 reps, rather than between 20 and 30. Cliff's delta (d) and A12 have a linear relationship and can be computed from each other, so it may be worth mentioning that either can be used. It may also be worth mentioning that people can implement their new methods within existing toolboxes, which makes them easier for others to use, e.g., jMetal, Opt4J, etc.
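The linear relationship is d = 2 * A12 - 1, which follows directly from the pairwise-comparison definitions of the two measures. A minimal sketch, with function names chosen here for illustration:

```python
def a12(xs, ys):
    """Vargha-Delaney A12: probability that a value from xs exceeds one from ys,
    counting ties as half."""
    gt = sum(1 for x in xs for y in ys if x > y)
    eq = sum(1 for x in xs for y in ys if x == y)
    return (gt + 0.5 * eq) / (len(xs) * len(ys))


def cliffs_delta(xs, ys):
    """Cliff's delta: (pairs where x > y minus pairs where x < y) / all pairs."""
    gt = sum(1 for x in xs for y in ys if x > y)
    lt = sum(1 for x in xs for y in ys if x < y)
    return (gt - lt) / (len(xs) * len(ys))


xs = [1, 2, 3, 4]
ys = [2, 3, 4, 5]
print(a12(xs, ys), cliffs_delta(xs, ys), 2 * a12(xs, ys) - 1)
```

Both functions compare all pairs, so they cost O(n*m); for 30 repeats per treatment that is negligible.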

markuswagnergithub commented 6 years ago

Re "toolboxes": Jerry Swan has put together a Java framework that is supposed to help researchers perform statistical tests: https://github.com/JerrySwan/Astraiea (my force is not too strong with tests; I normally use the Wilcoxon U). The readme there might say everything (I guess it is bad style to copy everything over into this text box), e.g. it supports the external "generation of data" (read: calling programs to produce numbers), and it does "Wilcoxon U significance + Vargha Delaney effect size + confidence intervals".

markuswagnergithub commented 6 years ago

binary/indicator dominator: you mean Pareto dominance? Use it for d=2, maybe d=3. The usefulness of Pareto dominance drops exponentially as the number of dimensions increases. For more dimensions, either use something better, or something like epsilon-dominance (based on beer cans and table tennis balls). I do agree that it can be coded up easily, that its correctness can be checked easily (e.g. visually), and that d=2/d=3 is often enough.
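To show how little code this takes, here is a minimal sketch of both checks. It assumes every objective is minimized and uses the additive form of epsilon-dominance; the function names are illustrative, and other epsilon-dominance variants (e.g. multiplicative) exist.

```python
def dominates(a, b):
    """True if a Pareto-dominates b: no worse everywhere, strictly better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def eps_dominates(a, b, eps):
    """Additive epsilon-dominance: a is within eps of being no worse than b
    on every objective."""
    return all(x - eps <= y for x, y in zip(a, b))


def non_dominated(points):
    """Keep the points that no other point Pareto-dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


pts = [(1, 3), (2, 2), (3, 1), (3, 3)]
print(non_dominated(pts))
print(eps_dominates((1.05, 2.0), (1.0, 2.0), eps=0.1))
```

With many objectives almost every pair of points is mutually non-dominated, so `non_dominated` returns nearly the whole population; a nonzero `eps` restores some selection pressure.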

markuswagnergithub commented 6 years ago

1+two+new+dumb:

I like this general recommendation. Especially "new" should become mandatory.

I'd like to throw my AGE into the ring, for problems with many objectives. I can provide justification, and the basic algorithm, which uses the idea of additive/multiplicative approximation (something theoreticians are happy to aim for), can be coded up reasonably quickly, too.

markuswagnergithub commented 6 years ago

Strong support for "publish code" and "publish rand procedure". Hey, this almost sounds like reproducible work then!