JJ / 2021-cec-deep-g-prop

Deep-G-Prop for the new edition of CEC in 2021
GNU General Public License v3.0

Comments by second reviewer #38

Closed JJ closed 3 years ago

JJ commented 3 years ago

The paper "EvoMLP: Evolving the architecture of multi-layer perceptrons using free software" presents a framework, called EvoMLP, to evolve the architecture of neural networks using evolutionary algorithms.

I have an important concern regarding this paper. Neuroevolution, the topic of the paper, has been around for decades, with many contributions over the years. In particular, there is renewed interest in neuroevolution nowadays, and many papers on it. From the scientific point of view, I am not able to see the contribution of this paper. The ideas in EvoMLP are not new (the paper does not claim that they are) and, thus, I don't find any scientific contribution. Perhaps the contribution is the software platform, but the paper is not written with a focus on the software platform. For example, it does not list competing neuroevolution software platforms with their advantages and drawbacks (e.g., Darwin: https://github.com/tlemo/darwin). The only comparison is with G-Prop, an algorithm which is 20 years old, and EvoMLP is not always better than G-Prop. Thus, I am not able to see what the contribution of this piece of work is. I think the contribution should be made clear in the paper.

In the experiments, only 5 independent runs were used (the seeds are provided on page 6). This is a low number for a stochastic algorithm; the standard is to use 30 or more.

On page 13 the paper says that EvoMLP requires 5 times more budget than G-Prop, but that on current computers this extra time is lower than G-Prop's. However, G-Prop should also run faster on current computers, so I guess the difference in runtime stays the same: 5 times. This is a significant difference, especially if the difference in the quality of the solutions is not significant. Thus, I do not see any significant improvement of EvoMLP over G-Prop.
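To make the runtime argument explicit, here is a short sketch. The symbols are my own notation, not the paper's: $T_E$ and $T_G$ are the runtimes of EvoMLP and G-Prop on the same machine, and $s$ is the speedup of a current computer over an older one, assumed to apply equally to both algorithms.

```latex
\[
T_E = 5\,T_G
\quad\Longrightarrow\quad
\frac{T_E / s}{T_G / s} = \frac{T_E}{T_G} = 5
\qquad \text{for any } s > 0 .
\]
```

Under that assumption the hardware speedup cancels, so the relative cost of EvoMLP is unchanged.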

I miss the results of statistical tests to check whether the observed differences are significant. I think that 5 runs can be too few to detect differences in most cases, but the tests should be applied anyway.
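As an illustration of the kind of test meant here, the following is a minimal sketch of an exact two-sided rank-sum permutation test (in the spirit of the Wilcoxon rank-sum / Mann-Whitney test), written in plain Python. The function name `exact_rank_sum_test` and the sample values are hypothetical; the numbers are made up for the example and are not taken from the paper. The authors may well prefer a different test or a statistics library.

```python
from itertools import combinations


def exact_rank_sum_test(a, b):
    """Exact two-sided permutation test on the rank sum of sample `a`.

    Returns (observed rank sum of `a`, two-sided p-value).
    """
    pooled = list(a) + list(b)
    n_a, n = len(a), len(pooled)

    # Assign midranks so that tied values share the average of their ranks.
    order = sorted(range(n), key=lambda i: pooled[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[order[j + 1]] == pooled[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1

    observed = sum(ranks[:n_a])
    expected = n_a * (n + 1) / 2  # mean rank sum under the null hypothesis
    threshold = abs(observed - expected) - 1e-12  # tolerance for float ties

    # Enumerate every way of labelling n_a of the n runs as "sample a" and
    # count the labellings at least as extreme as the observed one.
    extreme = total = 0
    for subset in combinations(range(n), n_a):
        total += 1
        rank_sum = sum(ranks[k] for k in subset)
        if abs(rank_sum - expected) >= threshold:
            extreme += 1
    return observed, extreme / total


# Hypothetical error rates (%), NOT from the paper: five runs per method.
evomlp_errors = [2.1, 1.9, 2.4, 2.0, 2.2]
gprop_errors = [2.3, 2.6, 2.5, 2.8, 2.7]

rank_sum, p_value = exact_rank_sum_test(evomlp_errors, gprop_errors)
print(f"rank sum = {rank_sum}, two-sided p = {p_value:.4f}")
```

With 5 runs per algorithm there are only 252 possible labellings, so even when the two samples are completely separated the smallest two-sided p-value this test can give is 2/252 ≈ 0.008. This is one reason why a larger number of runs is advisable.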

Minor issues