Closed hengzhe-zhang closed 2 years ago
We have certainly thought about incorporating symbolic classification algorithms into our benchmarking. They are a bit less common in SR literature, but nonetheless I agree such a benchmark would be very useful. I could see it being an addition to this repo.
Are there any specific plans with respect to this matter? In my opinion, all of the analysis scripts can be reused; we would only need to swap the experimental datasets for the classification datasets in the PMLB database, and replace the machine learning estimators with their classification counterparts.
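To make the "swap estimators for their classification counterparts" idea concrete, here is a minimal sketch. The mapping and metric names below are illustrative assumptions, not srbench's actual configuration, but they show the two things that would change: the estimator class and the evaluation metric.

```python
# Hypothetical mapping from regression estimators to classification
# counterparts; class names are illustrative, not srbench's actual config.
REGRESSOR_TO_CLASSIFIER = {
    "RandomForestRegressor": "RandomForestClassifier",
    "XGBRegressor": "XGBClassifier",
    "LGBMRegressor": "LGBMClassifier",
    "LinearRegression": "LogisticRegression",
}

# The evaluation metric changes as well: R^2 for regression,
# balanced accuracy (robust to class imbalance) for classification.
METRIC_SWAP = {"r2": "balanced_accuracy"}

def classification_counterpart(regressor_name):
    """Return the classification counterpart of a regression estimator name."""
    return REGRESSOR_TO_CLASSIFIER[regressor_name]
```

With a table like this, the existing benchmarking loop could stay unchanged apart from which estimator and which scorer it instantiates.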
By the way, I'm not sure whether we should reuse the results reported in the previous large-scale benchmark paper. The classifiers used in that paper are rather old, and it does not include SOTA classifiers such as XGBoost and LightGBM, so it is questionable whether reusing its results is worthwhile. Even worse, some papers have pointed out that those results were obtained under a flawed experimental protocol [1]: the paper uses test data to tune the hyperparameters. Consequently, the results reported in that article are not reliable.
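The protocol flaw above is worth spelling out: hyperparameters must be tuned on data that is disjoint from the final test set. A minimal sketch of a correct split (fractions and seed are arbitrary choices, not any benchmark's actual settings):

```python
import random

def train_valid_test_split(n, seed=0, frac=(0.6, 0.2, 0.2)):
    """Split indices 0..n-1 into disjoint train/validation/test sets.

    Hyperparameters are tuned on the validation split only; the test
    split is touched exactly once, for the final reported score.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # reproducible shuffle
    n_train = int(frac[0] * n)
    n_valid = int(frac[1] * n)
    train = idx[:n_train]
    valid = idx[n_train:n_train + n_valid]
    test = idx[n_train + n_valid:]
    return train, valid, test
```

Tuning on `valid` and reporting on `test` avoids the optimistic bias that [1] identifies; nested cross-validation is the more thorough variant of the same idea.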
[1] Wainberg M, Alipanahi B, Frey BJ. Are random forests truly the best classifiers? The Journal of Machine Learning Research, 2016, 17(1): 3837-3841.
I don't think we currently plan to work on a large-scale benchmarking of classifiers with a specific focus on GP-based ones. Please note that the datasets included in srbench are regression problems; PMLB covers over 165 classification problems as well. Nonetheless, you might be interested in taking a look at the following papers, which cover more recent methods:
https://biodatamining.biomedcentral.com/articles/10.1186/s13040-017-0154-4
https://www.worldscientific.com/doi/pdf/10.1142/9789813235533_0018
https://arxiv.org/abs/2107.06475
@athril Thank you for providing the DIGEN package. This is exactly what I am looking for, excellent work!
In 2014, a paper published in JMLR reported the results of more than 100 classification algorithms on numerous classification benchmark datasets [1]. However, that paper does not consider genetic programming based methods, such as M4GP [2]. Consequently, would it be possible to develop a classification benchmark to further advance genetic programming, and the machine learning domain more broadly?
[1] Fernández-Delgado M, Cernadas E, Barro S, et al. Do we need hundreds of classifiers to solve real world classification problems? The Journal of Machine Learning Research, 2014, 15(1): 3133-3181.
[2] La Cava W, Silva S, Danai K, et al. Multidimensional genetic programming for multiclass classification. Swarm and Evolutionary Computation, 2019, 44: 260-272.