heal-research / HeuristicLab

HeuristicLab - An environment for heuristic and evolutionary optimization
https://dev.heuristiclab.com
GNU General Public License v3.0

Improve the SymbolicDataAnalysisExpressionPruningOperator #2359

Closed: HeuristicLab-Trac-Bot closed this issue 9 years ago

HeuristicLab-Trac-Bot commented 9 years ago

Issue migrated from trac ticket # 2359

milestone: HeuristicLab 3.3.12 | component: Problems.DataAnalysis.Symbolic | priority: medium | resolution: done | keywords: pruning, symbolic data analysis, classification, regression

2015-03-11 11:43:54: @foolnotion created the issue


Some minor things need to be improved:

  • provide static Prune methods that return a simplified tree (for classification and regression)
  • set the impact values calculator in the constructor for the derived classes

HeuristicLab-Trac-Bot commented 9 years ago

2015-03-11 11:52:16: @foolnotion changed status from new to accepted

HeuristicLab-Trac-Bot commented 9 years ago

2015-03-11 11:52:16: @foolnotion changed title from Improve the SymbolicDataAnalysisExpressionPruningOperator to Improve the SymbolicDataAnalysisExpressionPruningOperator

HeuristicLab-Trac-Bot commented 9 years ago

2015-03-11 14:08:30: @foolnotion commented


r12189: Implemented improvements

HeuristicLab-Trac-Bot commented 9 years ago

2015-03-11 14:14:36: @foolnotion changed status from accepted to reviewing

HeuristicLab-Trac-Bot commented 9 years ago

2015-03-11 14:14:36: @foolnotion changed owner from @foolnotion to @mkommend

HeuristicLab-Trac-Bot commented 9 years ago

2015-04-29 11:57:01: @mkommend commented


r12358: Refactored pruning operators and analyzers.

HeuristicLab-Trac-Bot commented 9 years ago

2015-04-29 11:59:31: @mkommend changed status from reviewing to assigned

HeuristicLab-Trac-Bot commented 9 years ago

2015-04-29 11:59:31: @mkommend changed owner from @mkommend to @foolnotion

HeuristicLab-Trac-Bot commented 9 years ago

2015-04-29 11:59:31: @mkommend commented


r12359: Removed commented code from pruning analyzer.

HeuristicLab-Trac-Bot commented 9 years ago

2015-04-29 11:59:55: @mkommend commented


Please add the number of removed nodes as an additional data series in the analyzer's data table. Furthermore, have a detailed look at the changes in r12358 and review them.

HeuristicLab-Trac-Bot commented 9 years ago

2015-04-29 15:20:15: @foolnotion commented


r12361: The changes in r12358 look fine to me. Added total number of pruned nodes in the analyzer's data table. Removed unused parameter names in the SymbolicDataAnalysisSingleObjectivePruningAnalyzer.

HeuristicLab-Trac-Bot commented 9 years ago

2015-06-22 11:50:34: @foolnotion changed status from assigned to reviewing

HeuristicLab-Trac-Bot commented 9 years ago

2015-06-22 11:50:34: @foolnotion changed owner from @foolnotion to @mkommend

HeuristicLab-Trac-Bot commented 9 years ago

2015-06-23 15:36:00: @Shabbafru changed owner from @mkommend to @gkronber

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-07 13:07:46: @gkronber commented


#2398 depends on this ticket.

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-08 11:01:48: @gkronber commented


SymbolicDataAnalysisExpressionPruningOperator.Apply() produces incorrect quality values.

The problem is twofold. (1) The operator assumes that the impacts of replacements are additive with respect to the quality value. It first retrieves the original quality. Then, in the loop over all nodes, impacts are calculated repeatedly, and each time a node is pruned the quality is reduced by the impact value (quality -= impactValue).

(2) The impact calculators use accuracy (classification) or R² (regression) to calculate the impacts. However, the evaluation operator from the problem can use a different measure (such as MSE or absolute error), so we cannot simply subtract the impact from the quality.

Proposed solution: completely re-evaluate pruned models with the evaluation operator from the problem.
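
A small numeric illustration of point (2), as a sketch only: the data, the `mse` and `pearson_r2` helpers, and the two prediction vectors are made up for this example and are not HeuristicLab code. The impact is measured in R², the problem quality in MSE, and subtracting one from the other gives a value far from the re-evaluated one.

```python
from statistics import mean

def mse(targets, predictions):
    """Mean squared error; lower is better."""
    return mean((t - p) ** 2 for t, p in zip(targets, predictions))

def pearson_r2(targets, predictions):
    """Squared Pearson correlation; higher is better."""
    mt, mp = mean(targets), mean(predictions)
    cov = sum((t - mt) * (p - mp) for t, p in zip(targets, predictions))
    var_t = sum((t - mt) ** 2 for t in targets)
    var_p = sum((p - mp) ** 2 for p in predictions)
    return 0.0 if var_t == 0 or var_p == 0 else cov * cov / (var_t * var_p)

# Hypothetical data: the pruned model is perfectly correlated with the
# target but shifted by a constant, so R² improves while MSE gets worse.
y = [1.0, 2.0, 3.0, 4.0]
original_predictions = [1.1, 1.9, 3.2, 3.8]
pruned_predictions = [2.0, 3.0, 4.0, 5.0]

# Impact as the impact calculator sees it (R² units).
impact = pearson_r2(y, original_predictions) - pearson_r2(y, pruned_predictions)

# What the operator did: subtract an R² impact from an MSE quality.
naive_quality = mse(y, original_predictions) - impact

# Proposed fix: re-evaluate the pruned model with the problem's measure.
re_evaluated = mse(y, pruned_predictions)

print(f"impact (R2): {impact:.4f}")
print(f"naive MSE:   {naive_quality:.4f}")
print(f"actual MSE:  {re_evaluated:.4f}")
```

Here the pruned model looks slightly better in R² (negative impact) although its MSE is much worse, which is exactly the mismatch that re-evaluation avoids.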

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-08 11:02:34: @gkronber commented


r12674: use stable sort in pruning analyzer.
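
The commit message does not say why stability matters. One plausible reason, stated here as an assumption, is that individuals with equal quality should keep their population order so that the analyzer's output is reproducible. A minimal sketch with made-up names and qualities (Python's `sorted` is guaranteed to be stable, also with `reverse=True`):

```python
from operator import itemgetter

# Hypothetical (name, quality) pairs in population order.
population = [("a", 0.9), ("b", 0.7), ("c", 0.9), ("d", 0.7)]

# Sort by quality, best first. Because the sort is stable, "a" stays
# ahead of "c" and "b" stays ahead of "d" among the tied entries.
ranked = sorted(population, key=itemgetter(1), reverse=True)

print(ranked)
```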

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-08 11:02:45: @gkronber changed status from reviewing to assigned

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-08 11:02:45: @gkronber changed owner from @gkronber to @foolnotion

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-09 11:38:38: @foolnotion commented


  • Regarding (1), CalculateImpactsAndReplacementValues internally uses the PearsonsRSquared measure (for regression) and the accuracy measure (for classification) to calculate impacts, which is exactly what the SymbolicDataAnalysisPruningOperator.Evaluate method provides. Providing an originalQuality simply avoids recalculating it inside the method on each call. Since the impact is calculated as impactValue = originalQuality - newQuality, the new originalQuality can be obtained within the for loop as quality -= impactValue, which speeds up successive calls. The confusion lies in the terminology: the originalQuality accepted by CalculateImpactsAndReplacementValues has no connection to the actual quality of the individual (which can be MSE, absolute error, etc.).

  • (2) is indeed a problem, as the quality should not be updated that way. The problem is the line QualityParameter.ActualValue.Value = quality: as you pointed out, we cannot assume anything about the evaluation operator from the problem or the kind of quality measure it provides. Therefore, the solution should indeed be to completely re-evaluate pruned models with the evaluation operator from the problem.
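
The argument in the first bullet can be sketched as follows. This is a toy model, not the operator's code: a "tree" is a list of additive terms, pruning a node means replacing a term by its mean, and `impact_and_replacement`, `prune`, and the threshold are hypothetical names and values. As long as one measure (here R²) is used throughout, `quality -= impact` reproduces the new quality without recomputing it.

```python
from statistics import mean

def pearson_r2(targets, predictions):
    """Squared Pearson correlation, the measure used for impacts here."""
    mt, mp = mean(targets), mean(predictions)
    cov = sum((t - mt) * (p - mp) for t, p in zip(targets, predictions))
    var_t = sum((t - mt) ** 2 for t in targets)
    var_p = sum((p - mp) ** 2 for p in predictions)
    return 0.0 if var_t == 0 or var_p == 0 else cov * cov / (var_t * var_p)

def predict(terms, xs):
    """Toy model: the prediction is the sum of all terms."""
    return [sum(term(x) for term in terms) for x in xs]

def impact_and_replacement(terms, i, xs, ys, original_quality):
    """Impact of replacing terms[i] by a constant (its mean over xs)."""
    const = mean(terms[i](x) for x in xs)
    replacement = lambda x, c=const: c
    candidate = terms[:i] + [replacement] + terms[i + 1:]
    new_quality = pearson_r2(ys, predict(candidate, xs))
    return original_quality - new_quality, replacement

def prune(terms, xs, ys, threshold=1e-3):
    terms = list(terms)
    quality = pearson_r2(ys, predict(terms, xs))  # computed only once
    for i in range(len(terms)):
        impact, replacement = impact_and_replacement(terms, i, xs, ys, quality)
        if impact <= threshold:
            terms[i] = replacement
            quality -= impact  # equals the new R² of the pruned model
    return terms, quality

# Hypothetical data: y = 2x + 1, modelled with one nearly useless term.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]
terms = [lambda x: 2 * x, lambda x: 1.0, lambda x: 0.001 * (x % 2)]

pruned_terms, cached_quality = prune(terms, xs, ys)
print(cached_quality, pearson_r2(ys, predict(pruned_terms, xs)))
```

The cached value is an R² and is only valid as input to the next impact calculation; it says nothing about the individual's quality under the problem's own evaluator.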

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-10 15:41:22: @foolnotion commented


r12720: Changed the impact calculators so that the quality value needed for the impact calculation is computed by a separate method. Refactored the CalculateImpactAndReplacementValues method to return the new quality in an out-parameter (adjusted the method signature in the interface accordingly). Added an Evaluate method to the regression and classification pruning operators that re-evaluates the tree with the problem evaluator after pruning has been performed.
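
The shape of this change can be sketched as below. All names (`Model`, `ImpactResult`, `calculate_impact`, `prune_and_evaluate`) and the linear toy model are assumptions made for illustration, not the signatures introduced in r12720. The impact calculation returns the new quality alongside the impact, and the quality reported at the end comes from a separate problem evaluator.

```python
from statistics import mean
from typing import NamedTuple

class Model(NamedTuple):
    weights: tuple  # one weight per input column
    bias: float

class ImpactResult(NamedTuple):
    impact: float
    pruned_model: Model
    new_quality: float  # plays the role of the out-parameter

def predict(model, rows):
    return [model.bias + sum(w * x for w, x in zip(model.weights, row))
            for row in rows]

def pearson_r2(ys, preds):
    my, mp = mean(ys), mean(preds)
    cov = sum((y - my) * (p - mp) for y, p in zip(ys, preds))
    var_y = sum((y - my) ** 2 for y in ys)
    var_p = sum((p - mp) ** 2 for p in preds)
    return 0.0 if var_y == 0 or var_p == 0 else cov * cov / (var_y * var_p)

def mse(ys, preds):
    return mean((y - p) ** 2 for y, p in zip(ys, preds))

def calculate_impact(model, j, rows, ys, quality):
    """Replace term j by its mean and report impact and new quality."""
    const = mean(model.weights[j] * row[j] for row in rows)
    weights = model.weights[:j] + (0.0,) + model.weights[j + 1:]
    candidate = Model(weights, model.bias + const)
    new_quality = pearson_r2(ys, predict(candidate, rows))
    return ImpactResult(quality - new_quality, candidate, new_quality)

def prune_and_evaluate(model, rows, ys, problem_evaluator, threshold=1e-3):
    quality = pearson_r2(ys, predict(model, rows))  # used for impacts only
    for j in range(len(model.weights)):
        result = calculate_impact(model, j, rows, ys, quality)
        if result.impact <= threshold:
            model, quality = result.pruned_model, result.new_quality
    # The reported quality is a full re-evaluation with the problem's measure.
    return model, problem_evaluator(ys, predict(model, rows))

# Hypothetical data: y = 2 * x0 + 1; the second column is almost irrelevant.
rows = [(0.0, 1.0), (1.0, 0.0), (2.0, 1.0), (3.0, 0.0)]
ys = [2 * row[0] + 1 for row in rows]
final_model, reported_quality = prune_and_evaluate(
    Model((2.0, 0.001), 1.0), rows, ys, problem_evaluator=mse)
print(final_model, reported_quality)
```

Passing the evaluator in as a function keeps the pruning loop independent of whichever measure the problem happens to use.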

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-10 16:06:08: @foolnotion changed status from assigned to reviewing

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-10 16:06:08: @foolnotion changed owner from @foolnotion to @gkronber

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-11 20:30:03: @gkronber commented


Reviewed all changes and found that the pruning operators are not backwards compatible because parameters were added, removed, or changed in type.

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-12 10:28:05: @gkronber changed status from reviewing to assigned

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-12 10:28:12: @gkronber changed status from assigned to accepted

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-12 11:02:07: @gkronber commented


r12744: added after-deserialization code for backwards-compatibility
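
The general idea of such a hook, sketched in Python rather than in the project's own persistence framework: when an object saved by an older version is restored, parameters that have been added since receive defaults and parameters that no longer exist are dropped. The class and parameter names below are hypothetical and do not reflect the actual changes in r12744.

```python
class PruningOperatorState:
    """Toy stand-in for an operator whose parameter set changed over time."""

    # Parameters known to the current version, with their defaults
    # (hypothetical names).
    DEFAULTS = {
        "prune_only_zero_impact_nodes": False,
        "nodes_impact_threshold": 0.0,
    }

    def __init__(self, **params):
        self.params = {**self.DEFAULTS, **params}

    def __getstate__(self):
        return {"params": dict(self.params)}

    def __setstate__(self, state):
        # Runs after deserialization: reconcile old state with the
        # current parameter set.
        saved = state.get("params", {})
        self.params = {name: saved.get(name, default)
                       for name, default in self.DEFAULTS.items()}

# State as an older version might have saved it: one parameter is
# missing and one has since been removed.
old_state = {"params": {"nodes_impact_threshold": 0.1,
                        "obsolete_parameter": 1}}

restored = PruningOperatorState.__new__(PruningOperatorState)
restored.__setstate__(old_state)
print(restored.params)
```

`pickle` calls `__setstate__` automatically on load; it is invoked by hand here only to keep the example short.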

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-12 11:23:51: @gkronber commented


r12745: (combined stable merge #2398) merged r12189, r12358, r12359, r12361, r12461, r12674, r12720, r12744 from trunk to stable

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-12 11:26:54: @gkronber changed status from accepted to reviewing

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-12 11:27:09: @gkronber changed status from reviewing to readytorelease

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-12 11:27:16: @gkronber changed status from readytorelease to closed

HeuristicLab-Trac-Bot commented 9 years ago

2015-07-12 11:27:16: @gkronber removed resolution