heal-research / HeuristicLab

HeuristicLab - An environment for heuristic and evolutionary optimization
https://dev.heuristiclab.com
GNU General Public License v3.0

Implement Naive Grammar Enumeration for Symb. Regression #2886

Open HeuristicLab-Trac-Bot opened 6 years ago

HeuristicLab-Trac-Bot commented 6 years ago

Issue migrated from trac ticket # 2886

component: Algorithms.DataAnalysis | priority: medium

2018-01-30 16:48:06: @LukasCamera created the issue


As a first step in deterministic symbolic regression, implement an algorithm that iterates and checks all possible sentences of a grammar in order to find the best model structure for (very simple) regression problems.
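A minimal sketch of the idea, not the HeuristicLab implementation (Python is used only for brevity): sentences are enumerated breadth-first by always expanding the leftmost nonterminal and discarding phrases that exceed a length limit. The toy grammar, the `max_length` bound, and the name `enumerate_sentences` are assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy grammar: keys are nonterminals, every other symbol is a terminal.
GRAMMAR = {
    "E": [["T"], ["T", "+", "E"]],
    "T": [["x"], ["x", "*", "T"]],
}


def enumerate_sentences(grammar, start, max_length):
    """Yield every sentence (terminal-only phrase) with at most max_length symbols."""
    queue = deque([(start,)])
    seen = {(start,)}
    while queue:
        phrase = queue.popleft()
        # Position of the leftmost nonterminal, or None if the phrase is a sentence.
        idx = next((i for i, sym in enumerate(phrase) if sym in grammar), None)
        if idx is None:
            yield phrase
            continue
        for production in grammar[phrase[idx]]:
            expanded = phrase[:idx] + tuple(production) + phrase[idx + 1:]
            # No production shrinks a phrase, so overly long phrases can be pruned.
            if len(expanded) <= max_length and expanded not in seen:
                seen.add(expanded)
                queue.append(expanded)


if __name__ == "__main__":
    for sentence in enumerate_sentences(GRAMMAR, "E", 3):
        print(" ".join(sentence))
```

Each yielded sentence would then be evaluated against the regression data; the length bound is what keeps the enumeration finite for a recursive grammar.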

HeuristicLab-Trac-Bot commented 6 years ago

2018-01-30 16:48:16: @LukasCamera changed status from new to accepted

HeuristicLab-Trac-Bot commented 6 years ago

2018-02-13 15:37:19: @LukasCamera commented


r15712: Add basic class structure, grammar and grammar iteration.

r15714: Add tree hashing for addition and multiplication.

r15722: Add evaluation of sentences.

r15723: Add simple data analysis tests and further information about the algorithm run.

r15724: Add parsing to infix form for debugging purposes.

r15725: Refactor tree hash function.

r15726: Overwrite long sentences when a shorter one with same hash was found.

r15734: worked on grammar enumeration

r15746: Refactor grammar enumeration alg.

r15765: Add graphviz output.
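The hashing mentioned in r15714 and the overwrite rule in r15726 can be illustrated with a small sketch. It assumes expression trees are nested tuples such as `("+", "x", ("*", "y", "z"))` and that "same hash" means "same canonical form after flattening and sorting the operands of + and *". The helper names (`canonical`, `semantic_hash`, `archive_sentence`) are hypothetical and do not reflect the actual code in the commits.

```python
def canonical(tree):
    """Return a canonical form: nested +/* nodes are flattened and operands sorted."""
    if isinstance(tree, str):
        return tree
    op, *args = tree
    flat = []
    for arg in map(canonical, args):
        if isinstance(arg, tuple) and arg[0] == op:
            flat.extend(arg[1:])  # associativity: (a + b) + c equals a + (b + c)
        else:
            flat.append(arg)
    # Commutativity: operand order must not matter, so sort by a stable key.
    return (op, *sorted(flat, key=repr))


def semantic_hash(tree):
    """Hash that is identical for trees differing only in +/* operand order or nesting."""
    return hash(canonical(tree))


def archive_sentence(archive, sentence, tree):
    """Keep only the shortest sentence (list of symbols) seen for each semantic hash."""
    key = semantic_hash(tree)
    if key not in archive or len(sentence) < len(archive[key]):
        archive[key] = sentence


if __name__ == "__main__":
    archive = {}
    archive_sentence(archive, ["(", "y", "+", "x", ")"], ("+", "y", "x"))
    archive_sentence(archive, ["x", "+", "y"], ("+", "x", "y"))
    print(archive)
```

This sketch only treats + and * as commutative and associative; it performs no algebraic simplification beyond that.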

HeuristicLab-Trac-Bot commented 6 years ago

2018-02-13 18:33:38: @LukasCamera commented


r15772: Extend grammar enumeration algorithm's grammar to exp, log and sine.

r15773: Update unit tests to cover problems with exp, log and sine.

HeuristicLab-Trac-Bot commented 6 years ago

2018-02-14 10:59:22: @LukasCamera commented


r15776: Refactor data generation in unit tests.

HeuristicLab-Trac-Bot commented 6 years ago

2018-02-16 18:38:12: @LukasCamera commented


r15784: Add basic implementation for inverse factors.

HeuristicLab-Trac-Bot commented 6 years ago

2018-02-22 13:46:26: @gkronber commented


r15806: made a few comments

HeuristicLab-Trac-Bot commented 6 years ago

2018-02-23 18:19:19: @LukasCamera commented


r15791: Replace integer hashing of phrases with simplification to (temporary) string representations.

r15795: Remove nested divisions from grammar and hashing.

r15800: Refactor code and fix performance issues.

r15803: Deactivate generation of dot file for visualizing search tree.

r15812: Performance Improvements - Only store hash of archived phrases and reduce number of enumerators.

HeuristicLab-Trac-Bot commented 6 years ago

2018-02-26 16:14:41: @LukasCamera commented


r15817: Add cosine to grammar.

HeuristicLab-Trac-Bot commented 6 years ago

2018-02-28 19:11:56: @LukasCamera commented


r15821: Move code for visualization and logging of sentences to separate classes.

r15823: Fixed build settings.

r15824: Move R² calculation of sentences to separate class and allow its deactivation.
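For reference, a sketch of an R² calculation, assuming R² is taken as the squared Pearson correlation between targets and model outputs. That definition and the name `r_squared` are assumptions; the ticket does not state which formula the evaluator class uses.

```python
def r_squared(targets, predictions):
    """Squared Pearson correlation between two equally long sequences of numbers."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_p = sum(predictions) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(targets, predictions))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    if var_t == 0 or var_p == 0:
        return 0.0  # correlation is undefined for constant data; treat as worst case
    return (cov * cov) / (var_t * var_p)
```

Under this definition the score is invariant to shifting and scaling the predictions, so a sentence is rated by the shape of its output rather than its offset or magnitude.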

HeuristicLab-Trac-Bot commented 6 years ago

2018-03-08 09:22:52: @LukasCamera commented


r15825: Added prebuild event.

r15827: Change implementation of symbol strings from list to array.

r15828: Implement IEquatable interface in symbols. Minor performance improvements.

HeuristicLab-Trac-Bot commented 6 years ago

2018-03-09 14:36:15: @LukasCamera commented


r15832: Fix Equals methods in Symbols. Move semantic hashing of phrases to separate class.

r15834: Store production rules in grammar instead of nonterminal symbols.

r15835: Split huge hashing function into smaller ones.

HeuristicLab-Trac-Bot commented 6 years ago

2018-03-14 11:43:59: @gkronber commented


r15840: added utility console program for clustering of expressions (work in progress)

HeuristicLab-Trac-Bot commented 6 years ago

2018-03-15 07:09:28: @gkronber commented


r15841: fixed FLANNParameters structure

HeuristicLab-Trac-Bot commented 6 years ago

2018-03-15 10:41:34: @gkronber commented


r15842: added clustering of functions and output of clusters, fixed bug in evaluation

HeuristicLab-Trac-Bot commented 6 years ago

2018-03-20 17:52:16: @LukasCamera commented


r15843: Remove duplicates in logged sentences using bash commands.

r15849: Add constants to grammar.

r15850: Remove cosine from grammar.

r15851: Remove cosine terminal symbols.

HeuristicLab-Trac-Bot commented 6 years ago

2018-03-23 18:59:22: @LukasCamera commented


r15860: Change complexity measure from number of nodes in tree to number of variable references.

r15861: Make constant optimization toggleable in algorithm.
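The difference between the two complexity measures in r15860 can be shown on a small example. The sketch assumes expression trees are nested tuples with string leaves; the tree layout and function names are illustrative only.

```python
def count_nodes(tree):
    """Complexity as the total number of nodes (operators, constants, variables)."""
    if isinstance(tree, tuple):
        return 1 + sum(count_nodes(child) for child in tree[1:])
    return 1


def count_variable_refs(tree, variables):
    """Complexity as the number of leaves that reference an input variable."""
    if isinstance(tree, tuple):
        return sum(count_variable_refs(child, variables) for child in tree[1:])
    return 1 if tree in variables else 0


if __name__ == "__main__":
    # c0 * x + sin(c1 * y), with c0 and c1 standing for constants
    tree = ("+", ("*", "c0", "x"), ("sin", ("*", "c1", "y")))
    print(count_nodes(tree), count_variable_refs(tree, {"x", "y"}))
```

Counting variable references means that adding constants or wrapping a term in a function does not increase the complexity of a sentence.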

HeuristicLab-Trac-Bot commented 6 years ago

2018-04-04 16:25:18: @LukasCamera commented


r15883: Prioritize phrases whose (fully expanded) terms result in high R².

HeuristicLab-Trac-Bot commented 6 years ago

2018-04-17 18:50:36: @LukasCamera commented


r15903: worked on cluster analysis / visualization for GPTP

r15907: Changes in search heuristic for solving Poly-10 problem. Adapt tree evaluation to cover non-terminal symbols.

r15910: Fix length parameter when prioritizing phrases, add weighting parameter to control exploration/exploitation during search, and fix copy constructors in Analyzers.
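One way such a weighting could work is sketched below. The formula is purely hypothetical and is not taken from the commits: each phrase gets a priority that blends its relative length with `1 - R²`, and a priority queue always expands the phrase with the lowest value first.

```python
import heapq


def priority(r2, length, max_length, weight):
    """Hypothetical priority (lower is expanded first).

    weight = 0 favours short phrases only (exploration of the grammar),
    weight = 1 favours phrases with high R² only (exploitation).
    """
    return weight * (1.0 - r2) + (1.0 - weight) * (length / max_length)


def expansion_order(phrases, max_length, weight):
    """Return phrase names in the order a priority queue would expand them.

    phrases is a list of (name, r2, length) tuples.
    """
    heap = [(priority(r2, length, max_length, weight), name)
            for name, r2, length in phrases]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


if __name__ == "__main__":
    phrases = [("short", 0.2, 2), ("accurate", 0.9, 8)]
    print(expansion_order(phrases, max_length=10, weight=0.0))
    print(expansion_order(phrases, max_length=10, weight=1.0))
```

With the example data, a weight of 0 expands the short phrase first and a weight of 1 expands the more accurate one first.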

HeuristicLab-Trac-Bot commented 6 years ago

2018-04-18 10:23:56: @gkronber commented


r15911: Changed initialization of SentenceLogger because started event is not called after Run(). Logging to GZipStream.

HeuristicLab-Trac-Bot commented 6 years ago

2018-04-30 20:25:27: @gkronber commented


r15924: remove obsolete code in C# program for the evaluation of sentences, switch to NMSE as quality measure. Tried plotting functions within clusters in R.

HeuristicLab-Trac-Bot commented 6 years ago

2018-06-05 14:33:21: @foolnotion commented


r15949: Fix serialization (saving the algorithm).

HeuristicLab-Trac-Bot commented 6 years ago

2018-06-05 14:35:16: @foolnotion commented


r15950: Try to use variable importance information (from a random forest) to guide the search.

HeuristicLab-Trac-Bot commented 6 years ago

2018-06-13 14:41:54: @foolnotion commented


r15957: Minor refactor; fix multiple analyzer event registration

HeuristicLab-Trac-Bot commented 6 years ago

2018-06-15 14:54:03: @foolnotion commented


r15960: Fix serialization and cloning and plugin properties.

HeuristicLab-Trac-Bot commented 6 years ago

2018-06-18 14:32:06: @foolnotion commented


r15963: Add missing storable constructors

HeuristicLab-Trac-Bot commented 6 years ago

2018-06-19 10:58:12: @foolnotion commented


r15965: Improve hashing performance (about 10% measured improvement)

HeuristicLab-Trac-Bot commented 6 years ago

2018-06-28 11:37:20: @foolnotion commented


r15974:

  • implement LRU cache for storing search nodes (needs better integration with the algorithm main loop)
  • introduce SortedSet for handling priorities (better memory usage, possibility to remove bad priorities, slight performance penalty)
  • fix serialization and cloning

HeuristicLab-Trac-Bot commented 6 years ago

2018-06-28 16:57:45: @foolnotion commented


r15975: address additional serialization issues, make Production implement IList<T> (instead of deriving from List<T>)

HeuristicLab-Trac-Bot commented 6 years ago

2018-06-30 12:54:31: @foolnotion commented


r15977: Clear search data structures at the end of the run (huge memory savings)

HeuristicLab-Trac-Bot commented 6 years ago

2018-07-03 10:55:57: @foolnotion commented


r15979: Register algorithm events after deserialization.

HeuristicLab-Trac-Bot commented 6 years ago

2018-07-03 17:07:28: @LukasCamera commented


r15981: Refactor properties to comply with .NET 4.5.2

HeuristicLab-Trac-Bot commented 6 years ago

2018-07-03 18:40:51: @foolnotion commented


r15982: Add storable constructors for analyzers

HeuristicLab-Trac-Bot commented 6 years ago

2018-07-05 15:33:38: @foolnotion commented


r15985: Simplify code in RSquaredEvaluator. Turn on linear scaling for the constant optimization evaluator.
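Linear scaling, as commonly used in symbolic regression, fits an offset and a slope to the raw model output by ordinary least squares before the error is measured. The sketch below shows that standard calculation; the function name and return convention are assumptions and are not taken from the HeuristicLab evaluator.

```python
def linear_scaling(targets, predictions):
    """Return (alpha, beta) minimising sum((t - (alpha + beta * p)) ** 2)."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_p = sum(predictions) / n
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    if var_p == 0:
        # Constant model output: the best fit is the mean of the targets.
        return mean_t, 0.0
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(targets, predictions))
    beta = cov / var_p
    alpha = mean_t - beta * mean_p
    return alpha, beta


if __name__ == "__main__":
    alpha, beta = linear_scaling([1, 3, 5, 7], [0, 1, 2, 3])
    print(alpha, beta)
```

With scaling in place, the constant optimizer no longer has to find the overall offset and magnitude of a sentence, only the constants nested inside it.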

HeuristicLab-Trac-Bot commented 6 years ago

2018-07-05 17:52:52: @foolnotion commented


r15987: Make sure to clear search data structures before returning in GrammarEnumerationAlgorithm.OnStopped()

HeuristicLab-Trac-Bot commented 6 years ago

2018-07-11 10:36:57: @foolnotion commented


r15993: Refactor code

  • move a few methods to the Grammar class
  • use a plain dictionary for storing search nodes in the SearchDataStore (instead of LRU cache)
  • make it easier to keep a consistent state between the algorithm and the evaluator (optimize constants flag)
  • track trajectories in quality/length space for best solutions
  • remove variable importances for now

HeuristicLab-Trac-Bot commented 6 years ago

2018-07-11 16:44:05: @foolnotion commented


r15994: Add symbolic regression solution to results during algorithm run and scale model.

HeuristicLab-Trac-Bot commented 6 years ago

2018-07-26 12:40:29: @foolnotion commented


r16019: Fix properties lacking implementation in Production.

HeuristicLab-Trac-Bot commented 6 years ago

2018-07-26 13:17:58: @foolnotion commented


r16022: Remove MaxSentenceLength from priority calculation for the time being.

HeuristicLab-Trac-Bot commented 6 years ago

2018-07-27 19:29:19: @foolnotion commented


r16026:

  • replace functionally-overlapping classes Production and SymbolString with a single class SymbolList
  • refactor methods from Grammar class as methods and properties of SymbolList
  • add parameter for the number of constant optimization iterations
  • refactor code

HeuristicLab-Trac-Bot commented 6 years ago

2018-08-06 16:54:33: @foolnotion commented


r16053: Refactor RSquaredEvaluator as a standalone ParameterizedNamedItem which is a parameter of the algorithm. Implement BestSolutionAnalyzer analyzer for quality statistics. Add license headers where missing.

HeuristicLab-Trac-Bot commented 6 years ago

2018-08-06 18:00:06: @foolnotion commented


r16056: Remove redundant EvaluatePhrase method in the Grammar class and fix compilation of tests.

HeuristicLab-Trac-Bot commented 6 years ago

2018-08-13 08:59:27: @foolnotion commented


r16073: Implement restarts for constant optimization in the RSquaredEvaluator

HeuristicLab-Trac-Bot commented 6 years ago

2018-08-27 19:10:48: @LukasCamera commented


r16088: Store Pareto-optimal sentences (quality/complexity) in grammar enumeration.

HeuristicLab-Trac-Bot commented 6 years ago

2018-08-28 11:29:01: @LukasCamera commented


r16090: Explicitly store all Pareto-optimal RegressionSolution objects at the end of the algorithm.
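A minimal sketch of selecting Pareto-optimal sentences over quality and complexity, assuming higher quality (for example R²) and lower complexity are better. The tuple layout `(quality, complexity, sentence)` and the function name are illustrative assumptions, not the data structures used in the commits.

```python
def pareto_front(solutions):
    """Return the solutions that no other solution dominates.

    A solution dominates another if it is at least as good in both objectives
    (quality >=, complexity <=) and strictly better in at least one.
    """
    front = []
    for cand in solutions:
        dominated = any(
            other[0] >= cand[0] and other[1] <= cand[1]
            and (other[0] > cand[0] or other[1] < cand[1])
            for other in solutions
        )
        if not dominated:
            front.append(cand)
    return front


if __name__ == "__main__":
    solutions = [
        (0.90, 3, "a"),
        (0.95, 5, "b"),
        (0.80, 4, "c"),  # worse and more complex than "a"
        (0.95, 7, "d"),  # same quality as "b" but more complex
    ]
    print(pareto_front(solutions))
```

The quadratic scan is adequate for small archives; a sorted sweep would be preferable if many sentences are stored.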

HeuristicLab-Trac-Bot commented 6 years ago

2018-09-17 16:37:30: @gkronber commented


r16151: deleted obsolete files

HeuristicLab-Trac-Bot commented 6 years ago

2018-09-20 11:13:33: @foolnotion commented


r16157:

  • Update IGrammarEnumerationEvaluator interface (add Evaluate method accepting an ISymbolicExpressionTree for the case when the constants have already been optimized in the tree, add boolean OptimizeConstants flag)

  • small refactor in GrammarEnumeration/GrammarEnumerationAlgorithm.cs

  • add unit tests

HeuristicLab-Trac-Bot commented 6 years ago

2018-09-20 13:19:02: @foolnotion commented


r16159: Refactor unit test using only C# 4.5 features.

HeuristicLab-Trac-Bot commented 6 years ago

2018-09-22 21:58:48: @foolnotion commented


r16176: Fix hashing

HeuristicLab-Trac-Bot commented 6 years ago

2018-09-28 15:24:20: @foolnotion commented


r16193: Implement new hasher (faster & less collision prone) and update unit tests.

r16194: Fix compilation errors in test :(
