Closed HeuristicLab-Trac-Bot closed 7 years ago
r14232 to r14233: created a feature branch for #2650 (support for categorical variables in symbolic regression) with a first set of changes. Work in progress.
TODO:
- handle correctly in all formatters (Smalltalk formatter and external evaluation formatter have not been adjusted)
- view for factor variables (configuration of actually allowed factors)
- create a set of unit tests for the simplifier (handle correctly in simplifier)
- extend simplifier to handle BinaryFactorVariable
- extend simplifier to combine FactorVariables with BinaryFactorVariable
- handle correctly in variable impacts view
- handle correctly in Non-linear regression (infix parser and infix formatter)
- support in all analyzers which handle variable symbols specifically
- support for pruning
- symbol for WeightedFactorVariable (instead of only 0/1)
- add an interface for variable symbols (with VariableName property)
- handle correctly in gradient views
- handle correctly in mathematical expression view
- handle correctly in ERC view (create linear regression model)
- handle correctly in symbolic classification - solution comparison
- ~~handle correctly in OneR~~
Open issues which are not strictly necessary for a first merge of the functionality:
- support string variables in data preprocessing view
- allow factor variables in decision trees (and therefore GBT)?
- allow string variables as target variables in classification algorithms
- Switch/Case symbol with one subtree for each possible factor value
- handle correctly in SymbolicDataAnalysisExpressionTreeILEmittingInterpreter and SymbolicDataAnalysisExpressionCompiledTreeInterpreter (done: tree and linear interpreter)
- support in more algorithms?
- added weight for FactorVariable (necessary for LR)
- introduced VariableBase and VariableTreeNodeBase and IVariableSymbol
- support for factors in LR
- extended variable impacts in solution view
- fixed ERC view for regression
- support for FactorVariable in simplifier
- improved support for FactorVariable in constants optimizer
- multiple related changes and small fixes
Shouldn't the variable impacts view be added as a solution view instead of an extra button?
r14240: added support for categorical variables to LDA and MNL
r14241: added support for factor variables in specific solution comparison view for symbolic classification solutions
r14242: added support for factor variables to OneR algorithm
r14248: added support for factor variables to target variation view together with Philipp
r14249: added new symbol FactorVariable (renamed previous symbol to BinaryFactorVariable). Work in progress.
- extended non-linear regression to work with factors
- fixed bugs in constants optimizer and tree interpreter
- improved simplification of factor variables
- added support for factors to ERC view
- added support for factors to solution comparison view
- activated view for all factors
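For readers unfamiliar with the two symbols, the following is a minimal sketch of one plausible reading of the distinction, based only on the notes in this ticket (a 0/1 indicator for a single value versus a weight per category). It is written in Python for brevity; the function names and the treatment of unknown categories are assumptions for illustration and are not HeuristicLab's API.

```python
# Illustrative sketch only; names are hypothetical, not HeuristicLab's API.

def eval_binary_factor(column, value, weight=1.0):
    """Weighted 0/1 indicator: `weight` where the row equals `value`, else 0."""
    return [weight if x == value else 0.0 for x in column]


def eval_factor(column, weights):
    """One weight per category value.

    Assumption: categories without a weight contribute 0.
    """
    return [weights.get(x, 0.0) for x in column]


colors = ["red", "green", "red", "blue"]
binary = eval_binary_factor(colors, "red", weight=2.0)
factor = eval_factor(colors, {"red": 2.0, "green": -1.0, "blue": 0.5})
```

Under this reading, a factor symbol with one weight per category can express anything a sum of binary indicators on the same variable can, which is presumably why the simplifier is asked to combine the two.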
r14259: added support for factor variables to the Excel formatter and Excel exporter as well as to the LaTeX formatter and, consequently, the mathematical representation view.
r14266: improved handling of factors in ConstantOptimizationEvaluator (create binary indicators only once)
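The idea of creating binary indicators only once can be sketched as a small cache keyed by variable name and category value. This is a hedged illustration in Python, not the ConstantOptimizationEvaluator code; `make_indicator_cache` and the dictionary-based dataset are hypothetical.

```python
# Illustrative sketch only; not the actual ConstantOptimizationEvaluator code.

def make_indicator_cache(dataset):
    """Return a lookup that builds each 0/1 indicator column at most once.

    `dataset` is assumed to map variable names to lists of category values.
    """
    cache = {}

    def indicator(variable, value):
        key = (variable, value)
        if key not in cache:
            # Built on first request, then reused for every later evaluation.
            cache[key] = [1.0 if x == value else 0.0 for x in dataset[variable]]
        return cache[key]

    return indicator


data = {"color": ["red", "green", "red"]}
indicator = make_indicator_cache(data)
first = indicator("color", "red")
second = indicator("color", "red")
```

Because an optimizer evaluates the same tree many times with different constants, reusing the indicator columns avoids rescanning the string column on every evaluation.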
Bugs:
- Exception when showing the simplifier view after simplification of the tree (it seems some nodes are not cloned). (r14339)
- Exception when trying to open data preprocessing view for a ProblemData object stored in a solution (#2683)
r14402: fixed a bug in constant optimizer in relation to lagged variables
Should be finished before #2697
Issue migrated from Trac ticket #2650
milestone: HeuristicLab 3.3.15 | component: Problems.DataAnalysis.Symbolic | priority: medium | resolution: done
2016-08-03 18:10:37: @gkronber created the issue