Closed HeuristicLab-Trac-Bot closed 12 years ago
I looked a bit deeper into the issue by examining the size of the serialized XML of a single run.
I took one run of the GA-TSP sample where Coordinates, DistanceMatrix, and BestSolution have not been collected. The run is copied to a run collection and saved. Then the operators are removed by a runcollection modifier, the modifier is removed and the run is saved again. These two are then compared:
- With the operators included, the file has 2029 lines, compared to just 636 when the operators are removed
- The Analyzer operator needs 1080 lines, which is 53% of the whole file
- Within the Analyzer, the biggest part is the BestAverageWorstQualityAnalyzer, which accounts for 32% of all lines
- Selector, Crossover, Manipulator, SolutionCreator, and Evaluator together take 17% of all lines
- So operators contribute about 70% of the serialized XML size of this single run
However, the contribution to file size varies greatly between the operators; the range was [40;120] lines for the non-analyzer operators. A result consisting of a single string uses just 4 lines, so there is considerable saving potential. I did not look into time differences.
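The shares quoted above can be re-derived from the raw line counts. The following is a small arithmetic check, not part of HeuristicLab; it assumes that the whole drop from 2029 to 636 lines is attributable to the removed operators.

```python
# Line counts taken from the comparison in this comment.
TOTAL_WITH_OPERATORS = 2029
TOTAL_WITHOUT_OPERATORS = 636
ANALYZER_LINES = 1080


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


analyzer_share = share(ANALYZER_LINES, TOTAL_WITH_OPERATORS)

# Assumption: every line that disappeared belonged to an operator.
operator_lines = TOTAL_WITH_OPERATORS - TOTAL_WITHOUT_OPERATORS
operator_share = share(operator_lines, TOTAL_WITH_OPERATORS)

print(f"Analyzer: {analyzer_share:.1f}%")        # 53.2%
print(f"All operators: {operator_share:.1f}%")   # 68.7%
```

The difference-based figure (about 69%) agrees with the roughly 70% obtained by adding the Analyzer share (53%) and the share of the other operators (17%).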
Unfortunately it's not as easy to implement as I thought.
CollectParameterValues is implemented in ParameterizedNamedItem, which resides in HeuristicLab.Core. To add the name of an operator instead of the whole operator would require creating a new StringValue with the name, which in turn is a type of HeuristicLab.Data. We could:
- reimplement the functionality of CollectParameterValues anew in Algorithm, Problem, and Operator, and possibly other types that are ParameterizedNamedItems and that can contain operators.
- add a method IItem GetCollectedValue() to IParameterizedItem or even IItem, which returns this by default and which is overridden in Operator to return new StringValue(Name)
- delay the ticket and merge Core and Data
Other options?
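To make the second option more concrete, here is a minimal sketch in Python rather than C#. The classes Item, StringValue, and Operator below are simplified stand-ins written for this illustration and are not the HeuristicLab types; only the idea (return the item itself by default, return its name for operators) comes from the proposal above.

```python
class Item:
    """Stand-in for IItem: by default an item is collected as itself."""

    def get_collected_value(self) -> "Item":
        return self


class StringValue(Item):
    """Stand-in for a string-valued item."""

    def __init__(self, value: str):
        self.value = value


class Operator(Item):
    """Stand-in for an operator: collected by name instead of as an object."""

    def __init__(self, name: str):
        self.name = name

    def get_collected_value(self) -> Item:
        # Only the name is stored, so the run no longer carries the operator.
        return StringValue(self.name)


# Hypothetical example values, chosen only for illustration.
plain = StringValue("some result")
crossover = Operator("Crossover")

collected_plain = plain.get_collected_value()
collected_operator = crossover.get_collected_value()
```

With this design the collecting code would not need to know about operators at all; it would call get_collected_value() on whatever value it finds.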
Thanks abeham for your thoughts on this issue. We should spend some more time on discussing how to implement this. Therefore I move this ticket to 3.3.7 for now.
r7579: I solved this issue now. I found that CollectParameterValues was too monolithic, in that you had to override and re-implement the whole method if you wanted to change just one detail (in this case, that operators are stored by their name). So I split CollectParameterValues into two separate logical parts:
- CollectParameterValues iterates over the parameters
- GetCollectedValues decides what values are collected from the given parameter value
Algorithm and Problem now override only GetCollectedValues, but reuse the implementation of the base class in that they only filter the values. When they see an IOperator, they convert it to its name instead. Using IEnumerable and yield, I think that's a nice solution.
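The split described above can be sketched as follows. This is an illustrative Python analogue and not the code committed in r7579: the classes, the dotted key format for nested parameters, and the example names (GA, OrderCrossover, Probability, PopulationSize) are assumptions made for this sketch. Python generators play the role of IEnumerable and yield.

```python
from typing import Any, Dict, Iterator, List, Optional, Tuple


class Parameter:
    """Stand-in for a value parameter: a name plus a value."""

    def __init__(self, name: str, value: Any):
        self.name = name
        self.value = value


class ParameterizedItem:
    """Stand-in for ParameterizedNamedItem."""

    def __init__(self, name: str, parameters: Optional[List[Parameter]] = None):
        self.name = name
        self.parameters = list(parameters or [])

    def collect_parameter_values(self, values: Dict[str, Any]) -> None:
        # Part 1: only iterates over the parameters and stores what it is given.
        for param in self.parameters:
            for key, value in self.get_collected_values(param):
                full_key = param.name if key == "" else f"{param.name}.{key}"
                values[full_key] = value

    def get_collected_values(self, param: Parameter) -> Iterator[Tuple[str, Any]]:
        # Part 2: decides which values are collected for one parameter.
        if param.value is None:
            return
        yield "", param.value
        if isinstance(param.value, ParameterizedItem):
            children: Dict[str, Any] = {}
            param.value.collect_parameter_values(children)
            yield from children.items()


class Operator(ParameterizedItem):
    """Stand-in for an operator."""


class Algorithm(ParameterizedItem):
    """Overrides only the second part and filters the base class output."""

    def get_collected_values(self, param: Parameter) -> Iterator[Tuple[str, Any]]:
        for key, value in super().get_collected_values(param):
            # Operators are replaced by their name; everything else passes through.
            yield key, (value.name if isinstance(value, Operator) else value)


# Hypothetical usage example.
crossover = Operator("OrderCrossover", [Parameter("Probability", 0.9)])
ga = Algorithm("GA", [Parameter("PopulationSize", 100),
                      Parameter("Crossover", crossover)])
values: Dict[str, Any] = {}
ga.collect_parameter_values(values)
```

In this sketch the override is a pure filter over the base class generator, so the operator's own parameters are still collected alongside its name.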
Please forward this to swagner when you've found no further issues.
I found a bug where parameters of the operators were not added to the run if the operator itself was not added to the run.
Thanks for implementing this enhancement.
Issue migrated from trac ticket # 1695
milestone: HeuristicLab 3.3.7 | component: Optimization | priority: medium | resolution: done
2011-12-05 14:53:25: @abeham created the issue