PastelBelem8 / ADOPT.jl

This is the result of my master's thesis on Multi-Objective Optimization. This repository focuses on Pareto-based optimization rather than Single-Objective optimization with preference articulation. We target time-consuming optimization routines and, as a result, rely on model-based methods to achieve faster convergence. This is relevant for Architectural Design Optimization, which depends on time-intensive simulations (e.g., minutes, hours, or even days to complete a single simulation).
GNU General Public License v3.0

Define the processing information necessary throughout a typical optimization process #5

Open PastelBelem8 opened 6 years ago

PastelBelem8 commented 6 years ago

When performing optimization, we collect different measurements. A careful review of the involved metrics and of the information that must be gathered should be conducted.

This should be documented properly.

For example, important information to collect includes:

  • the time each evaluation took, which allows monitoring for problems in the simulator/model or even providing feedback that the process is still running;
  • the set of parameter values that generated a specific design. ...
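The two bullets above can be collected together in a single per-evaluation record. A minimal sketch (in Python, for illustration only; the class and field names are hypothetical and not part of ADOPT.jl's API):

```python
import time


class EvaluationLog:
    """Collects one record per evaluation: the parameter values that
    generated the design, the resulting objective values, and the
    wall-clock duration of the evaluation."""

    def __init__(self):
        self.records = []

    def evaluate(self, objective, params):
        # Time the (possibly very slow) simulation/model call.
        start = time.perf_counter()
        values = objective(params)
        elapsed = time.perf_counter() - start
        self.records.append({"params": params, "values": values, "time": elapsed})
        return values

    def mean_time(self):
        """Average evaluation time so far; useful for progress feedback."""
        if not self.records:
            return None
        return sum(r["time"] for r in self.records) / len(self.records)


# Usage with a toy bi-objective function standing in for a real simulator:
log = EvaluationLog()
log.evaluate(lambda p: (p[0] ** 2, (p[0] - 2) ** 2), [1.0])
```

Because every record keeps the generating parameters alongside the objective values, any design can later be replicated for verification.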
PastelBelem8 commented 6 years ago

When structuring an optimization workflow, there are some indicators that should be collected to guarantee that the optimization process is evolving appropriately. The collection of such indicators is even more relevant in an interactive and dynamic optimization environment. This chapter focuses on the analysis of the data that should be collected throughout the optimization.

When solving optimization problems we collect both qualitative and quantitative metrics. Evaluating the performance of optimization algorithms only makes sense when benchmarking several algorithms against each other, since we need a term of comparison. Therefore, these quality indicators should only be computed after the algorithms have produced their results, not during the optimization. Moreover, since some of these indicators require a reference set, ideally we would supply the true Pareto Front; however, this is usually hard to obtain, so one idea is to combine the results of all the benchmarked algorithms and use them to build a better approximation of the true Pareto Front. This is particularly useful in the context of architectural problems, where it is usually impossible to know the shape of the true Pareto Front.
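Building that reference set amounts to merging every algorithm's results and keeping only the non-dominated points. A minimal sketch (Python for illustration; function names are hypothetical, assuming minimization on all objectives):

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization):
    a is no worse on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def combined_front(*result_sets):
    """Merge the objective vectors produced by several benchmarked
    algorithms and keep only the non-dominated ones, yielding a better
    approximation of the true Pareto Front to use as a reference set."""
    merged = [p for rs in result_sets for p in rs]
    return [p for p in merged if not any(dominates(q, p) for q in merged)]


# Results from two hypothetical algorithms on a bi-objective problem:
alg_a = [(1.0, 4.0), (2.0, 3.0), (5.0, 5.0)]
alg_b = [(1.5, 3.5), (3.0, 1.0)]
front = combined_front(alg_a, alg_b)
# (5.0, 5.0) is dominated by (2.0, 3.0) and is filtered out;
# the remaining four points are mutually non-dominated.
```

This naive filter is quadratic in the number of points, which is acceptable here because it runs once, after all benchmarked runs have finished.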

| Factor | Pros | Cons |
| --- | --- | --- |
| Evaluation time | Allows estimating the expected time per analysis and informs the user about the time needed to complete an analysis. | Requires maintaining a history of the last evaluations, which implies more memory allocation and more time (although the time complexity is, in general, insignificant compared to the analysis itself). |
| | If an analysis takes too long, or gradually becomes slower, it helps identify bugs or, at the limit, reveals something about the complexity of the model (e.g., daylight analyses take longer in sparser environments with more holes). | How many records should be kept? Where should this information appear? |
| | | Must define which evaluations it considers: should model-based evaluations count as well? |
| Evaluation number | In terms of the algorithms, it is usually the best metric to account for. | Usually not linear in time, and therefore not an appropriate measure of elapsed time. |
| | Gives a reasonably good sense of the convergence of different algorithms. | Must define which evaluations it considers: should model-based evaluations count as well? |
| | The user knows how many evaluations are left for the algorithm to run. | Usually only relevant in the context of tests/benchmarks. |
| Best solution(s) | Gives the user real feedback on the best solution found so far. | How many solutions should be presented? |
| | Providing a group of the best solutions would be better (as long as they are sufficiently different). | Which clustering algorithm should be used? (Only applicable to the population, unless we keep track of all solutions found so far, which implies more memory and more computation time.) |
| | Computation time becomes irrelevant in the context of architectural design. | Should the user be able to steer the evolution of the optimization process by choosing their own variations? |
| | Very good feedback when the user can visualize the variations by interactively selecting the best solutions. | |
| Decision variable values | Important to allow replication of the results (in case verification is needed). | Occupies space; it can be stored on disk (assuming there is enough space), which implies read/write overheads when accessing these values. |
| Constraint violations / feasibility | Important to verify whether any solutions violate constraints and, possibly, by how much. Users might learn rules or detect errors by visualizing the solutions that violate constraints. | Occupies space; it can be stored on disk (assuming there is enough space), which implies read/write overheads when accessing these values. |
| Objective values | Important to maintain the mapping between the decision variables' values and the associated objective values. | Occupies space; it can be stored on disk (assuming there is enough space), which implies read/write overheads when accessing these values. |
| Average quality of the population | Depending on the algorithm, it is possible to use metrics to measure performance. | Which metric should be used? Should we maintain a history (apart from the archive) and an approximation that gets incrementally closer to the true Pareto Front? |
| | Allows finer-grained feedback on the performance of the optimization run. | If we plan to integrate with other libraries, it might be hard to accomplish. |
| | | In highly randomized problems it usually does not provide great feedback in the early stages. |
| Average quality of the surrogate model | TODO: read Chapter 3 (Surrogate Models) on "SMF and Improvement method"; there are two ways to choose the candidate solutions (those that maximize the improvement, or those that are the best in the current model). | Decide which metric to use and how to measure it. |
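The table's first concern, keeping an evaluation-time history without unbounded memory growth, can be addressed with a fixed-size buffer of recent durations. A sketch under that assumption (Python for illustration; the class is hypothetical, not part of ADOPT.jl):

```python
from collections import deque


class TimeEstimator:
    """Keeps only the last `maxlen` evaluation durations (bounding the
    memory cost the table's Cons column warns about) and extrapolates
    the time remaining from the recent average."""

    def __init__(self, maxlen=50):
        # deque with maxlen silently discards the oldest entry when full.
        self.times = deque(maxlen=maxlen)

    def record(self, seconds):
        self.times.append(seconds)

    def remaining(self, evaluations_left):
        """Estimated seconds until completion, or None before any data."""
        if not self.times:
            return None
        avg = sum(self.times) / len(self.times)
        return avg * evaluations_left


# With maxlen=3, recording 2.0, 4.0, 6.0, 8.0 keeps only (4.0, 6.0, 8.0),
# so the recent average is 6.0 s and 10 remaining evaluations ≈ 60.0 s.
est = TimeEstimator(maxlen=3)
for t in (2.0, 4.0, 6.0, 8.0):
    est.record(t)
```

Using only the most recent durations also makes the estimate track gradual slowdowns of the model, which is exactly the drift the table suggests monitoring.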
PastelBelem8 commented 5 years ago

Other important information to have is: