bubakazouba / autoauto


Scoring based on an optimizer #26

Closed asultan123 closed 3 years ago

asultan123 commented 3 years ago

Motivation: Throughout the development of this project we will have a plethora of models and metrics for scoring the patterns the user is attempting to repeat, and cycle suggestion is decoupled from any underlying model/metric. Here a model means a blob of code that can score suggested cycles probabilistically or through some other well-defined scoring mechanism. A metric means a property of a suggested cycle that is not tied to any model's scoring mechanism, e.g. suggested cycle length, order, or position in the cycle hierarchy (is it a cycle containing mini-cycles that themselves contain mini-cycles, or just a sequence that repeats with very little sub-repetition?).

If the user's input is assumed to have either 1) some degree of inaccuracy due to user error or 2) some degree of fuzziness due to context-specific actions/DOM states, then our models can never be 100% accurate. To assess the performance of our models/metrics we need some cost function (1). Additionally, using a single model/metric without any aggregation wouldn't make sense. To get around this we can create a generic expression, the aggregate scoring function, that takes into account the outputs of whatever models/metrics we think are promising and evaluates an aggregate score for the cycles suggested by each model. This aggregate scoring function is bound to be imperfect, and if some underlying models have output scores that depend on internal tunable parameters, those models will be imperfect as well. Let's assume the simplest case, where the aggregate scoring function is just a linear expression that combines normalized metric/model outputs into one score value; the constants in that expression are the tunable parameters.

Suggested high-level algorithm: Tune an aggregate polynomial scoring function that takes in the different normalized metric/model scores and multiplies each by a modifiable constant. Tuning these constants improves the scoring function and lets us gather data on which metrics/models consistently suggest cycles that are close to the target.
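As a concrete illustration, a minimal sketch of the linear form of that scoring function could look like the following (the metric names and weight values here are hypothetical, not part of the existing codebase):

```python
from typing import Dict


def aggregate_score(normalized_scores: Dict[str, float],
                    weights: Dict[str, float]) -> float:
    """Combine normalized metric/model outputs into one score.

    Each entry in `normalized_scores` is assumed to lie in [0, 1];
    `weights` holds the tunable constants the optimizer will adjust.
    """
    return sum(weights[name] * value for name, value in normalized_scores.items())


# Example with two hypothetical inputs: a model probability and a cycle-length metric.
score = aggregate_score(
    {"repeat_model": 0.8, "cycle_length": 0.4},
    {"repeat_model": 1.5, "cycle_length": 0.7},
)
```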

Scoping: The following functions are needed (see the sketch after this list):

- An aggregate scoring function that takes a single suggested cycle's metric/model scores and a set of constants.
- A cycle_eval function that runs the scoring function on each suggested cycle's metric/model scores.
- A cycle_rank function that takes all suggested cycles, runs cycle_eval on them, and returns the top-n cycles based on the aggregate score.
- A loss function that takes arbitrary cycles and evaluates the edit distance between them and some target.
- Finally, an objective function that takes the suggested cycles, runs cycle_rank on them to get the top-n cycles, evaluates the loss on each, and returns the average loss over the top-n suggested cycles.

Optuna can then take the objective function and minimize the loss by manipulating the scoring-function constants. These constants are ultimately weights that represent the importance of individual models/metrics. For now, the tunable parameters will be limited to the scoring function; in the future, model hyperparameters can be included as part of the overall optimization loop. Optimization will be limited to offline only (no optimizer in the loop) and will be run on the available unit tests.
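A minimal offline sketch of that pipeline, assuming each suggested cycle is a dict carrying per-metric/model normalized scores plus a flat list of actions (the data shapes, metric names, weight ranges, and fixtures below are assumptions for illustration, not existing code):

```python
import optuna

METRIC_NAMES = ["repeat_model", "cycle_length"]  # hypothetical metric/model names


def edit_distance(a, b):
    """Plain Levenshtein distance between two action sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]


def cycle_eval(cycles, weights):
    """Run the aggregate scoring function on every suggested cycle."""
    return [(sum(weights[m] * c["scores"][m] for m in METRIC_NAMES), c)
            for c in cycles]


def cycle_rank(cycles, weights, n):
    """Return the top-n suggested cycles by aggregate score."""
    ranked = sorted(cycle_eval(cycles, weights), key=lambda sc: sc[0], reverse=True)
    return [c for _, c in ranked[:n]]


def loss(cycle, target):
    """Edit distance between a suggested cycle's actions and the target cycle."""
    return edit_distance(cycle["actions"], target)


def make_objective(suggested_cycles, target, n=3):
    def objective(trial):
        # The tunable parameters are the scoring-function constants (weights).
        weights = {m: trial.suggest_float(f"w_{m}", 0.0, 1.0) for m in METRIC_NAMES}
        top = cycle_rank(suggested_cycles, weights, n)
        return sum(loss(c, target) for c in top) / len(top)
    return objective


if __name__ == "__main__":
    # Tiny fabricated fixtures purely to show the call pattern.
    target_cycle = ["click:#add", "type:#qty", "click:#save"]
    suggested_cycles = [
        {"scores": {"repeat_model": 0.9, "cycle_length": 0.6},
         "actions": ["click:#add", "type:#qty", "click:#save"]},
        {"scores": {"repeat_model": 0.4, "cycle_length": 0.8},
         "actions": ["click:#add", "click:#save"]},
    ]
    study = optuna.create_study(direction="minimize")
    study.optimize(make_objective(suggested_cycles, target_cycle, n=2), n_trials=20)
    print(study.best_params)
```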