georgeslabreche / opssat-orbitai

The OrbitAI app for the OPS-SAT spacecraft.

Generalize OrbitAI #18

Open georgeslabreche opened 3 years ago

georgeslabreche commented 3 years ago

Overview

Generalize the OrbitAI app so that experimenters can:

  1. Select which training methods to use.
  2. Configure the hyperparameters of the selected training methods.
  3. List which data pool parameters to fetch as training inputs.
  4. Evaluate custom transformation functions.
  5. Define how the target output is calculated.

Proposed Configuration

Here's an example of configuring OrbitAI to train with AROW and NHERD in a 5D input space built from 2 data pool parameters and 3 transformation functions.

# Experiment mode
esa.mo.nmf.apps.OrbitAI.mode=train

# Number of iterations the experiment should last
esa.mo.nmf.apps.OrbitAI.iterations=1000

# Time interval between 2 iterations in seconds
esa.mo.nmf.apps.OrbitAI.interval=5

# A Method for Stochastic Optimization.
esa.mo.nmf.apps.OrbitAI.ADAM=0

# Adaptive Subgradient Methods for Online Learning and Stochastic Optimization.
esa.mo.nmf.apps.OrbitAI.ADAGRAD_RDA=0
esa.mo.nmf.apps.OrbitAI.ADAGRAD_RDA.hparam.eta=0.1
esa.mo.nmf.apps.OrbitAI.ADAGRAD_RDA.hparam.lambda=0.000001

# Adaptive Regularization of Weight Vectors.
esa.mo.nmf.apps.OrbitAI.AROW=1
esa.mo.nmf.apps.OrbitAI.AROW.hparam.r=0.8

# Exact Soft Confidence-Weighted Learning.
esa.mo.nmf.apps.OrbitAI.SCW=0
esa.mo.nmf.apps.OrbitAI.SCW.hparam.eta=0.95

# Normal Herd (Learning via Gaussian Herding) with full diagonal covariance.
esa.mo.nmf.apps.OrbitAI.NHERD=1
esa.mo.nmf.apps.OrbitAI.NHERD.hparam.c=0.1
esa.mo.nmf.apps.OrbitAI.NHERD.hparam.diagonal=0

# Passive Aggressive. Three variants: PA, PA-I, PA-II.
esa.mo.nmf.apps.OrbitAI.PA=0
esa.mo.nmf.apps.OrbitAI.PA.hparam.variant=0
esa.mo.nmf.apps.OrbitAI.PA.hparam.c=0.1

# OBSW parameters for which the publishing will be enabled in NMF supervisor
esa.mo.nmf.apps.OrbitAI.inputs=CADC0894,EXP0,EXP1,EXP2,CADC0884

# Check validity flag for CADC0894 (here I am assuming the validity flag data pool parameter for CADC0894 is CADC0895).
esa.mo.nmf.apps.OrbitAI.inputs.CADC0894.validity=CADC0895

# Basic value check for the CADC0894 data pool parameter.
esa.mo.nmf.apps.OrbitAI.inputs.CADC0894.range.min=0
esa.mo.nmf.apps.OrbitAI.inputs.CADC0894.range.max=1.57

# Don't check the validity flag for CADC0884 (either we don't want to, or such a flag does not exist).
esa.mo.nmf.apps.OrbitAI.inputs.CADC0884.validity=0

# Basic value check for the CADC0884 data pool parameter.
esa.mo.nmf.apps.OrbitAI.inputs.CADC0884.range.min=0
esa.mo.nmf.apps.OrbitAI.inputs.CADC0884.range.max=1.57

# Define the EXP0 transformation expression.
esa.mo.nmf.apps.OrbitAI.exp.EXP0=x^2
esa.mo.nmf.apps.OrbitAI.exp.EXP0.vars=x
esa.mo.nmf.apps.OrbitAI.exp.EXP0.var.x=CADC0894
esa.mo.nmf.apps.OrbitAI.exp.EXP0.var.x.validity=0
esa.mo.nmf.apps.OrbitAI.exp.EXP0.var.x.range.min=0
esa.mo.nmf.apps.OrbitAI.exp.EXP0.var.x.range.max=1.57

# Define the EXP1 transformation expression.
esa.mo.nmf.apps.OrbitAI.exp.EXP1=x^3
esa.mo.nmf.apps.OrbitAI.exp.EXP1.vars=x
esa.mo.nmf.apps.OrbitAI.exp.EXP1.var.x=CADC0894
esa.mo.nmf.apps.OrbitAI.exp.EXP1.var.x.validity=0
esa.mo.nmf.apps.OrbitAI.exp.EXP1.var.x.range.min=0
esa.mo.nmf.apps.OrbitAI.exp.EXP1.var.x.range.max=1.57

# Define the EXP2 transformation expression.
# We can also set a variable that's not part of the training inputs.
esa.mo.nmf.apps.OrbitAI.exp.EXP2=sin(x)*cos(y)
esa.mo.nmf.apps.OrbitAI.exp.EXP2.vars=x,y
esa.mo.nmf.apps.OrbitAI.exp.EXP2.var.x=CADC0892
esa.mo.nmf.apps.OrbitAI.exp.EXP2.var.x.validity=0
esa.mo.nmf.apps.OrbitAI.exp.EXP2.var.x.range.min=0
esa.mo.nmf.apps.OrbitAI.exp.EXP2.var.x.range.max=1.57
esa.mo.nmf.apps.OrbitAI.exp.EXP2.var.y=CADC0894
esa.mo.nmf.apps.OrbitAI.exp.EXP2.var.y.validity=0
esa.mo.nmf.apps.OrbitAI.exp.EXP2.var.y.range.min=0
esa.mo.nmf.apps.OrbitAI.exp.EXP2.var.y.range.max=1.57

# MOCHI

# Port at which the Mochi server will be listening
esa.mo.nmf.apps.OrbitAI.mochi.port=9999
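A minimal sketch of how the app could read such a configuration with `java.util.Properties` and discover which training methods are enabled, along with their hyperparameters. The property prefix and method names follow the proposed configuration above; the class and method names here are hypothetical, not an existing OrbitAI API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class TrainingMethodConfig {
    static final String PREFIX = "esa.mo.nmf.apps.OrbitAI.";
    static final String[] METHODS = {"ADAM", "ADAGRAD_RDA", "AROW", "SCW", "NHERD", "PA"};

    /** Returns the names of the methods whose enable flag is set to 1. */
    public static List<String> enabledMethods(Properties props) {
        List<String> enabled = new ArrayList<>();
        for (String method : METHODS) {
            if ("1".equals(props.getProperty(PREFIX + method, "0").trim())) {
                enabled.add(method);
            }
        }
        return enabled;
    }

    /** Reads a hyperparameter, falling back to a default when not configured. */
    public static double hparam(Properties props, String method, String name, double dflt) {
        String value = props.getProperty(PREFIX + method + ".hparam." + name);
        return value == null ? dflt : Double.parseDouble(value.trim());
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.load(new StringReader(
            "esa.mo.nmf.apps.OrbitAI.AROW=1\n"
          + "esa.mo.nmf.apps.OrbitAI.AROW.hparam.r=0.8\n"
          + "esa.mo.nmf.apps.OrbitAI.NHERD=1\n"
          + "esa.mo.nmf.apps.OrbitAI.SCW=0\n"));
        System.out.println(enabledMethods(props)); // [AROW, NHERD]
        System.out.println(hparam(props, "AROW", "r", 1.0)); // 0.8
    }
}
```

Unknown methods would simply never be looked up, so a config with only the flags an experimenter cares about stays valid.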

Notes on the proposed features

1. Select which training methods to use.

2. Configure the hyperparameters of the selected training methods.

These two are pretty self-explanatory: each method gets an enable flag (0 or 1) and, where applicable, hparam.* entries.

3. List which data pool parameters to fetch as training inputs.

The list can include both data pool parameter names and unique keys that link to custom transformation expressions, e.g. EXP0, EXP1, EXP2...
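One way to split such a mixed list is to treat a name as an expression key whenever a matching `exp.<NAME>` property defines a transformation for it. A hypothetical sketch, reusing the property names from the proposed configuration:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class InputListParser {
    static final String PREFIX = "esa.mo.nmf.apps.OrbitAI.";

    /** Splits OrbitAI.inputs into plain data pool parameters and expression keys. */
    public static Map<String, List<String>> splitInputs(Properties props) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        result.put("params", new ArrayList<>());
        result.put("expressions", new ArrayList<>());
        String csv = props.getProperty(PREFIX + "inputs", "");
        for (String name : csv.split(",")) {
            name = name.trim();
            if (name.isEmpty()) continue;
            // A name is an expression key if an exp.<NAME> property exists for it.
            boolean isExpression = props.getProperty(PREFIX + "exp." + name) != null;
            result.get(isExpression ? "expressions" : "params").add(name);
        }
        return result;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(PREFIX + "inputs", "CADC0894,EXP0,EXP1,EXP2,CADC0884");
        props.setProperty(PREFIX + "exp.EXP0", "x^2");
        props.setProperty(PREFIX + "exp.EXP1", "x^3");
        props.setProperty(PREFIX + "exp.EXP2", "sin(x)*cos(y)");
        System.out.println(splitInputs(props));
        // {params=[CADC0894, CADC0884], expressions=[EXP0, EXP1, EXP2]}
    }
}
```

This avoids reserving a naming scheme like an `EXP` prefix for expression keys; the config itself decides which names are expressions.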

~~### 4. Evaluate custom transformation functions. Use this super duper awesome Java library: https://www.baeldung.com/java-evaluate-math-expression-string~~

5. Define how the target output is calculated.

This is a bit tricky because it can involve conditional logic. We could use a plugin architecture with pf4j: https://github.com/pf4j/pf4j
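A sketch of what the experimenter-facing contract could look like. With pf4j the interface would extend `org.pf4j.ExtensionPoint` and the implementation would carry an `@Extension` annotation; the names below (and the example labelling rule) are purely illustrative, not part of any existing OrbitAI API.

```java
import java.util.Map;

public class TargetOutputDemo {

    /** Contract a plugin implements to compute the training label. */
    interface TargetOutputCalculator {
        /**
         * @param inputs the fetched parameter values, keyed by name
         * @return the target label, e.g. 1 = positive class, 0 = negative class
         */
        int calculate(Map<String, Double> inputs);
    }

    /** Example plugin: label is 1 when the angle falls inside a fixed range. */
    static class AngleThresholdCalculator implements TargetOutputCalculator {
        @Override
        public int calculate(Map<String, Double> inputs) {
            double angle = inputs.getOrDefault("CADC0894", 0.0);
            return (angle >= 0.0 && angle < 1.57) ? 1 : 0;
        }
    }

    public static void main(String[] args) {
        TargetOutputCalculator calc = new AngleThresholdCalculator();
        System.out.println(calc.calculate(Map.of("CADC0894", 0.8))); // 1
        System.out.println(calc.calculate(Map.of("CADC0894", 2.0))); // 0
    }
}
```

Since the plugin receives the full input map, any conditional logic (multiple parameters, thresholds, combinations) stays entirely on the experimenter's side.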

georgeslabreche commented 3 years ago

I like @TanguySoto's idea of ditching the need for point 4 by making all that logic part of the plugin that would be provided by experimenters (point 5).

georgeslabreche commented 3 years ago

@TanguySoto check this: instead of using an anonymous class as a received-aggregation listener like in exp167: https://gitlab.esa.int/OPS-SAT/sepp/exp167/-/blob/39173c40c5776d8acf62ffd0a5c70e125f445102/src/main/java/esa/mo/nmf/apps/Exp167DataHandler.java#L168-180

We can implement the file-writing logic within the listener, in a dedicated class, like in exp171: https://github.com/georgeslabreche/opssat-datapool-param-dispatcher/blob/6e7bcb9f1dd8f5329c86c0d61b900ea6cf60b818/src/main/java/esa/mo/nmf/apps/AggregationWriter.java#L102-L157

This eliminates the need for a locking mechanism as well as the need to poll for data: the listener takes care of all the writing when it is automatically triggered. I am referring to this type of polling in exp167: https://gitlab.esa.int/OPS-SAT/sepp/exp167/-/blob/39173c40c5776d8acf62ffd0a5c70e125f445102/src/main/java/esa/mo/nmf/apps/Exp167SamplingHandler.java#L457
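The push-based design can be sketched minimally as follows. The `AggregationListener` interface below is hypothetical (in the app it would be the NMF consumer's data-received callback); the point is that a dedicated listener class does all the writing when invoked, so nothing else touches the file and no lock or polling loop is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ListenerDemo {

    /** Stand-in for the provider's data-received callback interface. */
    interface AggregationListener {
        void onDataReceived(Map<String, Double> values);
    }

    /** Dedicated listener class that appends each received sample. */
    static class AggregationWriter implements AggregationListener {
        final List<String> lines = new ArrayList<>(); // stand-in for a CSV file

        @Override
        public void onDataReceived(Map<String, Double> values) {
            // Only the provider's callback thread ever calls this, so there is
            // no concurrent access to the output and no lock is required.
            lines.add(values.toString());
        }
    }

    public static void main(String[] args) {
        AggregationWriter writer = new AggregationWriter();
        writer.onDataReceived(Map.of("CADC0894", 0.8));
        System.out.println(writer.lines.size()); // 1
    }
}
```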