proteus-h2020 / proteus-solma


Implement Aggregating Algorithm #5

Open VenturaDelMonte opened 7 years ago

VenturaDelMonte commented 7 years ago

Link to original paper: http://eprints.bournemouth.ac.uk/24798/

JamilWaqas commented 7 years ago

The code related to the paper can be found on the following link: https://github.com/waqasjamyl/Rcode

VenturaDelMonte commented 7 years ago

We went through your paper and source code. The algorithm is easy to understand and its implementation is quite straightforward. However, there could be a few issues, given the asynchronous nature of our data streams. The way we see it, each expert produces predictions at a different speed, and the true outcomes arrive with some delay; an example is shown in the attached picture. Conversely, the AA algorithm proposed in your paper expects to receive all the expert predictions and the true outcome at the same time. Therefore, we propose to implement the algorithm in this way:

What do you think about that, @waqasjamyl? Please feel free to comment and correct us. Note that this twist is not introduced by the use of any particular dataflow processing engine; it is due to the asynchrony of the streams.

JamilWaqas commented 7 years ago

I don't understand what you mean by the following

each time we get a new expert prediction, we output it if the weight of that expert is the max

Since each expert is given the same input, it is fine if the true outcome for, say, time t arrives after t+1 observations: all the experts will receive the t-th observation once they have processed the observations available so far.

Please note that all the experts are given the true outcome at time t and then predict for time t+1. AA merges all the experts' predictions at time t and predicts the true outcome for time t+1 based on their weights.
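For reference, a minimal sketch of this synchronous round, assuming the standard AA exponential weight update with squared loss; the weighted average is only a simplified stand-in for the paper's substitution function, and all identifiers are hypothetical:

```scala
object SyncAAStep {
  // One synchronous AA round: all expert predictions and the true outcome
  // for the same instance are available together. The weighted average is a
  // simplified stand-in for the paper's substitution function; eta is given.
  def step(weights: Array[Double],
           predictions: Array[Double],
           outcome: Double,
           eta: Double): (Double, Array[Double]) = {
    // Merge the expert predictions according to their normalized weights.
    val total  = weights.sum
    val merged = weights.zip(predictions).map { case (w, p) => w / total * p }.sum
    // Standard AA-style exponential update using the squared loss.
    val updated = weights.zip(predictions).map { case (w, p) =>
      w * math.exp(-eta * math.pow(p - outcome, 2))
    }
    (merged, updated)
  }
}
```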

VenturaDelMonte commented 7 years ago

Sorry, I meant that each time we get a new expert prediction, we output the result of the substitution function calculated using the current weights. However, I could not figure out whether the substitution function you propose can actually work with partial expert predictions. Furthermore, your R code assumes that the min and max of the outcomes are known in advance, which is infeasible in a typical streaming scenario; I wonder whether we could calculate both min and max incrementally.
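To make the proposed twist concrete, here is a minimal sketch (not the actual SOLMA operator) that emits an aggregate on every expert prediction using the current weights, and updates the weights when the true outcome arrives; the event types and the weighted-average stand-in for the substitution function are assumptions:

```scala
import scala.collection.mutable

// Hypothetical stream events: an expert prediction or a true outcome.
sealed trait Event
final case class ExpertPrediction(expertId: Int, value: Double) extends Event
final case class TrueOutcome(value: Double) extends Event

class AsyncAA(numExperts: Int, eta: Double) {
  private val weights = Array.fill(numExperts)(1.0 / numExperts)
  // Latest prediction seen per expert; partial until every expert reported.
  private val latest = mutable.Map.empty[Int, Double]

  def onEvent(e: Event): Option[Double] = e match {
    case ExpertPrediction(id, v) =>
      latest(id) = v
      // Aggregate over the partial set of predictions with current weights
      // (weighted average as a substitution-function stand-in).
      val ws = latest.keys.map(weights(_)).sum
      Some(latest.map { case (i, p) => weights(i) / ws * p }.sum)
    case TrueOutcome(y) =>
      // Update the weights of the experts whose predictions we have seen.
      latest.foreach { case (i, p) =>
        weights(i) *= math.exp(-eta * math.pow(p - y, 2))
      }
      latest.clear()
      None
  }
}
```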

alirezarm commented 7 years ago

Let E_{1,i}, E_{2,i}, ..., E_{n,i} be the predictions of the n experts in the model for instance i, and let O_i be the corresponding outcome. In a sensor streaming environment, we will most probably receive each E_{j,i} at a different time. Instance i (you can imagine it as a row in an expert table) is complete when we have received all E_{j,i} and the corresponding O_i. Upon receiving O_i, we can update the weights of all experts using the substitution function.

Now imagine we run the algorithm and receive E_{j,1} (1 <= j <= n), the expert predictions of the first instance. The only prediction we can make so far is the one based on the uniform weight initialization. We can update the weights upon receiving O_1. Now imagine we start receiving E_{j,2}. Before receiving O_2, the best prediction per expert that we can make is the one from the previous step (what we get by solving the substitution function on instance 1). After receiving O_2, the weights are updated again, and so on.
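A sketch of this buffering scheme, under the assumption that each event carries a hypothetical instance id: predictions E_{j,i} are held until the matching O_i arrives, at which point the completed row is released so the caller can run the weight update:

```scala
import scala.collection.mutable

// Buffer expert predictions per instance i until the outcome O_i arrives.
class InstanceBuffer {
  private val pending = mutable.Map.empty[Long, mutable.Map[Int, Double]]

  def addPrediction(instance: Long, expertId: Int, value: Double): Unit =
    pending.getOrElseUpdate(instance, mutable.Map.empty)(expertId) = value

  // Returns the completed row E_{1,i}..E_{n,i} together with O_i,
  // so the caller can update the weights via the substitution function.
  def addOutcome(instance: Long, outcome: Double): Option[(Map[Int, Double], Double)] =
    pending.remove(instance).map(preds => (preds.toMap, outcome))
}
```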

JamilWaqas commented 7 years ago

Yes, min and max are an issue, but in some cases the data is strictly between 0 and 1. Also, in some cases the user already knows the possible maximum and minimum. One can specify the min and max values approximately; however, the results may not be as good as with the exact values.

The algorithm will not work with partial expert predictions. Imagine that all experts are algorithms which take the same time to calculate their predictions and receive the same input. Would asynchrony still be an issue then?

I would just like to add the following:

Our experts make predictions using the true outcomes as well. Imagine a set of ARMA(1,1), ARMA(2,1), ..., ARMA(n,m) models as our experts. Each expert processes the true outcome at time t and predicts the (t+1)-th observation; AA then merges all their predictions according to their weights, which in some cases is better than averaging, i.e., one could say that the average of N experts at time t is the prediction for time t+1, but one may use AA instead. Sometimes the merged result is even better than the best expert. Also, it is not possible to know the order of the ARMA model in advance, so AA solves that problem.

alirezarm commented 7 years ago

@waqasjamyl : what if we build the min and max of the outcomes incrementally, i.e., each time we receive a new outcome we compare it with the current min and max of the outcomes seen so far and update them if needed? This is in line with the intrinsic nature of data streams: we cannot possibly know the result of a computation over the whole stream before actually seeing all of it. As you mentioned, one can make a guess, but the guess should be backed by domain knowledge, which we cannot assume is always available beforehand.
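A minimal sketch of the incremental min/max idea, with infinite seed values that are overwritten by the first observed outcome:

```scala
// Running min/max over the outcome stream, updated on every new outcome.
class RunningRange {
  private var minSeen = Double.PositiveInfinity
  private var maxSeen = Double.NegativeInfinity

  def update(outcome: Double): Unit = {
    if (outcome < minSeen) minSeen = outcome
    if (outcome > maxSeen) maxSeen = outcome
  }

  // Defined only after at least two distinct outcomes have been seen.
  def range: Option[Double] =
    if (maxSeen > minSeen) Some(maxSeen - minSeen) else None
}
```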

JamilWaqas commented 7 years ago

The problem is that the algorithm uses the min and max in the substitution function and to determine the optimal value of the learning rate eta. If I understand correctly, what you are suggesting would require two passes: one pass to calculate the min and max, and a second pass to use them.

alirezarm commented 7 years ago

@waqasjamyl : No, we do not need two passes. We start min and max with some default values, and each time we get a new outcome we recompute them.

JamilWaqas commented 7 years ago

Min and max can be calculated in one pass, but the algorithm requires eta before it does any calculation. For example, the optimal eta is equal to 2/(max(outcomes) - min(outcomes)). The predictions (gamma) at each step from time 1 to time t use this value. Eta is a fixed optimal value; if you change it at each step, the results may not be good at all.
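To illustrate the concern: with the formula quoted above, an eta recomputed from a running min/max drifts as the observed range widens, so early predictions would have used a different eta than later ones (a toy sketch with made-up outcomes, not a claim about how the paper sets eta):

```scala
object EtaDrift extends App {
  // Recompute eta = 2 / (max - min) from the outcomes seen so far,
  // per the formula quoted in this thread.
  val outcomes = Seq(0.4, 0.6, 0.1, 0.9, 0.5)
  outcomes
    .scanLeft((Double.PositiveInfinity, Double.NegativeInfinity)) {
      case ((mn, mx), y) => (math.min(mn, y), math.max(mx, y))
    }
    .drop(2) // need at least two outcomes for a nonzero range
    .foreach { case (mn, mx) =>
      println(f"min=$mn%.2f max=$mx%.2f eta=${2.0 / (mx - mn)}%.3f")
    }
}
```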

alirezarm commented 7 years ago

@waqasjamyl : You are well aware that making general assumptions about data that has not arrived yet can also lead to inconsistent or wrong results. For now, we should see whether we can obtain such values for our default use case.