dbosk / spores

SPORES: Stateless Probabilistic Onion Routing for E-Squads

The behavioral model cannot be inferred from the sequence of online devices #21

Closed by Adrien-Luxey 4 years ago

Adrien-Luxey commented 6 years ago

At least not using the usual HMM parameter inference algorithms, namely Baum-Welch.

We were hoping to infer an HMM of the user's behavior from the sequence of the devices' connections. This would have allowed each device to publish predictions of its state at t+i.

[Figure: a failed model fit]
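To make the goal concrete: had an HMM been fitted, a device could predict its state at t+i by filtering its observed online/offline sequence (forward pass) and then pushing the hidden-state belief i steps through the transition matrix. The following is only a sketch under stated assumptions: the two hidden states and the values of `A`, `B` and `pi` are made up for illustration and are not taken from SPORES; in the plan described above they would have come from Baum-Welch.

```python
def forward_filter(obs, A, B, pi):
    """Return P(hidden state at the last step | obs) using a normalized forward pass."""
    n = len(pi)
    belief = [pi[s] * B[s][obs[0]] for s in range(n)]
    total = sum(belief)
    belief = [b / total for b in belief]
    for o in obs[1:]:
        # Propagate one step through the transition matrix, then weight by the emission.
        predicted = [sum(belief[r] * A[r][s] for r in range(n)) for s in range(n)]
        belief = [predicted[s] * B[s][o] for s in range(n)]
        total = sum(belief)
        belief = [b / total for b in belief]
    return belief


def predict_online(obs, A, B, pi, i):
    """Return P(device is observed online at step t+i | observations up to t)."""
    n = len(pi)
    belief = forward_filter(obs, A, B, pi)
    for _ in range(i):
        belief = [sum(belief[r] * A[r][s] for r in range(n)) for s in range(n)]
    return sum(belief[s] * B[s][1] for s in range(n))


# Hypothetical parameters: hidden state 0 = "user away", 1 = "user nearby".
A = [[0.9, 0.1], [0.2, 0.8]]      # A[r][s] = P(next hidden state s | current r)
B = [[0.95, 0.05], [0.3, 0.7]]    # B[s][o], with o: 0 = offline, 1 = online
pi = [0.5, 0.5]                   # initial hidden-state distribution

p = predict_online([1, 1, 0, 1], A, B, pi, i=3)
```

As i grows, the prediction forgets the observations and tends to the chain's stationary online probability, so such predictions are only informative for short horizons.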

dbosk commented 6 years ago

On Mon 09 Apr 2018 15:42:22 GMT, Adrien Luxey wrote:

At least not using the usual HMM parameter inference algorithms, namely Baum-Welch.

We were hoping to infer an HMM of the user's behavior from the sequence of the devices' connections. This would have allowed each device to publish predictions of its state at t+i.

How is this different from Sprinkler? And why did it work there? Or didn't it?

  • Downside: We need to take the fact that devices know a model of the user's behavior as a hypothesis;

Sure, we can assume that. As we said during today's meeting: we can assume this, design our algorithm around it and once we have better predictions our system will perform better.

To simulate this, we can make the user behaviour available to the devices; each device can then simply look ahead in the data. This gives us the optimal performance, which we can then use as the baseline to evaluate against.
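A minimal sketch of that evaluation setup, under assumptions: the trace is a made-up list of online (1) / offline (0) states per time step, `oracle_predict` is the look-ahead predictor, and `persistence_predict` is only a stand-in for whatever real predictor ends up being used. The Brier score (mean squared error of the predicted probability) is one possible metric, not necessarily the one the project would choose.

```python
def oracle_predict(trace, t, i):
    """Perfect predictor: read the device's actual state at t+i from the recorded trace."""
    return 1.0 if trace[t + i] else 0.0


def persistence_predict(trace, t, i):
    """Naive stand-in predictor: assume the state at t+i equals the state at t."""
    return 1.0 if trace[t] else 0.0


def brier_score(predict, trace, i):
    """Mean squared error between the predicted probability and the actual state at t+i."""
    errors = [(predict(trace, t, i) - trace[t + i]) ** 2
              for t in range(len(trace) - i)]
    return sum(errors) / len(errors)


# Made-up trace for illustration only.
trace = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1]
oracle = brier_score(oracle_predict, trace, i=1)      # best achievable score
naive = brier_score(persistence_predict, trace, i=1)  # score of the stand-in predictor
```

The gap between `naive` and `oracle` is then the room for improvement that better predictions could close.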

  • Upside: We can publish the devices states at time t (now), and still consider that the user's behavior is fairly unknown.

I wouldn't dare to make this assumption. Just because one attempt at modelling the user behaviour from this data fails, doesn't mean the behaviour cannot be learned --- we're just using the wrong method.

Adrien-Luxey commented 6 years ago

How is this different from Sprinkler? And why did it work there? Or didn't it?

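Our previous model was just a plain Markov model, whose states are directly observed. A Hidden Markov Model (HMM) adds a second stochastic layer: a Markov chain over hidden states (here, the user's behavior), plus, for each hidden state, a probability distribution over the observations (here, the devices' connections).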

I wouldn't dare to make this assumption. Just because one attempt at modelling the user behaviour from this data fails, doesn't mean the behaviour cannot be learned --- we're just using the wrong method.

Only devices online at time t publish the probability of being online at time t+1. We could use only the prediction, but the fact it's in the DHT means that the device is online at t.

I think we will use this crappy inference for now, as long as it provides decent predictions.

dbosk commented 6 years ago

On Sat 14 Apr 2018 08:27:08 GMT, Adrien Luxey wrote:

I wouldn't dare to make this assumption. Just because one attempt at modelling the user behaviour from this data fails, doesn't mean the behaviour cannot be learned --- we're just using the wrong method.

Only devices online at time t publish the probability of being online at time t+1. We could use only the prediction, but the fact it's in the DHT means that the device is online at t.

With this approach I can just read in the DHT when a device is online and offline. That sounds like enough to build a statistical model of the device's behaviour, especially if the addresses are static.
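To illustrate the concern, here is a rough sketch of what such an observer could do. It assumes a hypothetical input format (one set of visible device addresses per time step, as if the DHT were polled periodically) and fits a two-state Markov chain per address by counting transitions; none of these names come from SPORES.

```python
from collections import Counter


def presence_timeline(snapshots, address):
    """Turn periodic DHT snapshots into a 0/1 online timeline for one address."""
    return [1 if address in snap else 0 for snap in snapshots]


def fit_markov_chain(timeline):
    """Estimate P(online at t+1 | state at t) by counting observed transitions."""
    counts = Counter(zip(timeline, timeline[1:]))
    probs = {}
    for prev in (0, 1):
        total = counts[(prev, 0)] + counts[(prev, 1)]
        if total:  # skip states that were never observed as a starting point
            probs[prev] = counts[(prev, 1)] / total
    return probs


# Made-up snapshots: which (static) addresses were present at each poll.
snapshots = [{"a", "b"}, {"a"}, {"a"}, {"b"}, {"a", "b"}, {"a"}]
model_a = fit_markov_chain(presence_timeline(snapshots, "a"))
model_b = fit_markov_chain(presence_timeline(snapshots, "b"))
```

With static addresses the timelines of one device can be joined across polls, which is exactly what makes this kind of profiling possible.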

dbosk commented 6 years ago

Does the claim in the title of this issue still hold? We cannot infer behaviour from the observations, or can we? It sounds like it's possible in the paper.

dbosk commented 6 years ago

Is this still an issue after changing to a not-hidden Markov model?

Adrien-Luxey commented 6 years ago

The issue still holds: after discussion, we understood that the sequence of the devices' connections was not enough to infer the HMM of the user's behavior. We would need more info (the user's location, or each device's probability of being online at each of the user's locations).

Our solution is to estimate a device's probability of staying online without the user's HMM. We will come back to that in the paper.
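The thread does not say how this estimate is computed. Purely as an assumption, one way to get such a probability without any user model is to use the device's own past session lengths: among sessions that lasted at least k steps, count the fraction that lasted longer. The function names and the sample timeline below are hypothetical.

```python
def session_lengths(timeline):
    """Lengths (in time steps) of the completed online sessions in a 0/1 timeline."""
    lengths, run = [], 0
    for state in timeline:
        if state:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    # A session still open at the end of the timeline is ignored (its length is unknown).
    return lengths


def p_stay_online(lengths, k):
    """Estimate P(session lasts more than k steps | it has already lasted k steps)."""
    at_least_k = sum(1 for n in lengths if n >= k)
    if at_least_k == 0:
        return None  # no past session was this long, so there is no estimate
    longer = sum(1 for n in lengths if n > k)
    return longer / at_least_k


# Made-up history of one device: 1 = online, 0 = offline.
history = [1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1]
lengths = session_lengths(history)
p_after_1 = p_stay_online(lengths, 1)
p_after_3 = p_stay_online(lengths, 3)
```

This only needs the device's own history, but it ignores time of day and any correlation between devices, which is the information the user's HMM was meant to capture.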