arup-group / gelato

Gelato turns MATSim outputs into transport sustainability KPI metrics
GNU General Public License v3.0

MATSim & the 24hr day #57

Open gac55 opened 8 months ago

gac55 commented 8 months ago

The non-24hr day in MATSim can break things, for example #24.

Often in MATSim we specify a simulated day as being longer than 24hrs.

There are generally two reasons for this:

  1. Activity plans that stretch across midnight. There are agents who work night shift and legitimately work or perform other activities across the 24hr time period. This is a real-world problem.
  2. Utility scoring for bad choices. The MATSim utility function adds the utility of doing things (activities) to the disutility of accessing them (travel). One challenge is that a bad decision may not become obvious until after midnight. If an agent walks, or waits for a bus that doesn't turn up, they may get stuck or simply not reach their destination for many hours. In that case there is a risk that at 24hrs the agent still hasn't realised how bad the plan is, so the plan is not scored negatively enough. We can handle this by extending the day to 28 or 30hrs, which increases the probability that the agent realises how late they are and scores the plan negatively enough that it is not kept or tried again. This is a modelling problem.

In 1, the question is whether these agents are significant enough for us to maintain their "actual" plans, and whether there is really a need to model them cutting across midnight. Often we can argue that they don't matter enough for our use cases and remove/ignore these parts of an agent's plan. If we are working on airport use cases, for example, we may not be able to make this argument, as these agents may be deemed important.

In 2, this is challenging and not easily solved.
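
The horizon effect described in point 2 can be illustrated with a toy score. This is a minimal sketch, not MATSim's actual scoring function: the linear per-hour rates, the `toy_score` helper and the example plan are all assumptions made up for illustration. It only shows that cutting the day at 24hrs hides part of a very late trip, so the same plan looks better than it does with a 30hr horizon.

```python
from typing import List, Tuple

# (kind, start_hour, end_hour); kind is "activity" or "travel".
Episode = Tuple[str, float, float]

# Hypothetical per-hour rates, chosen only to make the arithmetic easy.
ACTIVITY_UTILITY_PER_HOUR = 6.0
TRAVEL_DISUTILITY_PER_HOUR = -12.0


def toy_score(plan: List[Episode], horizon_hours: float) -> float:
    """Sum utility over the part of each episode that falls before the horizon."""
    score = 0.0
    for kind, start, end in plan:
        observed = max(0.0, min(end, horizon_hours) - start)
        rate = (
            ACTIVITY_UTILITY_PER_HOUR
            if kind == "activity"
            else TRAVEL_DISUTILITY_PER_HOUR
        )
        score += rate * observed
    return score


# Hypothetical plan: the bus home never turns up, so the agent arrives at 27:00.
bad_plan: List[Episode] = [
    ("activity", 0.0, 8.0),    # home
    ("travel", 8.0, 9.0),
    ("activity", 9.0, 17.0),   # work
    ("travel", 17.0, 27.0),    # very long trip home
    ("activity", 27.0, 30.0),  # home
]

score_24 = toy_score(bad_plan, horizon_hours=24.0)
score_30 = toy_score(bad_plan, horizon_hours=30.0)
print(score_24, score_30)
```

With these made-up rates the 30hr horizon gives the lower score, because the last three hours of the trip are only visible when the day is extended.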

One thing Gelato needs to do is support a variety of MATSim model inputs. We may see 24hr, 28hr and even 30hr days. How should we handle this?

gac55 commented 8 months ago

Wrapping time:

We can wrap 1am from day 2 into day 1. This argument leans on the fact that we are modelling an average day, so it is better to account for this next-day demand within our initial day.
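
As a minimal sketch of what wrapping could look like, assuming times are held as integer seconds since the start of the simulation: `wrap_time` and `hourly_counts` are hypothetical helpers written for this example, not part of Gelato or MATSim.

```python
from typing import Dict, List

SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_HOUR = 60 * 60


def wrap_time(seconds: int) -> int:
    """Fold a time in seconds since simulation start into a single 24hr day."""
    return seconds % SECONDS_PER_DAY


def hourly_counts(times: List[int], wrap: bool = True) -> Dict[int, int]:
    """Count events per hour bin, optionally wrapping times past 24hrs."""
    counts: Dict[int, int] = {}
    for t in times:
        if wrap:
            t = wrap_time(t)
        hour = t // SECONDS_PER_HOUR
        counts[hour] = counts.get(hour, 0) + 1
    return counts


# Hypothetical departure times: 08:00, 25:10 (01:10 on "day 2") and 01:00.
departures = [8 * 3600, 25 * 3600 + 600, 1 * 3600]
wrapped = hourly_counts(departures)                 # 25:10 is counted in hour 1
face_value = hourly_counts(departures, wrap=False)  # 25:10 keeps its own hour 25
print(wrapped, face_value)
```

Wrapping keeps the output at 24 hourly bins whatever the simulated day length, at the cost of mixing next-day demand into the early hours of the first day.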

KasiaKoz commented 8 months ago

1 - what I don't really get is: if people work over night, why do their plans not look like this:

  00:00:00 - 06:00:00 - work
  06:00:00 - 20:00:00 - home
  20:00:00 - 23:59:59 - work

why do they need to be going over 24:00:00? i.e. why can't we wrap their plan?

gac55 commented 8 months ago

> 1 - what I don't really get is: if people work over night, why do their plans not look like this:
>
>   00:00:00 - 06:00:00 - work
>   06:00:00 - 20:00:00 - home
>   20:00:00 - 23:59:59 - work
>
> why do they need to be going over 24:00:00? i.e. why can't we wrap their plan?

@panostsolerid @Theodore-Chatziioannou are best placed to give a view on this from an AcBM/plans perspective

Theodore-Chatziioannou commented 8 months ago

you are talking about output (post-MATSim) plans, right?

gac55 commented 8 months ago

> you are talking about output (post-MATSim) plans, right?

We are discussing both. In my mind, a very large divergence from initial plans can be a sign of poor calibration (but then, there are many scenarios where it may be legitimate).

KasiaKoz commented 8 months ago

I was talking about input plans

Theodore-Chatziioannou commented 8 months ago

In our input plans we tend to crop to 24 hours, but during the simulation some plans may overflow (for example, when you have stuck agents or a late trip that ended up being quite long). It doesn't necessarily break things; post-processing should have logic in place to handle this, so that you don't end up with missing trips/legs.
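
One piece of such post-processing logic is reading times whose hour field goes past 23. Python's `datetime.time` only accepts hours 0 to 23, so a value like `25:30:00` needs its own handling. The sketch below assumes times arrive as `HH:MM:SS` strings; `parse_sim_time` and `format_sim_time` are hypothetical helper names, not Gelato or MATSim functions.

```python
def parse_sim_time(text: str) -> int:
    """Convert an HH:MM:SS string to seconds, allowing hours of 24 or more."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_sim_time(total_seconds: int) -> str:
    """Convert seconds back to HH:MM:SS without capping the hour field."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


late_arrival = parse_sim_time("25:30:00")
print(late_arrival, format_sim_time(late_arrival))
```

Keeping times as plain seconds means a trip that ends after midnight is still read at face value rather than dropped by a stricter parser.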

KasiaKoz commented 8 months ago

ok, so this statement is not something we need to worry about:

  1. Activity plans that stretch across midnight. There are agents who work night shift and legitimately work or perform other activities across the 24hr time period. This is a real-world problem.

this real-world behaviour is already handled: everyone gets a plan that is 24hrs, and the problem is that the simulation may elongate some plans based on circumstances.

Theodore-Chatziioannou commented 8 months ago

usually not with our (CML) models, but of course it depends on how one decides to build the demand. (We have discussed having longer-than-24hr plans for specific scenarios, and we may try it at some point in the future.)

I haven't seen the Ile de France model though, so you may have to check its assumptions.

gac55 commented 8 months ago

My question was general, not just about our models or idf. I'd like us to support the wide range of possible inputs we'd get, if reasonable and practical.

That's why I was thinking about the fundamentals of the problem, not just the implementation. On balance, it feels like wrapping is just fine and not something to be too afraid of. We should gather more experience with more models in future work, and perhaps we will form a clearer view then.

gac55 commented 8 months ago

> ok, so this statement is not something we need to worry about:
>
> >   1. Activity plans that stretch across midnight. There are agents who work night shift and legitimately work or perform other activities across the 24hr time period. This is a real-world problem.
>
> this real-world behaviour is already handled: everyone gets a plan that is 24hrs, and the problem is that the simulation may elongate some plans based on circumstances.

Yes, so we simply need a section in the docs/README that explains time wrapping and advises care if you have plans that intentionally run beyond 24hrs.

Theodore-Chatziioannou commented 8 months ago

ok, if this is a more abstract question, I can see two use cases:

a) multi-day temporal scope, i.e. a simulation representing agents' activities across a week. In this case, a user would probably want to output results for each day separately (or have a day field in the outputs).

b) the model simply adds some buffer time at the end to allow agents to complete their plans (the case we usually have). Day-wrapping sounds fine here.
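
For the multi-day case, a day field can be derived from the same raw time instead of discarding it. A minimal sketch, assuming times are integer seconds since simulation start; `split_day` and `group_by_day` are hypothetical names used only here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

SECONDS_PER_DAY = 24 * 60 * 60


def split_day(seconds: int) -> Tuple[int, int]:
    """Return (day index, seconds into that day) for a time since simulation start."""
    return divmod(seconds, SECONDS_PER_DAY)


def group_by_day(times: List[int]) -> Dict[int, List[int]]:
    """Group times by day index, storing each as a time of day."""
    by_day: Dict[int, List[int]] = defaultdict(list)
    for t in times:
        day, time_of_day = split_day(t)
        by_day[day].append(time_of_day)
    return dict(by_day)


# Hypothetical departure times: 07:00 and 18:00 on day 0, 02:00 on day 1.
departures = [7 * 3600, 18 * 3600, 26 * 3600]
by_day = group_by_day(departures)
print(by_day)
```

Dropping the day index from this output gives the wrapped view of case b), so both use cases can share the same split.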

KasiaKoz commented 8 months ago

Could you explain a bit more why >24 hrs breaks things? Looking around I still don't really understand the problem.

> The non 24hr day in MATSim can break things, for example https://github.com/arup-group/gelato/issues/24.

That issue requests fewer columns in the output, for a sim that was configured to run for the number of hours that appear in the output. It didn't actually break anything (the issue mentions other problems, but they aren't related to 24hrs).

> post-processing should have logic in place that is able to handle this, so that you don't end up with missing trips/legs.

We don't have missing trips/legs; we read it all at face value.

steffenaxer commented 8 months ago

Imho it is mainly caused by the mobsim/qsim. The eventManager requires a monotonically increasing time. Otherwise we will see an "events are not in a proper order" error msg. That is why it is problematic to wrap on the input side.
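
The ordering constraint can be illustrated without MATSim at all. The sketch below is not the MATSim event manager; `is_monotonic` is a hypothetical check and the event times are made up. It only shows that a sequence of times running past midnight stops being non-decreasing once it is wrapped to 24hrs.

```python
from typing import List

SECONDS_PER_DAY = 24 * 60 * 60


def is_monotonic(times: List[int]) -> bool:
    """True if every time is greater than or equal to the one before it."""
    return all(earlier <= later for earlier, later in zip(times, times[1:]))


# Hypothetical event times: 23:00, 23:30, 24:15 and 25:00.
event_times = [23 * 3600, 23 * 3600 + 1800, 24 * 3600 + 900, 25 * 3600]
wrapped_times = [t % SECONDS_PER_DAY for t in event_times]

print(is_monotonic(event_times))    # raw simulation times stay ordered
print(is_monotonic(wrapped_times))  # wrapping sends 24:15 back to 00:15
```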

steffenaxer commented 8 months ago

Maybe one needs more complex time handling in MATSim, which could perhaps be exposed through an interface. Something like a mobsimTime and an analyticalTime for outputs and visualization. However, working with different times is a pain. I even have trouble when working with timezones 😰

On the one hand, one needs to ensure that the behaviour of the simulation remains as it is, with a monotonically increasing time. On the other hand, the output may carry an additional analytical time, which might be e.g. a wrapped 24h time.

steffenaxer commented 8 months ago

> Could you explain a bit more why >24 hrs breaks things? Looking around I still don't really understand the problem.
>
> > The non 24hr day in MATSim can break things, for example #24.
>
> That issue requests fewer columns in the output, for a sim that was configured to run for the number of hours that appear in the output. It didn't actually break anything (the issue mentions other problems, but they aren't related to 24hrs).
>
> > post-processing should have logic in place that is able to handle this, so that you don't end up with missing trips/legs.
>
> We don't have missing trips/legs; we read it all at face value.

I don't even understand why >24h breaks anything. In my eyes it is just strange for users that we might have 30h.