Different output when using a label or the underlying expression

jjguijt commented 5 years ago

When using storm-pomdp to create a pMC from a POMDP I get different results when I'm using a labelled expression than when I'm just using the expression. To the best of my knowledge, both outputs should be equal.

The label is defined as label "crash" = a=s1;

Command 1: storm-pomdp --prism shields-warehouse/storm-pomdp-model.prism --parametric-drn shields-warehouse/storm-pmdp-model.drn --prop "Pmax=? [ (! \"crash\") U a = 21 ]"

Storm-pomdp 1.3.1 (dev)

Date: Thu Jul 18 14:14:41 2019
Command line arguments: --prism shields-warehouse/storm-pomdp-model.prism --parametric-drn shields-warehouse/storm-pmdp-model.drn --prop 'Pmax=? [ (! "crash") U a = 21 ]'
Current working directory: /home/jeremy/study/internship

Time for model input parsing: 0.210s.

Time for model construction: 0.141s.

-------------------------------------------------------------- 
Model type:     POMDP (sparse)
States:     441
Transitions:    3481
Choices:    2541
Observations:   21
Reward Models:  none
State Labels:   4 labels
   * deadlock -> 0 item(s)
   * init -> 1 item(s)
   * (a = 21) -> 21 item(s)
   * crash -> 21 item(s)
Choice Labels:  none
-------------------------------------------------------------- 
Analyzing property 'Pmax=? [!("crash") U (a = 21)]'
Assumming memoryless schedulers.
Transforming memoryless POMDP to pMC... done.
-------------------------------------------------------------- 
Model type:     DTMC (sparse)
States:     441
Transitions:    2367
Reward Models:  none
State Labels:   4 labels
   * deadlock -> 0 item(s)
   * init -> 1 item(s)
   * (a = 21) -> 21 item(s)
   * crash -> 21 item(s)
Choice Labels:  none
-------------------------------------------------------------- 
Simplifying pMC... done.
-------------------------------------------------------------- 
Model type:     DTMC (sparse)
States:     402
Transitions:    2192
Reward Models:  none
State Labels:   3 labels
   * init -> 1 item(s)
   * (a = 21) -> 1 item(s)
   * crash -> 1 item(s)
Choice Labels:  none
-------------------------------------------------------------- 
Exporting pMC...Write to file shields-warehouse/storm-pmdp-model.drn.
 done.

Command 2: storm-pomdp --prism shields-warehouse/storm-pomdp-model.prism --parametric-drn shields-warehouse/storm-pmdp-model.drn --prop "Pmax=? [ (! (a = s1)) U a = 21 ]"

Storm-pomdp 1.3.1 (dev)

Date: Thu Jul 18 14:16:51 2019
Command line arguments: --prism shields-warehouse/storm-pomdp-model.prism --parametric-drn shields-warehouse/storm-pmdp-model.drn --prop 'Pmax=? [ (! (a = s1)) U a = 21 ]'
Current working directory: /home/jeremy/study/internship

Time for model input parsing: 0.206s.

Time for model construction: 0.134s.

-------------------------------------------------------------- 
Model type:     POMDP (sparse)
States:     441
Transitions:    3336
Choices:    2441
Observations:   21
Reward Models:  none
State Labels:   4 labels
   * deadlock -> 0 item(s)
   * !((a = s1)) -> 420 item(s)
   * init -> 1 item(s)
   * (a = 21) -> 21 item(s)
Choice Labels:  none
-------------------------------------------------------------- 
Analyzing property 'Pmax=? [!((a = s1)) U (a = 21)]'
Assumming memoryless schedulers.
Transforming memoryless POMDP to pMC...ERROR (ApplyFiniteSchedulerToPomdp.cpp:64): Number of choices must be equal for every state with same number of actions
storm-pomdp: /home/jeremy/study/internship/storm/src/storm-pomdp/transformer/ApplyFiniteSchedulerToPomdp.cpp:64: std::unordered_map<unsigned int, std::vector<carl::RationalFunction<carl::FactorizedPolynomial<carl::MultivariatePolynomial<cln::cl_RA> >, true> > > storm::transformer::ApplyFiniteSchedulerToPomdp<ValueType>::getObservationChoiceWeights(storm::transformer::PomdpFscApplicationMode) const [with ValueType = __gmp_expr<__mpq_struct [1], __mpq_struct [1]>]: Assertion `it == res.end() || it->second.size() == pomdp.getNumberOfChoices(state)' failed.
Aborted (core dumped)

You can find the POMDP file here.

sjunges commented 5 years ago

Hi,

thanks for the bug report. I can confirm that this is unexpected behaviour.

Let me quickly explain what happens: The first MDP is a bit larger, as the model builder is a bit dumber in eliminating actions during exploration. That is, because the labelling is handled only after model building. That is suboptimal, but far from trivial to prevent.

Now, the second MDP is smaller, but by this (partial) pruning violates the important assumption that states with the same observation have the same number of actions---I guess this happens in states from which the target is reached with probability zero anyways. This shortcoming needs to be fixed, but requires some additional information that is not readily present.

In conclusion, the second MDP is a bit smaller by doing some "smart" things, which are not handled well in combination with the later processing of the POMDP. Meanwhile, you may disable this preprocessing using --buildfull . That fixes the error in this instance.

Best, Sebastian

jjguijt commented 5 years ago

Thanks for the quick reply!

AlexBork commented 2 months ago

I had a deeper look at the issue in the current version. The model as provided is not a valid POMDP model for Storm as transitions without label are not allowed for states with the same observation if they originate from different commands in the Prism model. Thus, error

Observation 12: actions: {[0 (), 6] [1 (n), 1] [2 (s), 2] [3 (e), 3] [4 (w), 4] [5 (stop), 5]} according to state 0.
Observation 12: actions: {[0 (), 17] [1 (n), 1] [2 (s), 2] [3 (e), 3] [4 (w), 4] [5 (stop), 5]} according to state 3.
ERROR (MakePOMDPCanonic.cpp:284): Actions identifiers do not align between states 
    3
and
    0
both having observation 12. See output above for more information.

is thrown, indicating a modelling issue.

However, if I label the commands with a new name, the model fits our requirements and I can reproduce the difference described. In fact, command storm-pomdp --prism storm-pomdp-model.prism --parametric-drn storm-pomdp-model.drn --prop "Pmax=? [ (! (a = s1)) U a = 21 ]" with the modified model throws error

ERROR (MakePOMDPCanonic.cpp:197): Observation 20 sometimes provides multiple actions, but in state 57 provides only one action [[no-label]].

while command storm-pomdp --prism storm-pomdp-model.prism --parametric-drn storm-pomdp-model.drn --prop "Pmax=? [ (! \"crash\") U a = 21 ]" runs without error.

The workaround using --buildfullstill works.

sjunges commented 2 months ago

I dont understand the first error message (even though I wrote that code, I believe).

The latter problem seems to be with making states absorbing in the model construction, ignoring that this operation for POMDPs is more complicated?

AlexBork commented 2 months ago

For the first error message: we require that for transitions without label to occur in states with the same observation, they need to originate from the same Prism command. In the error message, [0 (), 6] for state 0indicates that the 0-th action of the state is an action labeled "" originating from command 6, whereas for state 3, [0 (), 17] indicates that the ""-action originates from command 17.

Generally, if we have an observable variable o, an unobservable s, commands [] s=1 -> ... and [] s=2 -> ..., and states s=1 & o=0 and s=2 & o=0, we throw an error because the unlabeled transitions are created by two different commands. I think this has something to do with ensuring correct/expected behavior in compositions, but don't quote me on that.

The error message is really not helpful, I have already expanded it a bit while debugging and will push that extension soon :)

The latter problem is indeed a problem with making states absorbing. I have constructed a small example where this already fails. I guess a fix is to add self-loops for all actions of the original state. I can have a look how feasible this is.

AlexBork commented 2 months ago

I have opened a new issue that focuses on the problem with absorbing states in the current version. I suggest continuing the discussion there and closing this issue.

moves-rwth / storm

Different output when using a label or the underlying expression #48