mlegas / FSM-reinforcement-learning

Deep RL framework for exploring state spaces of finite state machines.
2 stars 0 forks source link

Could't find the specific meaning ofS0,S1,S2,S3,S4.... #1

Closed zhengkemei closed 1 year ago

zhengkemei commented 1 year ago

Hello author, I am glad that you can share your results on github. Your paper is very good, but I encountered some problems in the process of reading: 1.What exactly do S0,S1,S2,S3,S4,S5 in 4.9.1 Pelican crossing refer to? 2.What exactly do S0,S1,S2,S3,S4,S5,S6,S7 in 4.9.2 Toaster device refer to? Looking forward to your reply. Thank you!

mlegas commented 1 year ago

Hi @zhengkemei,

Thank you for your opinion, I'm glad my work has found some further use.

To answer your questions:

  1. The states in my pelican crossing environment refer to the more advanced states of a pelican crossing (https://en.wikipedia.org/wiki/Pelican_crossing) made by Ben Lloyd-Roberts in his thesis: obraz

Therefore, S0, S1, S2, S3, S4, and S5 are abstracted states that are shown on the picture above.

Citing directly from his work:

_tl01 and _tl02 represent traffic lights responsible for vehicle signalling. Similarly _pl01 and _pl02 represent pedestrian lights responsible for controlling foot traffic. Finally, _pb01 and _pb02 represent physical buttons which issue a pedestrian crossing request when pressed. In order to model the functionality of this pelican crossing we must introduce a set of latches to use in our ladder logic program. To reduce the size of our example program, our pelican crossing model removes the transitional amber aspect from both traffic lights. Therefore each latch {_tl_01, tl_02, pl_01, pl02} have two possible states displaying green or red aspects, respectively. Given that both pedestrian buttons issue the same crossing request, we amalgamate inputs from both _pb_01, pb02 under the latch pressed. Pelican crossings also emit an audio signal to indicate safe crossing conditions for visual impaired pedestrians. Thus we introduce the latch audio.

req is a state of whether crossing the street has been requested.

The following image gives a visual representation of what do the latches mean: obraz

Reference: B. Lloyd-Roberts, “Applying reinforcement learning to railway interlocking verification,” Master’s thesis, Swansea University, Sept 2020.

Unfortunately, I am unable to share his thesis publicly in full. However, you might wish to contact Mr Ben Lloyd-Roberts directly for any further details via e-mail: ben.lloyd-roberts@swansea.ac.uk.

  1. Citing from J. Zander, I. Schieferdecker, and P. J. Mosterman, Model-based testing for embedded systems. CRC Press, 2012:

    The toaster can be started by pushing (push) down the latch. As a reaction, the heater is turned on (on). The toaster stops toasting (off) after a certain time (time) or after the stop button (stop) has been pressed. Furthermore, the toaster has two heating power levels, one of which can be selected by increasing (inc) or decreasing (dec) the heating temperature. The toaster also has a defrost function (defrost) that results in an additional defrosting time (time d) of frozen toast. Note that time is used in our modeling only in a qualitative way, that is, quantitative aspects of timing are not taken into account. This simple machine consists of two groups of states: s0 . . . s3 for regular heating and s4 . . . s7 for heating with increased heating power. From the first group of states, the machine accepts an increase of the heating level, which brings it into the appropriate high-power state; vice versa, from this state, it can be brought back by decreasing the heating level. Thus, in this machine only two heating levels are modeled. It is obvious how the model could be extended for three or more such levels. However, with a growing number of levels the diagram would quickly become illegible.

To add to that, S0 is a start state when the toaster is not heating anything in the regular heating mode (i.e. we just plugged it to power but have not pressed anything), and S6 is a state when the toaster is not heating anything in the increased heating power mode (i.e. we increased the power to the increased heating power mode, but the toaster is not heating anything - it is still in "standby", or rather "off" mode).

All the best to your research.