Streamline observations and define terms and scope

nkakouros commented 6 months ago

After each simulation step, the first item returned is the observations. A single observation currently contains the following keys:

dict_keys(['is_observable', 'observed_state', 'remaining_ttc', 'asset_type', 'asset_id', 'step_name', 'edges'])

I suggest we do the following:

Consider renaming some of the last 4 keys.
- First, we could use edge_indices instead of edges to make what is contained clearer.
- I also find step_name confusing. step_name can mean either the qualified step name or not. I think we should always use the distinction qualified/unqualified when talking about such a name.
- I think the unqualified name makes sense to us because we think in a MAL way (of asset types having steps). When we look at the attacked graph and the instance model, both assets and steps change meaning a bit; assets are instances of MAL assets, and steps are the nodes in the attack graph. And maybe this is how other people who work with attack graphs but not with MAL talk about these things. This is a broader discussion about standardizing names and what a term should unequivocally mean, so I will share some examples and thoughts. For example, we could talk about asset types for the MAL asset entities and assets or asset instances for the objects in the instance model; and attack steps for the steps in MAL specs and attack nodes or step nodes for the nodes of the attack graph.
Remove the remaining_ttc. This does not make sense to be disclosed to the agents, it is a simulation thing ("when should the state of an attack node be changed?").
Move the functionality to generate the asset_type, asset_id, step_name lists to the AttackGraph in the MAL toolbox. These lists are useful for any use of the attack graph to train models, not just for RL. They are also "about" the attack graph and for discoverability and conceptually they belong there.
Cache these lists. Since we do not do DynaMAL, we could have these lists be properties of the attack graph that are set during graph generation. If in the future we want dynamic stuff and we need to dynamically generate these lists again, we can easily do it in python with @property.
Document why stuff like asset_type, asset_id, etc. are passed to the agents through the observations and not e.g. at agent creation (via their __init__ method). It seems passing this info through the observations is common practice and makes re-using 3rd-party agents easier. There are workarounds, but maybe we should be conformant with the common practice.
Decide whether we want to be MAL specific or not. Following from the above, should we try to cater for projects/people that do not use MAL? I think not. For instance, if our agents take the graph structure info at creation time while 3rd-party ones do not, it's super fine to do that if we have good reason (e.g. performance). We anyway base the simulation on MAL specs and instance models.

kasanari commented 6 months ago

I agree with renaming edges to either edge_index or edge_indices. It is more in line with how pytorch geometric names it. It also allows us to add a 'edge_attr' field for when we wish to have features for edges.
Remaining TTC could be removed for now, but one might argue that it is relevant for an attacker agent if we are working in a fully observable scenario. I used to have a search-based attacker that used the remaining ttc as a path cost, for example.
Regarding static information in the obs: To me this is only an issue if memory usage becomes too high. I agree that caching heavy calculations on the backend would be good to save CPU cycles, but adding the asset types to the obs every iteration is not a costly operation from that perspective. Not including it only makes the system less 'pure' from an MDP or programming perspective, because all information that is being used to make decisions (and advance the system to the next state) is not included in the state/observation representation.

kasanari commented 6 months ago

For example: I am considering doing some offline RL. Then I would save a 'dataset' of observations that can then be used to train in a supervised manner. If the observations contains all the required info, then I can just save a list of observations in a text file and train on it without even importing the mal-toolbox.

andrewbwm commented 6 months ago

I categorically agree with everything you've said. I just require a clarification on one point.

* Cache these lists. Since we do not do DynaMAL, we could have these lists be properties of the attack graph that are set during graph generation. If in the future we want dynamic stuff and we need to dynamically generate these lists again, we can easily do it in python with `@property`.

What do you mean by this? Just adding the cache hint like we have for the observation space? Because the list is not regenerated, but I think you know this already.

nkakouros commented 4 months ago

What do you mean by this? Just adding the cache hint like we have for the observation space? Because the list is not regenerated, but I think you know this already.

I am not sure I remember. But given that these are already cached, moving these lists to the attack graph is the still relevant part of that comment of mine.

mal-lang / mal-simulator

Streamline observations and define terms and scope #14