Open nkakouros opened 6 months ago
`edge_index` or `edge_indices`. It is more in line with how PyTorch Geometric names it. It also allows us to add an `edge_attr` field for when we wish to have features for edges.

For example: I am considering doing some offline RL. Then I would save a 'dataset' of observations that can then be used to train in a supervised manner. If the observations contain all the required info, then I can just save a list of observations in a text file and train on it without even importing the mal-toolbox.
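The offline-dataset idea could be sketched roughly as follows. This is only an illustration under assumptions: the observations are taken to be plain dicts of lists, the example values are made up, and the helper names `save_observations` and `load_observations` are hypothetical, not part of the mal-toolbox or the simulator.

```python
import json
import os
import tempfile


def save_observations(observations, path):
    """Write one JSON-encoded observation per line (JSON Lines)."""
    with open(path, "w", encoding="utf-8") as fh:
        for obs in observations:
            fh.write(json.dumps(obs) + "\n")


def load_observations(path):
    """Read the observations back using only the standard library."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Made-up observations; the real observation schema may differ.
# edge_index uses the 2 x num_edges layout (sources, then targets)
# that PyTorch Geometric expects; edge_attr has one row per edge.
example_observations = [
    {"edge_index": [[0, 1], [1, 2]], "edge_attr": [[1.0], [0.5]]},
    {"edge_index": [[0], [2]], "edge_attr": [[0.25]]},
]

# Round trip through a temporary file to show nothing is lost.
with tempfile.TemporaryDirectory() as tmp_dir:
    dataset_path = os.path.join(tmp_dir, "observations.jsonl")
    save_observations(example_observations, dataset_path)
    restored = load_observations(dataset_path)
```

Because each observation carries the graph structure itself, the training script only needs `json` to rebuild the dataset.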
I categorically agree with everything you've said. I just require a clarification on one point.
* Cache these lists. Since we do not do DynaMAL, we could have these lists be properties of the attack graph that are set during graph generation. If in the future we want dynamic stuff and we need to dynamically generate these lists again, we can easily do it in Python with `@property`.
What do you mean by this? Just adding the cache hint like we have for the observation space? Because the list is not regenerated, but I think you know this already.
I am not sure I remember. But given that these lists are already cached, the part of that comment of mine that is still relevant is moving them to the attack graph.
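To make the "lists live on the attack graph and are cached" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `Node`, `AttackGraphSketch`, and `add_node` are hypothetical names, and the real `AttackGraph` class in the MAL toolbox may be structured quite differently. It uses `functools.cached_property` so each list is computed once, with an explicit invalidation hook in case graphs become dynamic later.

```python
from dataclasses import dataclass
from functools import cached_property


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for an attack graph node."""
    asset_type: str
    asset_id: int
    step_name: str


class AttackGraphSketch:
    """Hypothetical graph that owns the lookup lists."""

    _CACHED = ("asset_types", "asset_ids", "step_names")

    def __init__(self, nodes):
        self.nodes = list(nodes)

    @cached_property
    def asset_types(self):
        # Computed on first access, then stored on the instance.
        return sorted({node.asset_type for node in self.nodes})

    @cached_property
    def asset_ids(self):
        return sorted({node.asset_id for node in self.nodes})

    @cached_property
    def step_names(self):
        return sorted({node.step_name for node in self.nodes})

    def add_node(self, node):
        # Only needed for dynamic graphs: drop the cached lists so
        # they are rebuilt on the next access.
        self.nodes.append(node)
        for name in self._CACHED:
            self.__dict__.pop(name, None)


graph = AttackGraphSketch([
    Node("Host", 0, "access"),
    Node("Host", 0, "compromise"),
    Node("Network", 1, "access"),
])
```

With a static graph the invalidation path is never exercised, so the lists behave like plain attributes set at generation time.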
After each simulation step, the first item returned is the observations. A single observation currently contains the following keys:
I suggest we do the following:
* Consider renaming some of the last 4 keys.
  * `edge_indices` instead of `edges` to make what is contained clearer.
  * `step_name` is confusing. `step_name` can mean either the qualified step name or not. I think we should always use the distinction qualified/unqualified when talking about such a name.
  * `assets` and `steps` change meaning a bit; `assets` are instances of MAL assets, and `steps` are the nodes in the attack graph. And maybe this is how other people who work with attack graphs but not with MAL talk about these things. This is a broader discussion about standardizing names and what a term should unequivocally mean, so I will share some examples and thoughts. For example, we could talk about `asset types` for the MAL asset entities and assets or `asset instances` for the objects in the instance model; and `attack steps` for the steps in MAL specs and `attack nodes` or `step nodes` for the nodes of the attack graph.
* Remove the `remaining_ttc`. This does not make sense to be disclosed to the agents, it is a simulation thing ("when should the state of an attack node be changed?").
* Move the functionality to generate the `asset_type`, `asset_id`, `step_name` lists to the `AttackGraph` in the MAL toolbox. These lists are useful for any use of the attack graph to train models, not just for RL. They are also "about" the attack graph and for discoverability and conceptually they belong there.
* Cache these lists. Since we do not do DynaMAL, we could have these lists be properties of the attack graph that are set during graph generation. If in the future we want dynamic stuff and we need to dynamically generate these lists again, we can easily do it in Python with `@property`.
* Document why stuff like `asset_type`, `asset_id`, etc. are passed to the agents through the observations and not e.g. at agent creation (via their `__init__` method). It seems passing this info through the observations is common practice and makes re-using 3rd-party agents easier. There are workarounds, but maybe we should be conformant with the common practice.
* Decide whether we want to be MAL specific or not. Following from the above, should we try to cater for projects/people that do not use MAL? I think not. For instance, if our agents take the graph structure info at creation time while 3rd-party ones do not, it's super fine to do that if we have good reason (e.g. performance). We anyway base the simulation on MAL specs and instance models.
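As a rough illustration of two of the suggestions above (renaming `edges` to `edge_indices` and not disclosing `remaining_ttc` to agents), the change could look something like the sketch below. The function name `to_agent_observation` and the example values are hypothetical, and the assumption that an observation is a flat dict with these keys is mine; the simulator's actual observation layout may differ.

```python
def to_agent_observation(raw_observation):
    """Return a copy of the observation as an agent would see it.

    Assumes a flat dict; the input is left unmodified.
    """
    # remaining_ttc is simulator bookkeeping, so it is not passed on.
    agent_observation = {
        key: value
        for key, value in raw_observation.items()
        if key != "remaining_ttc"
    }
    # Rename 'edges' so the key says what it actually contains.
    if "edges" in agent_observation:
        agent_observation["edge_indices"] = agent_observation.pop("edges")
    return agent_observation


# Made-up observation using the keys named in this issue.
raw_observation = {
    "asset_type": [0, 0, 1],
    "asset_id": [0, 0, 1],
    "step_name": [0, 1, 0],
    "edges": [[0, 1], [1, 2]],
    "remaining_ttc": [0, 3, 5],
}

agent_observation = to_agent_observation(raw_observation)
```

Keeping the filtering in one place like this would also make it easy to document which keys are simulator-internal and which are part of the agent-facing contract.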
method). It seems passing this info through the observations is common practice and makes re-using 3rd-party agents easier. There are workarounds, but maybe we should be conformant with the common practice.Decide whether we want to be MAL specific or not. Following from the above, should we try to cater for projects/people that do not use MAL? I think not. For instance, if our agents take the graph structure info at creation time while 3rd-party ones do not, it's super fine to do that if we have good reason (e.g. performance). We anyway base the simulation on MAL specs and instance models.