Use COO format for edge index

mal-lang / mal-simulator

Apache License 2.0

2 stars 1 forks source link

Use COO format for edge index #16

Open nkakouros opened 7 months ago

nkakouros commented 7 months ago

Currently, the edge indices as provided by observations['edges'] are a list of tuples with each tuple representing a single edge. However, ML frameworks like pytorch and tensorflow expect the edge indices in COO (coordinate) format. This format is a list of 2 items. Each item is a list containing the starting node's index for each edge. The other list contains the target node. For example:

[
    [0, 0, 1, 2]
    [1, 2, 2, 3]
]

for a graph with 4 nodes and 4 edges (0-1, 0-2, 1-2, 2-3).

This PR converts the observations['edges'] list to the coo format.

kasanari commented 7 months ago

Why not just call numpy.transpose? That is what I do in the wrapper (or model, i forget) to make the edges compatible with pyg. I have nothing against moving it into the obs func, though. Having the list of tuples is IMO a bit easier to iterate over before converting to numpy.

nkakouros commented 7 months ago

Having the list of tuples is IMO a bit easier to iterate over before converting to numpy.

When do you need to iterate over? Also, isn't the edge list in the observations returned by the simulator already a numpy array?

Why not just call numpy.transpose?

That's what I do, I think it should not be needed though. Eventually, the edge list should be accessible via the AttackGraph itself, it is about the graph anyway. There, the edge list could be stored as a list of tuples and we could also have some helper method/property to read it in coo format, or the other way around. But this here is the simulator that is supposed to be used with ML frameworks; I don't think we should make it more complicated than what is needed to work with it.

kasanari commented 7 months ago

When do you need to iterate over?

It happens once in the changed code. Where instead of "for edge in edges" it iterates over a range of indices. Does not really matter but I find the old version cleaner.

kasanari commented 7 months ago

I illustrate how this can be done with np.transpose here

nkakouros commented 7 months ago

I illustrate how this can be done with np.transpose here

Nice, that's simpler. Although I still believe the simulator should expose things like edge indices in the "expected" format, this is a low-priority issue and can be better implemented. I am leaving open as a reminder for discussion.