facebookresearch / CompilerGym

Reinforcement learning environments for compiler and program optimization tasks
https://compilergym.ai/
MIT License

Can we identify node_id between current observation and next observation w.r.t Programl? #780

Closed anthony0727 closed 1 year ago

anthony0727 commented 1 year ago

❓ Questions and Help

The observation space is Programl.

Let $G$ be the current observation and $G'$ the next observation after taking some action $a$.

Suppose $G$ was 'pruned' by $a$, so that $|G| > |G'|$.

Some nodes are gone, others remain.

Can we map remaining node ids to next observation's node ids?

Looking at the Programl-generating code, it seems we would need some postprocessing to match node ids. Could you give me a hint on how to do that?

Cheers! Anthony


ChrisCummins commented 1 year ago

Hey @anthony0727! Sorry for the slow reply.

> Can we map remaining node ids to next observation's node ids?

Currently the ProGraML graph builder isn't stateful: it takes a module as input, computes the graph, and then forgets it.

https://github.com/ChrisCummins/ProGraML/blob/development/programl/ir/llvm/internal/program_graph_builder.h#L69

To achieve what you want, you could consider making this graph builder stateful. That is, it would retain the mappings from LLVM object pointers (instructions, arguments, etc.) to the graph objects. Then, when called again on the same module, it could compare against the cache and compute a diff.

Probably not a trivial amount of effort to implement, and it will be NP-complete.
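To make the caching idea concrete, here is a minimal Python sketch of the bookkeeping only. It is not ProGraML code: `StatefulNodeIndex` and its `update` method are hypothetical names, the string keys stand in for LLVM object pointers, and it assumes that every key is unique and that node ids are simply assigned in traversal order. The real builder is C++ and would need the equivalent cache inside the builder itself.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple


class StatefulNodeIndex:
    """Hypothetical cache that remembers the node id each source object
    received on the previous graph build."""

    def __init__(self) -> None:
        # Maps a stable key (stand-in for an LLVM object pointer) to a node id.
        self._previous: Dict[Hashable, int] = {}

    def update(
        self, keys: Iterable[Hashable]
    ) -> Tuple[Dict[int, int], Set[int], Set[int]]:
        """Record a new build and return (remapped, removed, added).

        remapped: old node id -> new node id for objects present in both builds.
        removed:  old node ids whose objects no longer exist.
        added:    new node ids whose objects were not in the previous build.
        """
        # Assumption: node ids are assigned by position in traversal order.
        current = {key: node_id for node_id, key in enumerate(keys)}
        remapped = {
            old: current[key]
            for key, old in self._previous.items()
            if key in current
        }
        removed = {
            old for key, old in self._previous.items() if key not in current
        }
        added = {
            new for key, new in current.items() if key not in self._previous
        }
        self._previous = current
        return remapped, removed, added


# Usage: the second build drops "%2", so later nodes shift down by one id.
index = StatefulNodeIndex()
index.update(["%1", "%2", "%3", "%4"])
remapped, removed, added = index.update(["%1", "%3", "%4"])
print(remapped, removed, added)
```

This only works while the keys stay stable between builds. If an optimization pass replaces an object instead of mutating it, the old node shows up as removed and the new one as added, even when they are semantically the same.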

Cheers, Chris

anthony0727 commented 1 year ago

Right, thanks for the reply!