Open sim642 opened 2 years ago
One idea @vesalvojdani had was to simply compare their locations and nothing else (i.e. to ignore the bodies entirely). If an inline function is included into two files during preprocessing that are now being merged, that inline function came from the same location in the headers. Therefore if the locations are equal, we should reasonably assume that so are their bodies. The reasonable assumption of course being that the header files don't change between the preprocessing of different files.
Of course such heuristic might miss some pairs of inline functions that the printout approach currently merges, but I would assume such cases are very unlikely. Why would there be two identical implementation of an inline functions be in different locations in the headers? It's more reasonable to assume that most of the duplication is simply from the same header being preprocessed into everything.
In #72 we disabled the merging of inline functions. One of the reasons being a slowdown claimed by an old comment: https://github.com/goblint/cil/blob/c47bbbdb442b4d50dac5a0879404ab3b5592170a/src/mergecil.ml#L65-L68
It would be useful to figure out the root of that slowdown and possibly optimize it, because as it turns out, not merging inline functions can lead to large number of identical functions, which in themselves cause slowdown due to their sheer number.
Just by looking at the little code there is for them, this gem caught my eye: https://github.com/goblint/cil/blob/c47bbbdb442b4d50dac5a0879404ab3b5592170a/src/mergecil.ml#L1459-L1495 The merging happens based on the
Pretty.doc
printout of the function. I haven't profiled this, but that seems incredibly stupid. It's almost like comparing their string representations (which in fact might actually be faster since strings are continuous memory instead of thedoc
trees that are compared polymorphically).