goblint / analyzer

Static analysis framework for C
https://goblint.in.tum.de
MIT License
160 stars 72 forks source link

Dubious code duplication in `destabilize_vs` #1433

Open Red-Panda64 opened 2 months ago

Red-Panda64 commented 2 months ago

The destabilize_vs function is an almost verbatim copy of destabilize_normal, except that it is supposed to determine whether any called or start variables were destabilized. This is needed for the side_widen option "cycle".

(* Same as destabilize, but returns true if it destabilized a called var, or a var in vs which was stable. *)
let rec destabilize_vs x = (* TODO remove? Only used for side_widen cycle. *)
  if tracing then trace "sol2" "destabilize_vs %a" S.Var.pretty_trace x;
  let w = HM.find_default infl x VS.empty in
  HM.replace infl x VS.empty;
  VS.fold (fun y b ->
      let was_stable = HM.mem stable y in
      HM.remove stable y;
      HM.remove superstable y;
      HM.mem called y || destabilize_vs y || b || was_stable && List.mem_cmp S.Var.compare y vs
    ) w false

For comparison, here is the normally used destabilize_normal function:

let rec destabilize_normal x =
  if tracing then trace "sol2" "destabilize %a" S.Var.pretty_trace x;
  let w = HM.find_default infl x VS.empty in
  HM.replace infl x VS.empty;
  VS.iter (fun y ->
      if tracing then trace "sol2" "stable remove %a" S.Var.pretty_trace y;
      HM.remove stable y;
      HM.remove superstable y;
      Hooks.stable_remove y;
      if not (HM.mem called y) then destabilize_normal y
    ) w

It seems that destabilize_vs has a few peculiarities:

In a discussion with @michael-schwarz, we came to the conclusion that it is a bit dangerous to have this kind of code duplication, as it is likely that future updates in the destabilize variants will not be reflected in destabilize_vs. It might be best to drop destabilize_vs in favor of a more general destabilize function, which takes an optional callback to notify the caller about any destabilized variables. The caller could then perform the check done in destabilize_vs where it is needed.

sim642 commented 2 months ago

I think the reason we kept it so far is because there's a distinct check there

HM.mem called y || destabilize_vs y || b || was_stable && List.mem_cmp S.Var.compare y vs

The key question would be why that check is different and why that is.

Red-Panda64 commented 2 months ago

As I understand, the check is different because the side_widen: "cycle" strategy wants to know, whether the function destabilized any stable start variables or called variables.

HM.mem called y || destabilize_vs y

is effectively equivalent to

if not (HM.mem called y) then destabilize_vs y else true

So it behaves the same, except that it also returns this boolean. As mentioned in #1434, this boolean whether a start var etc. was destabilized could be computed in an optional callback action (including the remainder of the check || b || was_stable && List.mem_cmp S.Var.compare y vs)

michael-schwarz commented 2 months ago

I had another look, and not calling the Hook seems like it was clearly an oversight.

Not using destab_with_sides on the other hand seems odd at a first glance. However, it seems like destabilize_ref is only setting to destab_with_sides during restarting and destablize_vs is only used from side which only happens during the actual solve

destabilize_ref := destabilize_normal; (* always use normal destabilize during actual solve *)

Maybe it suffices to add the call to the Hook, add these observations as a remark and remove the usage of || got control flow as a more conservative solution than #1434.

What do you think @Red-Panda64 @sim642?

sim642 commented 2 months ago

Adding the hook call isn't controversial I think. I don't understand what's the issue with || though, they both recurse if not in called.

michael-schwarz commented 2 months ago

It's a clean code thing, I find the more explicit

if not (HM.mem called y) then destabilize_vs y else true

to be a lot more readable than the formulation with ||. I'd be surprised if there's any performance differences between both styles.

michael-schwarz commented 2 months ago

But it's also not a hill I'm willing to die on 😉

Red-Panda64 commented 2 months ago

I stand by the opinion that the duplicate code itself is problematic. Namely, that adjustments of either destabilize or destabilize_vs could accidentally not be applied to the other. This does happen, the missing Hook call proves this.

Of course, if you think think the cost of refactoring this outweighs the benefits, we can also just add the Hook and call it a day. As for the short-circuiting ||, I don't see a reason not to make this more explicit, if we are already making small adjustments to destabilize_vs. I think that short-circuiting with side-effects leads to a bit more confusing code whereas removing it should be uncontroversial.

sim642 commented 2 months ago

It's a clean code thing, I find the more explicit

if not (HM.mem called y) then destabilize_vs y else true

to be a lot more readable than the formulation with ||. I'd be surprised if there's any performance differences between both styles.

Ah, that seems reasonable.

I thought there might've been some subtle issue with the recursion not happening in some cases when it should've. For example, reordering the disjuncts could easily cause this (e.g. putting b before the recursive call). They are in the correct order right now, but an explicit if (with a comment indicating the intent) would avoid such issue being introduced in the future.

sim642 commented 2 months ago

As to the code duplication, destabilize_with_side is an even worse offender with very similar loops (only over different sets): https://github.com/goblint/analyzer/blob/06bc1e1e47f760df217ed45387743e82b9488d69/src/solver/td3.ml#L568-L605 Getting rid of the duplication is a bit of a separate issue and should probably cover all the instances then.