rescript-association / reanalyze

Experimental analyses for ReScript and OCaml: globally dead values/types, exception analysis, and termination analysis.
MIT License
277 stars 20 forks source link

When a function is annotated `@dead` values inside it are reported dead. #156

Closed sim642 closed 2 years ago

sim642 commented 2 years ago

Running reanalyze https://github.com/rescript-association/reanalyze/commit/389dd682223201b42c33e5c444e5c79e0805adad on Goblint https://github.com/goblint/analyzer/commit/a544002114324a207df72f9e79eb656c3ebe7b5b revealed the following.

Going through the following steps:

  1. dune build @check @all to get cmt files.

  2. reanalyze.exe -write -dce-cmt _build/default/ prints:

    Analysis reported 512 issues (Warning Dead Module:20, Warning Dead Type:31, Warning Dead Value:370, Warning Dead Value With Side Effects:13, Warning Redundant Optional Argument:32, Warning Unused Argument:46)
  3. dune build @check @all to get cmt files with written attributes.

  4. reanalyze.exe -write -dce-cmt _build/default/ prints:

    Analysis reported 164 issues (Warning Dead Module:8, Warning Dead Type:1, Warning Dead Value:30, Warning Dead Value With Side Effects:47, Warning Redundant Optional Argument:32, Warning Unused Argument:46)

Even though the first analysis adds dead code attributes, the second analysis just keeps on adding more. For example:

  1. At the code from #155, a duplicate attribute is added:
    include (Batteries : module type of Batteries with module Format := Batteries.Format [@dead "Ana.result.Error"]  [@dead "Ana.result.Error"] )
  2. At most other places, some definitions inside an already dead-attributed definition are marked additionally. For example, here the attribute is added to oc, despite it being contained in print_to_file, which has the attribute from the first run:
    let print_to_file (fileName: string) (fileAST: file) =
      let oc = Stdlib.open_out fileName [@@dead "+oc"]  in
      dumpFile defaultCilPrinter oc fileName fileAST [@@dead "+print_to_file"]

    Everything inside dead-attributed definitions should be assumed dead by default, without getting additional warnings.

cristianoc commented 2 years ago

If the annotations added are incorrect, there's no way this is going to be idempotent. So this seems another instance of that other issue with adding annotations. If both: 1 the annotations are added correctly, and 2 the result is not idempotent. Then it would be interesting to see a small repro example. Otherwise, this can be closed in favour of the other existing issue.

sim642 commented 2 years ago

The case of #155 is the odd one out here. I think all the other cases were such that if an unused function was correctly annotated with dead by reanalyze, then on the second run it finds some variable inside that function as being dead. Which of course it also is, but it would be very inconvenient to add dead annotations to everything inside a dead-annotated function.

cristianoc commented 2 years ago

There's some logic that if a function is reported dead, then the values inside it are not reported. Technically, if the function is annotated @dead then it is not reported, so the logic does not apply. So this is indeed a violation of idempotency.

I thought this was done for a reason, but not sure might be bad memory.

As an aside, in interactive use, you would annotate the function as @live, which does not give this problem.

cristianoc commented 2 years ago

I think I now remember that idempotency is tested by running a ppx which deletes the entire function when annotated dead, so that's the sense in which it is idempotent. The values inside won't be reported as they disappear.

So in this sense, the process makes sense if one proceeds by either:

I think this process is reasonable, so I'd be inclined to consider this as a non-issue.

cristianoc commented 2 years ago

So I would say that the issue is about documenting what the intended process is, rather than the current behaviour.

cristianoc commented 2 years ago

From the readme:

The main difference between @dead and @live is the transitive behaviour: @dead values don't keep alive values they use, while @live values do.

sim642 commented 2 years ago

If the intended process only involves using @live or deleting code (so no annotation is necessary at all), then the @dead annotation shouldn't be used at all (other than the output of -write I suppose). But the README also clearly describes usage for @dead:

@dead suppresses reporting on the value/type, but can also be used to force the analysis to consider a value as dead. Typically used to acknowledge cases of dead code you are not planning to address right now, but can be searched easily later.

According to that, there is legitimate reason for keeping @dead annotations around. Blindly changing them to @live just to avoid the non-idempotency issue creates further semantic confusion as the use for @live is described for FFI, etc reasons.

I understand the difference of the two annotations as to their transitive behavior, but there's a subtlety in the phrase "values they use". For example, in the snippet from above:

let print_to_file (fileName: string) (fileAST: file) =
  let oc = Stdlib.open_out fileName [@@dead "+oc"]  in
  dumpFile defaultCilPrinter oc fileName fileAST [@@dead "+print_to_file"]

It's about a dead function print_to_file containing something dead (oc). But the inner definition is dead by virtue of the outer one, so the annotation on it is excessive. This is why I think the dead code analysis should consider the situation of containment specially/differently. Of course the situation would be different if oc was outside print_to_file and in that case I agree with the @dead vs @live distinction regarding transitivity of usage.

Or if in such dead containment case it is really expected to use the @dead annotation on both (or even more if there are more inner definitions), then the idempotent/consistent behavior would be for -write to add annotations to both on the first run when neither has any annotation.

cristianoc commented 2 years ago

It's pretty tempting to remove the entire concept of @dead annotation and just suggest to comment-out the code.

Except, one loses the fact that annotated code still compiles and does not code rot while refactoring. Plus, the ppx has been used as a form of automatic dead code elimination when vendoring libraries or other code.

cristianoc commented 2 years ago

for -write to add annotations to both on the first run when neither has any annotation.

That would be pretty noisy.

Looks like proper idempotency is the way to go. It's only a minor code complication where dead-annotated declarations are not filtered out early on, but survive until it's time to report things. And they won't be reported but be used to silence inner reports.

cristianoc commented 2 years ago

This should take care of it: https://github.com/rescript-association/reanalyze/pull/160 There's not really a test for it, as tests run under the ppx, but seems to work.