goblint / analyzer

Static analysis framework for C
https://goblint.in.tum.de
MIT License
168 stars 72 forks source link

Rework system outputting warnings #201

Closed Wherekonshade closed 3 years ago

Wherekonshade commented 3 years ago

Categories should be introduced to create a better understanding of the warnings.

198 #199 #200

@vandah @edincitaku

michael-schwarz commented 3 years ago

For this you might want to also take a look at #55 that addresses the same issue

Goals:

  • usability: allow filtering
  • type-safety: avoid having different strings for the same thing
  • testing/traceability: e.g. change leads to less warnings for category x

Implementation:

  • list of tags? e.g. warn maybe [Undefined; Integer] "..."

open type for categories:

  • Behavior
    • Undefined
    • Implementation
    • Machine
  • Integer
    • Overflow
    • Div_by_zero
  • Cast
    • Type_mismatch (bug)
  • Race
  • Array
    • Out_of_bounds of int*int
michael-schwarz commented 3 years ago

Also relevant #36

Wherekonshade commented 3 years ago

There are some things open to discussion for the implementation of this issue. If you have any thoughts on the following please let us know :)

  1. The categories will be based on the a fore mentioned list for a start:
  1. The user should be able to enable/disable each of these categories individually

  2. Reports and warnings should be merged with this new approach.

  3. The new warning will show the category and also a "must" or "may" tag, which will be "may" by default.

5. If possible the warning will be constructed completely at the place of origin

vogler commented 3 years ago

Your formatting misses the structure from #55:

open type for categories:

  • Behavior

    • Undefined
    • Implementation
    • Machine
  • Integer

    • Overflow
    • Div_by_zero
  • Cast

    • Type_mismatch (bug)
  • Race
  • Array

    • Out_of_bounds of int*int

I assume we want to reflect the structure in the type (or some value) to be able to easily filter whole groups. There are open questions:

  1. do we want it flat and unqualified with polymorphic variants or do we want normal variants which might not be as convenient to write but are possibly safer?
  2. do we want the type definition in one place or distributed with extensible variants (does not work with polymorphic variants)?

Untested and not all combinations, but should give some idea:


Extensible variants: e.g. messages.ml:

type group = ..
let is_integer_error = function Integer _ -> true | _ -> false

e.g. intDomain.ml:

type integer_error = Overflow | Div_by_zero
type Messages.group += Integer of integer_error

e.g. base.ml:

open Messages
(* ... *)
warn `Must (Integer IntDomain.Div_by_zero) "..."

Normal variants all in one place: e.g. messages.ml:

type integer_error = Overflow | Div_by_zero
type group = Integer of integer_error (* | ... *)
let is_integer_error = function Integer _ -> true | _ -> false

e.g. base.ml:

open Messages
(* ... *)
warn `Must (Integer Div_by_zero) "..."

Flat polymorphic variants: e.g. messages.ml:

let is_integer_error = function `Overflow | `Div_by_zero -> true | _ -> false

e.g. base.ml:

(* ... *)
Messages.warn `Must `Div_by_zero "..."
sim642 commented 3 years ago

4. The new warning will show the category and also a "must" or "may" tag, which will be "may" by default.

As always, my worry with this is our fabulous PathSensitive2 functor. There has to be some kind of interaction between these must/may tags and multiple paths reaching the same location/node. For example, if a must warning is emitted while analyzing one path, but no warning is emitted while analyzing the other, then overall for that location/node it's actually a may warning. So the lack of a warning also matters for such aggregation. Or what is the intent on dealing with path-sensitivity?

Not sure if this is related to warnings or not, but our assertion result reporting has the same issue. Different results on different paths must meaningfully combine. For example, a must-success on one path and a must-failure on another path make up an unknown result.

5. If possible the warning will be constructed completely at the place of origin

What does this mean?

michael-schwarz commented 3 years ago

As always, my worry with this is our fabulous PathSensitive2 functor. There has to be some kind of interaction between these must/may tags and multiple paths reaching the same location/node. For example, if a must warning is emitted while analyzing one path, but no warning is emitted while analyzing the other, then overall for that location/node it's actually a may warning. So the lack of a warning also matters for such aggregation. Or what is the intent on dealing with path-sensitivity?

This is then compounded by the issue of different contexts as well. One possible way to think about this would be: A warning is a must warning if there is a must warning for all paths in all contexts that are deemed reachable.

Even this however does still not correspond to something that must happen in the concrete: Even in a case without path splitting and only one context, the entire program point may only be reachable due to overapproximation.

However, as a first approach I would suggest the semantics of a must warning to be that there exists at least one path and context where the problem must happen when reaching the program point it in the corresponding context and with a value described by the predecessor.

Wherekonshade commented 3 years ago

5. If possible the warning will be constructed completely at the place of origin

What does this mean?

  1. was just supposed to mean, that we want to add the category and the must or may tag as well as additional info for the output to the user all at the place where the warning happens instead of doing it centrally in messages.ml for example. Sorry for the imprecise formulation.
sim642 commented 3 years ago

I was looking deeper into Messages now for #297 and realized something which is not really addressed by #255. Namely, there's this warnings table: https://github.com/goblint/analyzer/blob/6759b40036fc36cd42a04cee685b5b97e937f8b4/src/util/messages.ml#L9 Warnings get added to this table for HTML and GobView output, in addition to being printed on the terminal: https://github.com/goblint/analyzer/blob/6759b40036fc36cd42a04cee685b5b97e937f8b4/src/util/messages.ml#L16-L18 To my surprise, the warnings table has support for grouped warnings, which are used for data races: https://github.com/goblint/analyzer/blob/6759b40036fc36cd42a04cee685b5b97e937f8b4/src/util/messages.ml#L77-L91 This works as follows in the HTML output: when you click on a line with a race warning, it lists other warnings in the same group there as well, to indicate which other lines its racing with.

What #255 does is just add an advanced mechanism for printing the warnings on the terminal, but none of that new metadata (certainty, nested category) is reflected in the warnings table for further usage. Therefore HTML, GobView or the Magpie integration (if it were to get the warnings properly as JSON/XML, not by regexing the output) wouldn't be able to use that metadata still. Moreover, since print_group is a level below the warn* functions, it completely lacks the support for any of the refactored stuff. So grouped warnings cannot have certainty or categories.

Tbh, print_group is a pretty niche feature: it's just used for data race warnings and all over OSEK. Nevertheless, to eventually have one warnings system, it needs to also support that use case.