ClangBuiltLinux / thread-safety-analysis

A research project into applying Clang's Thread Safety Analysis to the Linux Kernel
Other
6 stars 0 forks source link

replace `__acquires(x) __releases(x)` sparse annotations by `__must_hold(x)` #114

Open bulwahn opened 5 years ago

bulwahn commented 5 years ago

During my analysis, I noticed that often __acquires(x) __releases(x) annotations appear in the kernel code, where one could simply use __must_hold(x).

@himanshujha199640 Can you confirm that finding as well? Task: To get some of the created annotations upstream, replace all those occurrences to then eventually allow to use them for clang thread safety analysis: sparse __acquires, __releases and __must_hold could then be used directly for clang annotations and our additional annotations could also be used by sparse.

himanshujha199640 commented 5 years ago

During my analysis, I noticed that often __acquires(x) __releases(x) annotations appear in the kernel code, where one could simply use __must_hold(x).

@himanshujha199640 Can you confirm that finding as well?

Yes, I saw them but most of them were for different locks. Like __acquires(foo) and __releases(bar). But keep in mind that sparse annotation barely cares about its argument

Task: To get some of the created annotations upstream, replace all those occurrences to then eventually allow to use them for clang thread safety analysis: sparse __acquires, __releases and __must_hold could then be used directly for clang annotations

Upstream as in sending a patch to LKML ?

Could you please elaborate ?

and our additional annotations could also be used by sparse.

By sparse ? How ?

bulwahn commented 5 years ago

We can send patches upstream where we replace __releases(x) __acquires(x) with __must_hold(x). A quick grep, with grep -B 2 -A 2 "__releases(" . -R | grep "__acquires(" on the linux source suggest that there could be around 150 cases or so. From my first look, I guess about 50% are with the argument and could be replaced by __must_hold.

We could try writing a coccinelle script for replacing that or extend sparse to warn in those cases etc.

bulwahn commented 5 years ago

@himanshujha199640 what do you think? would you like to work on that? Otherwise, I will put it on my todo list.

himanshujha199640 commented 5 years ago

We can send patches upstream where we replace __releases(x) __acquires(x) with __must_hold(x). A quick grep, with grep -B 2 -A 2 "__releases(" . -R | grep "__acquires(" on the linux source suggest that there could be around 150 cases or so. From my first look, I guess about 50% are with the argument and could be replaced by __must_hold.

We could try writing a coccinelle script for replacing that or extend sparse to warn in those cases etc.

Sure. I remember @JuliaLawall's reply that it would be difficult for spatch to match these annotations since they appear after function signature. But I will try to test and see how it works.

JuliaLawall commented 5 years ago

On Sun, 23 Jun 2019, Himanshu Jha wrote:

  We can send patches upstream where we replace __releases(x)
  __acquires(x) with __must_hold(x).
  A quick grep, with grep -B 2 -A 2 "__releases(" . -R | grep
  "__acquires(" on the linux source suggest that there could be
  around 150 cases or so. From my first look, I guess about 50%
  are with the argument and could be replaced by __must_hold.

  We could try writing a coccinelle script for replacing that or
  extend sparse to warn in those cases etc.

Sure. I remember @JuliaLawall's reply that it would be difficult for spatch to match these annotations since they appear after function signature. But I will try to test and see how it works.

Maybe they will come out in the comments? You can try putting @c after the final ) of the function header and see what comes in the list of comments after the matched token.

julia

himanshujha199640 commented 5 years ago

Built fresh spatch from source and here is what I tried:

himanshu@himanshu-Vostro-3559:~/clang-thread-safety-analysis$ cat foo.c; cat foo.cocci; spatch --sp-file foo.cocci foo.c
static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep)
         __releases(&cgroup_rstat_lock) __acquires(&cgroup_rstat_lock)
{
    return 0;
}
@r1@
identifier fn;
comments c1;
@@

fn (...)@c1
{
  ...
}

@script:python@
p1 << r1.fn;
c1 << r1.c1;
@@

print p1
print("c1a: ", c1[0].after)
init_defs_builtins: /usr/local/bin/../lib/coccinelle/standard.h
HANDLING: foo.c
cgroup_rstat_flush_locked
('c1a: ', '__releases\n(\n&\ncgroup_rstat_lock\n)\n__acquires\n(\n&\ncgroup_rstat_lock\n)')

Actually, we need to replace we want to do:

- int foo () __releases(x) __acquires(x) {}
+ int foo () __must_hold(x) {}

Doesn't look like comments feature can be helpful here. https://github.com/coccinelle/coccinelle/blob/master/demos/comments.cocci

JuliaLawall commented 5 years ago

On Sun, 23 Jun 2019, Himanshu Jha wrote:

Built fresh spatch from source and here is what I tried:

himanshu@himanshu-Vostro-3559:~/clang-thread-safety-analysis$ cat foo.c; cat foo.cocci; spatch --sp-file foo.cocci foo.c static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep) releases(&cgroup_rstat_lock) acquires(&cgroup_rstat_lock) { return 0; } @r1@ identifier fn; comments c1; @@

fn (...)@c1 { ... }

@script:python@ p1 << r1.fn; c1 << r1.c1; @@

print p1 print("c1a: ", c1[0].after) init_defs_builtins: /usr/local/bin/../lib/coccinelle/standard.h HANDLING: foo.c cgroup_rstat_flush_locked ('c1a: ', 'releases\n(\n&\ncgroup_rstat_lock\n)\nacquires\n(\n&\ncgroup_ rstat_lock\n)')

Actually, we need to replace we want to do:

  • int foo () releases(x) acquires(x) {}
  • int foo () __must_hold(x) {}

Doesn't look like comments feature can be helpful here. https://github.com/coccinelle/coccinelle/blob/master/demos/comments.cocci

I think it is possible. I will try.

julia

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute thethread.[AAD2ZGTYJ7MOF7J7SCFDHW3P35LFJA5CNFSM4H2R3M2KYY3PNVWWK3TUL52HS4DFVRE XG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYK4ABQ.gif]

JuliaLawall commented 5 years ago

The following is less than perfect, because there seems to be no way to add the new annotations in the right place. But perhaps it is helpful.

The parsing of the annotations is also quite fragile. Let me know of problems that arise.

julia

Input file:

int foo1 () releases(x) acquires(x) {}

int foo2 () acquires(x) releases(x) {}

int foo3 () releases(x1) acquires(x) {}

int foo4 () releases(x) releases(y) __acquires(x) {}


Output file:

int foo1 () { annot(__must_hold(x));}

int foo2 () acquires(x) releases(x) {}

int foo3 () releases(x1) acquires(x) {}

int foo4 () { annot(__must_hold(x), __releases(y));}


Semantic patch:

@r@ comments c; identifier f; position p; @@

f@p(...)@c { ... }

@script:ocaml annot@ c << r.c; res; @@

let (bef,mid,aft) = List.hd c in let aft = String.concat "" aft in let pieces = Str.split (Str.regexp ")") aft in let pieces = List.map (fun x -> Str.bounded_split (Str.regexp "(") x 2) pieces in let rec loop i released acquired others = function [] -> if acquired = [] then Coccilib.include_match false else let released = List.map (fun (i,e) -> (i,Printf.sprintf "releases(%s)" e)) released in let acquired = List.map (fun (i,e) -> (i,Printf.sprintf "must_hold(%s)" e)) acquired in let others = List.map (fun (i,e) -> (i,(String.concat "(" e)^")")) others in let all = List.map snd (List.sort compare (List.append released (List.append acquired others))) in res := makeident (Printf.sprintf "annot(%s)" (String.concat ", " all)) | (["releases";x] as a)::rest -> loop (i+1) ((i,x)::released) acquired others rest | (["acquires";x] as a)::rest -> let (same,diff) = List.partition (function (,r) -> x = r) released in (match same with [] -> loop (i+1) released acquired ((i,a)::others) rest | [(i1,)] -> loop (i+1) diff ((i1,x)::acquired) others rest | -> failwith "expect only one requires and release per variable") | a::rest -> loop (i+1) released acquired ((i,a)::others) rest in loop 0 [] [] [] pieces

@@ position r.p; identifier r.f; identifier annot.res; @@

f@p(...

himanshujha199640 commented 5 years ago

Thanks for the script.

Yes, I see the placement of new annotation is difficult but it is very important for us. I also understand the purpose of adding a semicolon after __must_hold but that is wrong.

The order of __acquires or __releases is not important but the argument should be the same which the current script does handle. Although, I don't understand Ocaml but ordering can be handled by just tweaking the script at:

| (["__releases";x] as a)::rest ->
      loop (i+1) ((i,x)::released) acquired others rest
  | (["__acquires";x] as a)::rest ->

I'll try something else and update.

JuliaLawall commented 5 years ago

On Sun, 23 Jun 2019, Himanshu Jha wrote:

Thanks for the script.

Yes, I see the placement of new annotation is difficult but it is very important for us. I also understand the purpose of adding a semicolon after __must_hold but that is wrong.

Not sure to understand. I don't add a semicolon after __must_hold. I put all the annotations in an argument list of a call at the top of the function.

Is it impossible to just adjust them by hand? Or can you see some format that would be easy to handle with sed?

Adding things is easier than matching and removing them, so it should be possible to come up with a better solution that involves only Coccinelle. But not instantly. Maybe by Tuesday.

The order of acquires or releases is not important but the argument should be the same which the current script does handle. Although, I don't understand Ocaml but ordering can be handled by just tweaking the script at:

| (["releases";x] as a)::rest -> loop (i+1) ((i,x)::released) acquired others rest | (["acquires";x] as a)::rest ->

OK, I worked really hard to take the order into account. But that can be fixed.

julia

I'll try something else and update.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.[AAD2ZGRCZENR2CNMPZDHQ3LP35ZELA5CNFSM4H2R3M2KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYK6F5Y.gif]

JuliaLawall commented 5 years ago

New version attached.

julia @r@ comments c; identifier f; position p; @@

f@p(...)@c { ... }

@script:ocaml annot@ c << r.c; res; @@

let (bef,mid,aft) = List.hd c in let aft = String.concat "" aft in let pieces = Str.split (Str.regexp ")") aft in let pieces = List.map (fun x -> Str.boundedsplit (Str.regexp "(") x 2) pieces in let rec loop i released acquired others = function [] -> let rec iloop acq both = function [] -> (acq,[],both) | (i,x)::rest -> let (same,diff) = List.partition (fun (i1,x1) -> x = x1) acq in match same with [] -> let (a,r,b) = iloop acq both rest in (a,(i,x)::r,b) | [(i1,x1)] -> iloop diff ((min i i1, x)::both) rest | -> failwith "unexpected duplicate" in let (acquired,released,both) = iloop acquired [] released in if both = [] then Coccilib.include_match false else let released = List.map (fun (i,e) -> (i,Printf.sprintf "releases(%s)" e)) released in let acquired = List.map (fun (i,e) -> (i,Printf.sprintf "acquires(%s)" e)) acquired in let both = List.map (fun (i,e) -> (i,Printf.sprintf "must_hold(%s)" e)) both in let others = List.map (fun (i,e) -> (i,(String.concat "(" e)^")")) others in let all = List.map snd (List.sort compare (List.append released (List.append both (List.append acquired others)))) in res := make_ident (Printf.sprintf "annot(%s)" (String.concat ", " all)) | (["releases";x] as a)::rest -> loop (i+1) ((i,x)::released) acquired others rest | (["__acquires";x] as a)::rest -> loop (i+1) released ((i,x)::acquired) others rest | a::rest -> Printf.eprintf "here?\n"; loop (i+1) released acquired ((i,a)::others) rest in loop 0 [] [] [] pieces

@@ position r.p; identifier r.f; identifier annot.res; @@

f@p(...

int foo1 () { annot(__must_hold(x));}

int foo2 () { annot(__must_hold(x));}

int foo3 () releases(x1) acquires(x) {}

int foo4 () { annot(__must_hold(x), __releases(y));}

int foo5 () { annot(__must_hold(x), __acquires(y));}

int foo6 () { annot(__must_hold(x), __must_hold(y));}

himanshujha199640 commented 5 years ago

Not sure to understand. I don't add a semicolon after __must_hold. I put all the annotations in an argument list of a call at the top of the function.

-static void cgroup_rstat_flush_locked(struct cgroup cgrp, bool may_sleep) -acquires(&cgroup_rstat_lock) releases(&cgroup_rstat_lock) -{ +static void cgroup_rstat_flush_locked(struct cgroup cgrp, bool may_sleep) {

This above output is wrong as:

  1. It should appear before the {.
  2. There should be no semicolon since it is a sparse annotation.

Is it impossible to just adjust them by hand? Or can you see some format that would be easy to handle with sed?

Well, it can adjusted by hand as well. I was looking for an easy automatic adjustment.

I'll try if it can be done with sed.

Adding things is easier than matching and removing them, so it should be possible to come up with a better solution that involves only Coccinelle. But not instantly. Maybe by Tuesday.

OK.

JuliaLawall commented 5 years ago

On Sun, 23 Jun 2019, Himanshu Jha wrote:

Not sure to understand. I don't add a semicolon after __must_hold. I put all the annotations in an argument list of a call at the top of the function.

  • res; <--- is what I'm talking about.

-static void cgroup_rstat_flush_locked(struct cgroup cgrp, bool may_sleep) -acquires(&cgroup_rstat_lock) releases(&cgroup_rstat_lock) -{ +static void cgroup_rstat_flush_locked(struct cgroup cgrp, bool may_sleep) {

  • __must_hold(&cgroup_rstat_lock); return 0; }

This above output is wrong as:

  1. It should appear before the {.
  2. There should be no semicolon since it is a sparse annotation.

Yes, I know these things, but they can't be done at the moment...

julia

Is it impossible to just adjust them by hand? Or can you see some format that would be easy to handle with sed?

Well, it can adjusted by hand as well. I was looking for an easy automatic adjustment.

I'll try if it can be done with sed.

Adding things is easier than matching and removing them, so it should be possible to come up with a better solution that involves only Coccinelle. But not instantly. Maybe by Tuesday.

OK.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute thethread.[AAD2ZGRDUBGTJ57BUVBC723P36F7FA5CNFSM4H2R3M2KYY3PNVWWK3TUL52HS4DFVRE XG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYLANQA.gif]

himanshujha199640 commented 5 years ago

@JuliaLawall The latest scripts now handles the ordering and currently we have 136 replacements to be done. I'll now go manually replace them. At least now I only need to fix the extra semicolon.

@bulwahn Are you sure to replace those 136 places ? Apart from readability, I don't see any real benefit of these replacements.

bulwahn commented 5 years ago

Thanks, @JuliaLawall.

I tried to describe the goal in #137. We do not need to replace all right now, but I consider it helpful towards the goal in #137.