TrustInSoft / tis-interpreter

An interpreter for finding subtle bugs in programs written in standard C
565 stars 28 forks source link

Pointer math #103

Closed kroeckx closed 8 years ago

kroeckx commented 8 years ago

I'm getting the following message:

rec_layer_s3.c:204:[value] warning: The following sub-expression cannot be evaluated:
                 (align - (size_t)1) % (unsigned long)8

                 All sub-expressions with their values:
                 size_t  (size_t)1 ∈ {1}
                 unsigned long  (unsigned long)8 ∈ {8}
                 size_t  align - (size_t)1 ∈ {{ (size_t)&__malloc_CRYPTO_malloc_l92_3251[4] }}
                 size_t  align ∈ {{ (size_t)&__malloc_CRYPTO_malloc_l92_3251[5] }}
                 int  1 ∈ {1}
                 int  8 ∈ {8}

I'm guessing that you don't really assign a real address to what malloc returned, preventing you to actually do any calculation with those pointers. It seems we're using a size_t for it, but I guess using an (u)intptr_t will not make any difference.

pascal-cuoq commented 8 years ago

Yes, all addresses remain symbolic constants (&__malloc_CRYPTO_malloc_l92_3251 in your example) even when they are converted to integer types.

There is already a commandline option that can be used to make assumptions about the low bits of pointers and analyze programs without having to change them, -address-alignment 8:

$ cat t.c
typedef unsigned long size_t;

int main(void) {
  char t[10];
  size_t align = (size_t) t+5;
  int x = (align - (size_t)1) & 7;
}
$ tis-interpreter.sh -address-alignment 8 t.c
[value] Analyzing a complete application starting at main
[value] Computing initial state
[value] Initial state computed
[value] done for function main

There was an oversight when this option was implemented and it doesn't apply to % on addresses (but it should and it will). I am going to fix this very soon. Also, using this option means assuming that addresses are aligned in the first place, which means losing in generality.

pascal-cuoq commented 8 years ago

New behavior:

Hexa:~/tis-interpreter $ cat t.c
#include <string.h>
#include <stdio.h>

int main(void) {
  char t[10];
  size_t align = (size_t) t+5;
  int y = (align - (size_t)1) % 8;

  printf("%d\n", y);
}
Hexa:~/tis-interpreter $ tis-interpreter/tis-interpreter.sh t.c
[value] Analyzing a complete application starting at main
[value] Computing initial state
[value] Initial state computed
t.c:7:[value] warning: The following sub-expression cannot be evaluated:
                 (align - (unsigned long)1) % (unsigned long)8

                 All sub-expressions with their values:
                 unsigned long  (unsigned long)1 ∈ {1}
                 unsigned long  (unsigned long)8 ∈ {8}
                 size_t  align - (unsigned long)1 ∈ {{ (size_t)&t[4] }}
                 size_t  align ∈ {{ (size_t)&t[5] }}
                 int  1 ∈ {1}
                 int  8 ∈ {8}

                 Stopping
                 stack: main
Hexa:~/tis-interpreter $ tis-interpreter/tis-interpreter.sh t.c -address-alignment 8
[value] Analyzing a complete application starting at main
[value] Computing initial state
[value] Initial state computed

4

[value] done for function main
kroeckx commented 8 years ago

Is there a default -address-alignment? I'm just wondering if having it default 1 would make sense and get rid of that message.

pascal-cuoq commented 8 years ago

There is a default -address-alignment and the default value is 1, but that means “do not assume any information about the lowest bits of the integer representations of pointers, that is, assume they are multiples of 1”.

Any value other than one represents an assumption, so it has to be a user decision, although a future version will make it a pointer-by-pointer decision and will infer better default values from types.

pascal-cuoq commented 8 years ago

ptr % c fixed (in the sense that a reasonable workaround is provided as long as c is a small power of two) in 44dc120

pascal-cuoq commented 8 years ago

Sorry that commit broke compilation with non-latest versions of OCaml, I have force-pushed 72384b0 instead that I hope will compile for everyone.

kroeckx commented 8 years ago

So if I try it with set to 8, I end up with this error instead:

bn_exp.c:722:[value] warning: The following sub-expression cannot be evaluated:
                 (unsigned long)powerbufFree & (unsigned long)(64 - 1)

                 All sub-expressions with their values:
                 unsigned long  (unsigned long)(64 - 1) ∈ {63}
                 unsigned long  (unsigned long)powerbufFree ∈ {{ (unsigned long)&__malloc_CRYPTO_malloc_l92_5007 }}
                 int  64 - 1 ∈ {63}
                 unsigned char *  powerbufFree ∈ {{ (unsigned char *)&__malloc_CRYPTO_malloc_l92_5007 }}
                 int  1 ∈ {1}
                 int  64 ∈ {64}

It wants to have a higher alignment than what malloc returns. I think it should even do the right thing in case the alignment is set to 1. I don't know which alignment malloc is going to use and I want tis-interpreter to check that I don't go outside my buffer. If I set the alignment to 64 the error of course goes away because tis-interpreter doesn't have to check anything anymore.

pascal-cuoq commented 8 years ago

I don't know which alignment malloc is going to use

Well, the interpretation of the program is platform-specific (the memory layout of int, long, long long is hardcoded to the choices of GCC for x86-64). This default x86-64 platform on which we are concentrating on ironing out the bugs before suggesting that people use other ones guarantees that all addresses returned by malloc are multiples of 16.

I want tis-interpreter to check that I don't go outside my buffer.

It would be nice to be able to do this, but it seems difficult to reconcile with the trade-offs made in tis-interpreter.