hernanponcedeleon / Dat3M

A verification tool for many memory models
MIT License
77 stars 29 forks source link

How could I analyze project with function or symbols placed in multiple files? #764

Closed h-mole closed 1 hour ago

h-mole commented 2 hours ago

Thanks for this impressively awesome high-performance formal verification tool! However, I had a problem and wondering if it can be solved.

I have two C files in one project to be analysed: // test1.c

int f(int a, int b)
{
    return a + b;
}

// main.c

extern int f(int a, int b);
int main()
{
    return f(1, 2);
}

When I call gcc/clang -o main main.c test1.c,the compiler could find symbols and then compile these two files into one executable. Besides, for some formal tools I have tried, they could find external symbols from other source files.

When I am trying to use Dat3M, I got this problem that Dat3M cannot get symbol f:

command (all files are placed in /src/source):

> java -jar dartagnan/target/dartagnan.jar ./cat/vmm.cat /src/sources/main.c /src/sources/test1.c

output:

[01.11.2024] 12:49:23 [INFO] GitInfo.logGitInfo - Git branch: development
[01.11.2024] 12:49:23 [INFO] GitInfo.logGitInfo - Git commit ID: 347661eb48c6c3906108728cc7e3002d05ee3dc9
[01.11.2024] 12:49:23 [INFO] Dartagnan.main - Program path: /src/sources/main.c
[01.11.2024] 12:49:23 [INFO] Dartagnan.main - CAT file path: ./cat/vmm.cat
[01.11.2024] 12:49:23 [WARN] Compilation.applyLlvmPasses - Failed to run opt (llvm optimizations). Continuing without optimizations.
[01.11.2024] 12:49:23 [INFO] RefinementSolver.run - refinement.baseline: []
[01.11.2024] 12:49:23 [INFO] Wmm.configureAll - encoding.wmm.reduceAcyclicityEncodeSets: true
[01.11.2024] 12:49:23 [ERROR] Dartagnan.main - Unknown intrinsics f
java.lang.UnsupportedOperationException: Unknown intrinsics f
        at com.dat3m.dartagnan.program.processing.Intrinsics.markIntrinsics(Intrinsics.java:302) ~[dartagnan.jar:?]
        at com.dat3m.dartagnan.program.processing.ProcessingManager.lambda$run$0(ProcessingManager.java:161) ~[dartagnan.jar:?]
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) ~[?:?]
        at com.dat3m.dartagnan.program.processing.ProcessingManager.run(ProcessingManager.java:161) ~[dartagnan.jar:?]
        at com.dat3m.dartagnan.verification.solving.ModelChecker.preprocessProgram(ModelChecker.java:66) ~[dartagnan.jar:?]
        at com.dat3m.dartagnan.verification.solving.RefinementSolver.runInternal(RefinementSolver.java:199) ~[dartagnan.jar:?]
        at com.dat3m.dartagnan.verification.solving.RefinementSolver.run(RefinementSolver.java:180) ~[dartagnan.jar:?]

I think I might not be using the right command, to add different sourcefiles into it. Could you please figure out what I am doing wrong, or what I should do to solve it? Thanks for helping me!

ThomasHaas commented 2 hours ago

Unfortunately, we run Clang only on the first provided .c file without linking any code. There are two possible workarounds for now: (1) You avoid extern and just explicitly include all files into main.c and invoke Dartagnan on it. This is the easiest option if possible. (2) You generate a single, linked LLVM file for Dartagnan. AFAIK, this requires two steps: First, you manually run clang on the source files and let it generate LLVM files via

clang <ListOfCSources.c> -Xclang -disable-O0-optnone -S -emit-llvm -g -gcolumn-info

-S -emit-llvm is to generate LLVM code rather than object files, and -g -gcolumn-info is to generate debug metadata so we can report better errors. I think -Xclang -disable-O0-optnone is obsolete nowadays, but it shouldn't hurt to have it in there. Now you invoke the linker llvm-link to link them into a single file:

llvm-link <ListOfLLVMSources.ll> -o <linkedFile.ll>

Finally you can invoke Dartagnan on the linked LLVM file.

@hernanponcedeleon Maybe we should let Dartagnan take multiple files and invoke the linker automatically if needed.

h-mole commented 1 hour ago

Thanks for this solution, really solved my problem!