capstone-engine / llvm-capstone

llvm with tablegen backend for capstone disassembler

[ASUpdater] Error while Generating Mapping tables for RISCV #46

Open F1o0T opened 9 months ago

F1o0T commented 9 months ago

Hello,

I was successfully able to use the ASUpdater.py script to generate the inc files for Arm. However, when I try the same with RISCV, I am getting the following issue.

python3.11 ./Updater/ASUpdater.py -a RISCV

INFO - Generating Disassembler tables...
INFO - Generating AsmWriter tables...
INFO - Generating RegisterInfo tables...
INFO - Generating InstrInfo tables...
INFO - Generating SubtargetInfo tables...
INFO - Generating Mapping tables...
ILLEGAL VALUE TYPE!
UNREACHABLE executed at /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/utils/TableGen/CodeGenTarget.cpp:262!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
  1. Program arguments: /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/build/bin/llvm-tblgen --printerLang=CCS --gen-asm-matcher -I /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/include -I /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/Target/RISCV /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/Target/RISCV/RISCV.td
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var LLVM_SYMBOLIZER_PATH to point to it):
0 llvm-tblgen 0x00005636738bac3a
1 llvm-tblgen 0x00005636738bb010
2 llvm-tblgen 0x00005636738b882d
3 llvm-tblgen 0x00005636738ba552
4 libc.so.6 0x00007f4a45efe520
5 libc.so.6 0x00007f4a45f529fc pthread_kill + 300
6 libc.so.6 0x00007f4a45efe476 raise + 22
7 libc.so.6 0x00007f4a45ee47f3 abort + 211
8 llvm-tblgen 0x0000563673809ee3
9 llvm-tblgen 0x00005636734f9107
10 llvm-tblgen 0x0000563673707606
11 llvm-tblgen 0x0000563673708840
12 llvm-tblgen 0x000056367370b923
13 llvm-tblgen 0x000056367336aa4e
14 llvm-tblgen 0x000056367336b0f1
15 llvm-tblgen 0x00005636737904ca
16 llvm-tblgen 0x00005636738e4e97
17 llvm-tblgen 0x0000563673790c8e
18 libc.so.6 0x00007f4a45ee5d90
19 libc.so.6 0x00007f4a45ee5e40 __libc_start_main + 128
20 llvm-tblgen 0x000056367335d195
CRITICAL - Generation failed
Traceback (most recent call last):
  File "/home/projects/capstone-playground/capstone/suite/auto-sync/./Updater/ASUpdater.py", line 229, in <module>
    Updater.update()
  File "/home/projects/capstone-playground/capstone/suite/auto-sync/./Updater/ASUpdater.py", line 132, in update
    self.inc_generator.generate()
  File "/home/projects/capstone-playground/capstone/suite/auto-sync/Updater/IncGenerator.py", line 97, in generate
    self.gen_incs()
  File "/home/projects/capstone-playground/capstone/suite/auto-sync/Updater/IncGenerator.py", line 163, in gen_incs
    raise e
  File "/home/projects/capstone-playground/capstone/suite/auto-sync/Updater/IncGenerator.py", line 157, in gen_incs
    subprocess.run(
  File "/usr/lib/python3.11/subprocess.py", line 569, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/build/bin/llvm-tblgen', '--printerLang=CCS', '--gen-asm-matcher', '-I', '/home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/include', '-I', '/home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/Target/RISCV', '/home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/Target/RISCV/RISCV.td']' died with <Signals.SIGABRT: 6>.

I also ran the failing llvm-tblgen command on its own:

/home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/build/bin/llvm-tblgen --printerLang=CCS --gen-asm-matcher -I /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/include -I /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/Target/RISCV /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/Target/RISCV/RISCV.td

I got the following error:

ILLEGAL VALUE TYPE!
UNREACHABLE executed at /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/utils/TableGen/CodeGenTarget.cpp:262!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
  1. Program arguments: /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/build/bin/llvm-tblgen --printerLang=CCS --gen-asm-matcher -I /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/include -I /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/Target/RISCV /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/Target/RISCV/RISCV.td
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var LLVM_SYMBOLIZER_PATH to point to it):
0 llvm-tblgen 0x0000561e2f762c3a
1 llvm-tblgen 0x0000561e2f763010
2 llvm-tblgen 0x0000561e2f76082d
3 llvm-tblgen 0x0000561e2f762552
4 libc.so.6 0x00007fe526a2d520
5 libc.so.6 0x00007fe526a819fc pthread_kill + 300
6 libc.so.6 0x00007fe526a2d476 raise + 22
7 libc.so.6 0x00007fe526a137f3 abort + 211
8 llvm-tblgen 0x0000561e2f6b1ee3
9 llvm-tblgen 0x0000561e2f3a1107
10 llvm-tblgen 0x0000561e2f5af606
11 llvm-tblgen 0x0000561e2f5b0840
12 llvm-tblgen 0x0000561e2f5b3923
13 llvm-tblgen 0x0000561e2f212a4e
14 llvm-tblgen 0x0000561e2f2130f1
15 llvm-tblgen 0x0000561e2f6384ca
16 llvm-tblgen 0x0000561e2f78ce97
17 llvm-tblgen 0x0000561e2f638c8e
18 libc.so.6 0x00007fe526a14d90
19 libc.so.6 0x00007fe526a14e40 __libc_start_main + 128
20 llvm-tblgen 0x0000561e2f205195
[1] 1505619 IOT instruction --printerLang=CCS --gen-asm-matcher -I -I

Here is the generated crash backtrace from gdb.

gdb -q --args /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/build/bin/llvm-tblgen --printerLang=CCS --gen-asm-matcher -I /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/include -I /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/Target/RISCV /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/Target/RISCV/RISCV.td

Reading symbols from /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/build/bin/llvm-tblgen...
(gdb) run
Starting program: /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/build/bin/llvm-tblgen --printerLang=CCS --gen-asm-matcher -I /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/include -I /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/Target/RISCV /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/Target/RISCV/RISCV.td
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
ILLEGAL VALUE TYPE!
UNREACHABLE executed at /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/utils/TableGen/CodeGenTarget.cpp:262!

Program received signal SIGABRT, Aborted.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=140737348138880) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737348138880) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140737348138880) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140737348138880, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007ffff7a8e476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007ffff7a747f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x0000555555a23ee3 in llvm::llvm_unreachable_internal (msg=0x555555c02e49 "ILLEGAL VALUE TYPE!", file=0x555555c02dd0 "/home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/utils/TableGen/CodeGenTarget.cpp", line=262) at /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/Support/ErrorHandling.cpp:212
#6 0x0000555555713107 in llvm::getEnumName (T=llvm::MVT::INVALID_SIMPLE_VALUE_TYPE) at /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/utils/TableGen/CodeGenTarget.cpp:262
#7 0x0000555555921606 in llvm::(anonymous namespace)::getOperandDataTypes (Op=0x55555940e7e0, OperandType="CS_OP_REG") at /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/utils/TableGen/PrinterCapstone.cpp:3022
#8 0x0000555555922840 in llvm::(anonymous namespace)::printInsnOpMapEntry (Target=..., MI=std::unique_ptr = {...}, UseMI=true, CGI=0x55556ad7b920, InsnOpMap=..., InsnNum=255, InsnPatternMap=std::map with 0 elements) at /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/utils/TableGen/PrinterCapstone.cpp:3160
#9 0x0000555555925923 in llvm::PrinterCapstone::asmMatcherEmitMatchTable (this=0x5555561dcb40, Target=..., Info=..., StringTable=..., VariantCount=1) at /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/utils/TableGen/PrinterCapstone.cpp:3464
#10 0x0000555555584a4e in AsmMatcherEmitter::run (this=0x7fffffffd160) at /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/utils/TableGen/AsmMatcherEmitter.cpp:2254
#11 0x00005555555850f1 in llvm::EmitAsmMatcher (RK=..., OS=...) at /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/utils/TableGen/AsmMatcherEmitter.cpp:2304
#12 0x00005555559aa4ca in (anonymous namespace)::LLVMTableGenMain (OS=..., Records=...) at /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/utils/TableGen/TableGen.cpp:185
#13 0x0000555555afee97 in llvm::TableGenMain (argv0=0x7fffffffddec "/home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/build/bin/llvm-tblgen", MainFn=0x5555559aa339 <(anonymous namespace)::LLVMTableGenMain(llvm::raw_ostream&, llvm::RecordKeeper&)>) at /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/lib/TableGen/Main.cpp:122
#14 0x00005555559aac8e in main (argc=8, argv=0x7fffffffda38) at /home/projects/capstone-playground/capstone/suite/auto-sync/llvm-capstone/llvm/utils/TableGen/TableGen.cpp:296
(gdb)

I have added a printf("InsnNum: %d\n", InsnNum); inside the following for loop in the function asmMatcherEmitMatchTable:

  for (const CodeGenInstruction *CGI : Target.getInstructionsByEnumValue()) {
    auto MI = std::make_unique<MatchableInfo>(*CGI);
    bool UseMI = true;
    MI->tokenizeAsmString(Info, Variant);

    // Ignore "codegen only" instructions.
    if (CGI->TheDef->getValueAsBit("isCodeGenOnly") ||
        MI->AsmOperands.empty()) {
      UseMI = false;
      MI->Mnemonic = "invalid";
    } else
      MI->Mnemonic = MI->AsmOperands[0].Token;
    printInsnNameMapEnumEntry(Target.getName(), MI, InsnNameMap, InsnEnum);
    printFeatureEnumEntry(Target.getName(), Info, CGI, FeatureEnum,
                          FeatureNameArray);
    printOpPrintGroupEnum(Target.getName(), CGI, OpGroups);

-->printInsnOpMapEntry(Target, MI, UseMI, CGI, InsnOpMap, InsnNum,
                        InsnPatternMap);
    printInsnMapEntry(Target.getName(), Info, MI, UseMI, CGI, InsnMap, InsnNum,
                      PPCFormatEnum);
-->printf("InsnNum: %d\n", InsnNum);
    ++InsnNum;
  }

The last printed InsnNum is 254, so the crash occurs inside printInsnOpMapEntry on the following iteration.
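Because the printf sits after printInsnOpMapEntry, the failing index is one past the last value printed. Logging before the call makes the last printed index the failing one. Below is a minimal standalone sketch of that idea. It is not tblgen code: `processWithTrace` and the item strings are hypothetical, and an exception stands in for the abort that llvm_unreachable triggers in the real tool.

```cpp
#include <functional>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of llvm-tblgen): logs the index *before*
// handing the item to `process`, so the last logged index is the one that was
// in flight when processing failed.
// Returns the failing index, or -1 if every item was processed.
inline int processWithTrace(
    const std::vector<std::string> &items,
    const std::function<void(const std::string &)> &process,
    std::ostream &log) {
  int num = 0;
  for (const std::string &item : items) {
    // std::endl flushes, so the line is written out before the risky call.
    log << "InsnNum: " << num << " (" << item << ")" << std::endl;
    try {
      process(item);
    } catch (const std::exception &) {
      // Stand-in for the crash: report which index was being handled.
      return num;
    }
    ++num;
  }
  return -1;
}
```

In the real loop, the equivalent change would be to move the printf above the printInsnOpMapEntry call and write to stderr, which is normally unbuffered, so the line is not lost when the process aborts.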

Any idea what could have gone wrong? Thanks in advance!

Rot127 commented 9 months ago

Yes, RISC-V is one of the difficult architectures to add, because it hasn't been refactored to use Auto-Sync yet. I don't think this will be the only crash you encounter; there are still some unhandled cases.

This one is apparently caused by the simple type of an operand, which has no enum string (for whatever reason). The simple types of operands are added to the operand mapping tables (here ARM as an example, check the CS_DATA_TYPE_... values). printInsnOpMapEntry emits such an entry in the mapping tables, which is why it crashes there.

If you want to continue developing RISC-V, you can add an if clause and skip the lookup for RISC-V: just emit CS_DATA_TYPE_LAST instead of the enum name. Be aware, though, that this is really not an easy task. RISC-V seemed rather complex the last time I casually looked at it, so plan for extra time.
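The suggested workaround can be sketched roughly as follows. This is a standalone illustration, not the actual PrinterCapstone.cpp code: `SimpleType`, `simpleTypeName` and `dataTypeEntry` are hypothetical stand-ins, the "CS_DATA_TYPE_" + name format is assumed from the values mentioned above, and the target name string "RISCV" is also an assumption.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for LLVM's simple value types. `Invalid` plays the
// role of MVT::INVALID_SIMPLE_VALUE_TYPE from the backtrace.
enum class SimpleType { i32, i64, Invalid };

// Hypothetical stand-in for the enum-name lookup: it has no string for
// `Invalid` and fails, mirroring the "ILLEGAL VALUE TYPE!" abort.
inline std::string simpleTypeName(SimpleType T) {
  switch (T) {
  case SimpleType::i32:
    return "i32";
  case SimpleType::i64:
    return "i64";
  case SimpleType::Invalid:
    break;
  }
  throw std::logic_error("ILLEGAL VALUE TYPE!");
}

// Sketch of the workaround: for RISC-V only, emit the CS_DATA_TYPE_LAST
// placeholder instead of looking up a name that does not exist.
inline std::string dataTypeEntry(const std::string &TargetName, SimpleType T) {
  if (TargetName == "RISCV" && T == SimpleType::Invalid)
    return "CS_DATA_TYPE_LAST";
  return "CS_DATA_TYPE_" + simpleTypeName(T);
}
```

Restricting the fallback to one target keeps the failure visible for architectures that are already refactored, where an invalid simple type would still point to a genuine bug.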

Let me know when you've made a decision. So we can add you to https://github.com/capstone-engine/capstone/issues/2015 as RISC-V dev.

For the crashes, you can install llvm-symbolizer to resolve the addresses to symbol names. There is no need for gdb.

XVilka commented 1 week ago

See the most recent work on this in https://github.com/capstone-engine/capstone/pull/2498