Open ArcaneNibble opened 5 months ago
Ghidra supports user customization for instruction set extensions like WCH/QingKe. You would have to learn a little about how Ghidra's sleigh system works, patch a few of the files Ghidra releases under Ghidra/Processors/RISCV, then run support/sleigh to install local support for a new RISCV variant (aka 'language' in the sleigh system).
You might break down the issue into steps something like these:
1. Determine whether the c.lbu, c.lhu, c.sb, and c.sh opcodes are the same as the Zcb standard extension opcodes. If so, you want the Ghidra developers to cherry pick https://github.com/thixotropist/ghidra/blob/isa_ext/Ghidra/Processors/RISCV/data/languages/riscv.zcb.sinc into Ghidra. If these opcodes have different encodings than in Zcb, add them to the list of opcodes in the next step instead.
2. The c.lbusp, c.lhusp, c.sbsp, c.shsp opcodes need to be defined in a new file named something like riscv.rv32xw.sinc. If more than one version of the XW extension set exists, you will want to support that with multiple files or embedded ifdef's.
3. Create a new language specification file including riscv.rv32xw.sinc and excluding the conflicting double precision floating point extensions defined in riscv.rv32d.sinc. This will be named something like riscv.lp32qingke.slaspec. You can use the AliBaba THead extension support under https://github.com/thixotropist/ghidra/tree/isa_ext/Ghidra/Processors/RISCV/data/languages as an example. Note that the THead vendor extensions are recognized by both binutils and gcc, while WCH/QingKe are not (yet?) mainstreamed.
4. Update riscv.ldefs to enable the new language definition, so that it can appear as an import option.
5. Run support/sleigh ... to generate the sla file your local Ghidra will use to recognize the custom instructions.

The final step is to help the Ghidra developers decide how much of this should be included in the Ghidra release everyone gets, how much they should enable with extensions to the support/sleigh generator, and how much to leave as end-user customization. I am not a developer myself, so I won't second guess their decisions on that.
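For the first step above (checking whether the vendor's c.lbu, c.lhu, c.sb, and c.sh match Zcb), a rough sketch like the following might help. It tests a 16-bit halfword against the load/store patterns from the ratified Zcb extension. The table and the function name zcb_mnemonic are made up for illustration and are not part of Ghidra, and nothing here encodes the XW opcodes themselves; you would feed it halfwords assembled by the vendor toolchain and see whether the mnemonic it reports agrees.

```python
# Zcb compressed load/store patterns (ratified Zc spec), as (mnemonic, mask, match).
# Bits 15:10 are funct6, bits 1:0 select quadrant 0; c.lhu/c.lh/c.sh also fix bit 6.
ZCB_LOAD_STORE = [
    ("c.lbu", 0xFC03, 0x8000),  # funct6 = 100000
    ("c.lhu", 0xFC43, 0x8400),  # funct6 = 100001, bit 6 = 0
    ("c.lh",  0xFC43, 0x8440),  # funct6 = 100001, bit 6 = 1
    ("c.sb",  0xFC03, 0x8800),  # funct6 = 100010
    ("c.sh",  0xFC43, 0x8C00),  # funct6 = 100011, bit 6 = 0
]


def zcb_mnemonic(halfword):
    """Return the Zcb load/store mnemonic a halfword decodes to, or None.

    Hypothetical helper for illustration only.
    """
    halfword &= 0xFFFF
    for name, mask, match in ZCB_LOAD_STORE:
        if (halfword & mask) == match:
            return name
    return None


if __name__ == "__main__":
    # 0x8188 is c.lbu a0, 0(a1) under Zcb: rs1' = 011 (a1), rd' = 010 (a0).
    print(zcb_mnemonic(0x8188))
    # 0x4188 is a plain c.lw, so it is not a Zcb load/store.
    print(zcb_mnemonic(0x4188))
```

If the vendor assembler emits a different halfword for the same instruction and operands, the encodings differ from Zcb and the opcodes belong in the vendor-specific file from step 2 instead.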
Thank you, that was very helpful. I've made a PR with my very preliminary implementation.
The vendor toolchain for these cores is here, but the source code to their GCC patches does not seem to be available. If you want only GCC, there are Windows binaries here. If you want macOS binaries, there are some here (direct link). Note that objdump doesn't seem to properly disassemble their own extension (even though GAS accepts it).
The source for their openocd fork has been successfully obtained from MRS via GPL request letters multiple times. The same should be possible with gcc.
Is your feature request related to a problem? Please describe.
The WCH/QingKe RISC-V cores have an "XW" extension implementing a small number of additional compressed 16-bit opcodes. Unfortunately, these opcodes conflict with some standard ones, which causes incorrect disassembly and even more incorrect decompilation. I would like to see Ghidra support for this extension.
The opcode encodings don't seem to be properly documented, so I figured them out here.
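To see which standard instruction a stock RV32 disassembler would pick for a given 16-bit halfword, a small lookup on the quadrant (bits 1:0) and funct3 (bits 15:13) fields is enough. The sketch below uses the standard RV32 compressed opcode map and flags the slots owned by the compressed double-precision instructions, on the assumption that those are the ones the XW opcodes collide with. It does not reproduce the XW encodings, and the names RV32_RVC_SLOTS, standard_slot, and uses_d_slot are hypothetical.

```python
# Standard RV32 compressed opcode map, keyed by (quadrant, funct3).
RV32_RVC_SLOTS = {
    (0b00, 0b000): "c.addi4spn",
    (0b00, 0b001): "c.fld",
    (0b00, 0b010): "c.lw",
    (0b00, 0b011): "c.flw",
    (0b00, 0b100): "reserved (used by Zcb)",
    (0b00, 0b101): "c.fsd",
    (0b00, 0b110): "c.sw",
    (0b00, 0b111): "c.fsw",
    (0b01, 0b000): "c.addi",
    (0b01, 0b001): "c.jal",
    (0b01, 0b010): "c.li",
    (0b01, 0b011): "c.addi16sp/c.lui",
    (0b01, 0b100): "misc-alu",
    (0b01, 0b101): "c.j",
    (0b01, 0b110): "c.beqz",
    (0b01, 0b111): "c.bnez",
    (0b10, 0b000): "c.slli",
    (0b10, 0b001): "c.fldsp",
    (0b10, 0b010): "c.lwsp",
    (0b10, 0b011): "c.flwsp",
    (0b10, 0b100): "c.jr/c.mv/c.ebreak/c.jalr/c.add",
    (0b10, 0b101): "c.fsdsp",
    (0b10, 0b110): "c.swsp",
    (0b10, 0b111): "c.fswsp",
}

# Slots claimed by the compressed double-precision loads and stores.
D_SLOTS = {"c.fld", "c.fsd", "c.fldsp", "c.fsdsp"}


def standard_slot(halfword):
    """Return the standard RV32 slot name for a halfword, or None if not 16-bit."""
    quadrant = halfword & 0b11
    if quadrant == 0b11:
        return None  # low bits 11 mark a 32-bit (or longer) instruction
    funct3 = (halfword >> 13) & 0b111
    return RV32_RVC_SLOTS[(quadrant, funct3)]


def uses_d_slot(halfword):
    """True if a stock decoder would read this halfword as a compressed D op."""
    return standard_slot(halfword) in D_SLOTS


if __name__ == "__main__":
    for hw in (0x2000, 0xA002, 0x4188):
        print(hex(hw), standard_slot(hw), uses_d_slot(hw))
```

Any halfword for which uses_d_slot returns True is one that a decoder with the D extension enabled will show as a floating point load or store, whatever the vendor core actually does with it.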
Describe the solution you'd like
I'm not at all familiar with how Ghidra works internally, but I'd like to see "XW" as its own variant of RISC-V.
Additional context
The manual for the QingKe V4 core can be found here, but it only contains a single sentence about the "XW" extension which doesn't explain anything about how it works in detail.