NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0
52.18k stars 5.91k forks source link

SleighDevTools support #791

Closed mumbel closed 5 years ago

mumbel commented 5 years ago

Noticed the first extension to be added, SleighDevTools.

In one of the files it lists a few processors to be able to run tests on, and I noticed RISCV at the very bottom. Does NSA have planned/unreleased release of this processor? (missed it if this was stated anywhere)

I have been working on one and when I saw that commit today I went ahead and got it into semi-decent shape and committed https://github.com/mumbel/ghidra/commit/9a4a24a8ee1d53263c3fdcd0c68e084b67ffae4a

Just curious if I should keep working on it or not?

Besides that, it looks like a pretty nice extension, will hopefully try it out soon

emteere commented 5 years ago

The reason some unsupported processors may be mentioned in the scripts and test binaries are for future development plans. When the scripts were developed they were developed with an eye towards getting any and all test binaries, even for processors we didn't yet have support for.

The current state of the RISCV is planned with only an initial very rough prototype. What you have done is much further along. You've done quite a bit of the ground work layout, capturing of artifacts of the processor, and instruction decodes. There is obvious interest in the RISCV processor by the community. I've seen mention on twitter of another implementation as well. So I would say continue, and if time permits, we can help you out.

mumbel commented 5 years ago

Thanks @emteere
Looks like someone else committed recently too ( @lcq2 ), hadnt seen any other work done yet, but it does make for a good ISA to work with.

emteere commented 5 years ago

I hadn't seen any implementation yet. Saw @lcq2 mention it. Unless I'm missing something

mumbel commented 5 years ago

https://github.com/lcq2/griscv

mumbel commented 5 years ago

Tried playing with SleighDevTools today but didn't have any real luck inside Ghidra. Read down to pcodetest and built build-PCodeTest-RISCV64G_GCC_O0_pcodetest. Found a missing instruction using build-PCodeTest-RISCV64G_GCC_O0_pcodetest.out at the very least, and nice to not have ELF relocations to worry about (if that was on purpose?).

gp was not happy and never got assumed, since it never got assigned in any instructions, verified by RISCV64G_GCC_O0_pcodetest.d, but the ELF has a value for gp from

./RISCV64G_GCC_O0_pcodetest.readelf:  1056: 00000000000431f8     0 NOTYPE  GLOBAL DEFAULT  ABS __global_pointer$
./RISCV64G_GCC_O0_pcodetest.nm:00000000000431f8 A __global_pointer$

I must be missing some cspec XML or what I do have is broken

mumbel commented 5 years ago

@emteere Got the JUnit working for my RISC-V 64-bit (rv64gc). This looks like its going to be a pretty great feature once 9.1 is released. Though I'm guessing its broken for my current setup, I'll need to try x86_64 for comparison.
https://gist.github.com/mumbel/c62bbf4c9c348d09ce237a1884778ff7

I'm wondering if its the gp issue. Can the processor's elf extension set a register for the whole program? I'm attempting to use evaluateElfSymbol to set gp to address when the symbol is __global_pointer$ but still figuring out the API to get this accomplished... or is this not the best approach?

ryanmkurtz commented 5 years ago

@mumbel, just wanted to let you know you might not get @emteere responses until next week.

mumbel commented 5 years ago

Thanks for the heads up, forget about black hat/defcon starting this week and you guys might have some downtime.

ghidra1 commented 5 years ago

We usually rely upon a processor-specific analyzer for global gp setting when a symbol exists to aide in the decision. Example: the MipsAddressAnalyzer.java shows this for a symbol named _mips_gp_value. The ELF extension mechanism is useful when header artifacts must be processed prior to relocation processing. In addition, the ELF extension must perform any pseudo linking which may be necessary (MIPS and PowerPC-64 are good examples of this). In general we try to defer to the analysis phase as much as possible. Analysis options can be used to control optional behaviors.

mumbel commented 5 years ago

Ah, makes sense. Thanks @ghidra1 , will look into adding the analysis code instead of extension

edit: Still no luck progressing with pcode test, but great tip on the GP, thanks. I got that working for the RVGC_GCC_O4_pcodetest.out as a test binary and stubbed out some analyzer code.

mumbel commented 5 years ago

Would there be interest in pull requests here or too volatile on your side.

The code re-use stuck out when looking at BIOPS_BODY.c https://github.com/mumbel/ghidra/commit/ded1e8a51e821172214763483d10822e81f49c6b

Also, easily avoided for riscv since gcc has that toggle, but what about processors without mul/div?

mumbel commented 5 years ago

Finally made some progress. I didn't think about just running individual tests, which produces the PCODE debug prints, which was so helpful and wish I realized to do this sooner.

EmulatorHelper::executeInstruction() is getting an error thrown from UniqueMemoryBank::getChunk() because value.length != size causing the execution to break before hitting any kind of assert/test.

The instruction that was triggering this action was C.ADDIW which adds a non-zero sign-extended 6-bit immediate to the value from the 64-bit destination register and then writes the value to the destination register as a 32-bit value, sign-extended to 64-bits. I was incorrectly casting this register down to 32-bit first (getChunk() was comparing 8 != 4).

But great news after fixing that: BIOPS: ERROR Unexpected number of assertions ( Passed: 269 Failed: 88 Expected Assertions: 333 )

emteere commented 5 years ago

Great progress! Was curious how it was going. When I last checked it looked like you had a large portion of the processor instructions implemented. Apologize for not getting back sooner.

Using a special ELF loader to set the GP for programs loaded into Ghidra is a good idea for loading ELF files in general. But as Ghidra1 said, setting the GP in the emulator is the usual method. Also setting the stack pointer to an unused location in memory.

Using the individual files works well. There are methods for compiling without certain features of the language, but they can be cumbersome. We still need to merge in the changes for the test build scripts for the TriCore. The individual .out files in a test sub-directory was needed for that processor as well. Mostly because of the cumbersome link process.

Have you tried the Eclipse Sleigh Editor yet?

mumbel commented 5 years ago

I was wondering about sp actually. It looks the emulator defaults to 0x0, which underflows immediately and hopefully isn't an issue. I did end up changing that default to 0x10000 for this test (didn't try reverting this to see it's effect on a working test run), but need to revert and get committed to my emulator code sounds like.

I have not tried the editor yet, but now that I think I have this going I will give that a try working through these failed asserts

mumbel commented 5 years ago

I'm having no luck with the instructions to build the Sleigh editor. Not sure if I'm even doing it correctly to this point, but Navigate to the 'Eclipse SleighEditor'? I'm guessing something broke or I'm just plain doing it wrong.

I did manage to get a new instance of eclipse running, so I might be on the right path, but then editing a sinc file resulted in: Plug-in ghidra.xtext.sleigh.ui was unable to load class ghidra.xtext.sleigh.ui.SleighExecutableExtensionFactory

edit: just to add, my riscv is current with master

https://gist.github.com/mumbel/535c978b039ab3379a1f7a041b4ead58

mumbel commented 5 years ago

Down to only misc and parameters test not passing, everything else is working! I'm a fan of this setup, great feature overall... plus a nice introduction to emulator code/usage.

I did have a question on some of the tests, for example pcode_i1_complexLogic(0, 43, (i1) -140, 0, (i1) -182, 135), while it all works itself out by the compiler and passes just fine, it seems pointless to be giving values outside of [INT8_MIN, INT8_MAX] in these tests (also the other types' ranges). While it is still testing quite a bit, it may not be testing the intended amount of complexity anytime a cast is used in the .test files (to hide the overflow warnings).

I also see what you meant about having smaller test files

emteere commented 5 years ago

SP should definitely be set to a good location and not zero, but only for the emulation test. I don't believe the C-startup initialization routines are run.

mumbel commented 5 years ago

Do you have any pointers in code for setting SP? I searched a bit through MIPS and ARM and hopefully my grepping isn't that bad, but I came up empty. I have been running with just 0x0, and it underflows, haven't noticed any bad side effects yet.

pcode_conversions is a beast. For most of my 64-bit configurations (haven't started 32-bit) this is the only test that is failing. Still need to work through it (return 21;)

One configuration is giving me more trouble than others, RV64G, and one instruction I'm just not having any luck with

BIOPS_DOUBLE: >> ram:0001a378 fld fs0,-0x4e8(a5)
BIOPS_DOUBLE:  Write unique:00000100:8=0xfffffffffffffb18
BIOPS_DOUBLE:  Read fs0=0x0000000000000000
BIOPS_DOUBLE:  Write unique:000000a0:8=0x0000000000000000
BIOPS_DOUBLE:  Read unique:00000100:8=0xfffffffffffffb18
BIOPS_DOUBLE:  Read a5=0x0000000000036000
BIOPS_DOUBLE:  Write unique:00000af0:8=0x0000000000035b18
BIOPS_DOUBLE:  Read unique:00000af0:8=0x0000000000035b18
BIOPS_DOUBLE:  Read ram:00035b18:8=0x40091eb851eb851f
BIOPS_DOUBLE:  Write unique:000000a0:8=0x40091eb851eb851f
BIOPS_DOUBLE:  Write pc=0x000000000001a37c
# fld D,o(s) 00003007 0000707f QWORD|DREF (0, 8)
:fld frdD,immI(rs1) is immI & frdD & rs1 & op0001=0x3 & op0204=0x1 & op0506=0x0 & funct3=0x3
{                                                                                                                    
  local ea:$(XLEN) = immI + rs1;
  frdD = *[ram]:8 ea;
}

I would expect a Write fs0 or maybe a read from unique:000000a0:8 but it just moves onto the next instruction with an empty register value

BIOPS_DOUBLE: >> ram:0001a37c fmv.d fa1,fs0
BIOPS_DOUBLE:  Read fs0=0x0000000000000000
BIOPS_DOUBLE:  Write unique:000000b0:8=0x0000000000000000
BIOPS_DOUBLE:  Read fa1=0x0000000000000000
BIOPS_DOUBLE:  Write unique:000000a0:8=0x0000000000000000
BIOPS_DOUBLE:  Read unique:000000b0:8=0x0000000000000000
BIOPS_DOUBLE:  Write unique:000000a0:8=0x0000000000000000
BIOPS_DOUBLE:  Write pc=0x000000000001a380
ghidra1 commented 5 years ago

Do you have more than one definition of the fld instruction. You can use the Instruction Info popup action to verify the line number of the constructor. This is very fishy since fs0 is read in your trace but should be written in the pcode. My first guess is that the fld slaspec code shown is not what is getting used. Have you displayed the pcode field in the Code Browser listing for address 1a378 (this can be added via the field format for the listing) and compare to your slaspec pcode?

mumbel commented 5 years ago

@ghidra1 thanks for taking a look. I do only have the one fld and verified with DebugSleighInstructionParse/instruction info. also sorry for not stating, this test is at biopEqf8f8_Main for biopEqf8f8(pi,pi).

objdump output just as sanity check:

   1a378:       b187b407                fld     fs0,-1256(a5) # 35b18 <__SDATA_BEGIN__+0x10>
   1a37c:       228405d3                fmv.d   fa1,fs0
   1a380:       22840553                fmv.d   fa0,fs0
   1a384:       204000ef                jal     ra,1a588 <biopEqf8f8>
0001a378 07 b4 87 b1     fld        fs0,-0x4e8(a5=>DAT_00035b18)   = 40091EB851EB851Fh
                                              $U100:8 = COPY 0xfffffffffffffb18:8
                                              $Ua0:8 = COPY fs0
                                              $Uaf0:8 = INT_ADD $U100, a5
                                              $Ua0:8 = LOAD ram($Uaf0)
0001a37c d3 05 84 22     fmv.d      fa1,fs0
                                              $Ub0:8 = COPY fs0
                                              $Ua0:8 = COPY fa1
                                              $Ua0:8 = COPY $Ub0
emteere commented 5 years ago

The SP can be left at zero and will underflow immediately. This should work for processors that the SP is decremented, and then the value assigned. Processors that have the stack at the next available location would need to initialize the SP to the largest value -4. Positive growing stacks would just need to choose a location. Some processors might need the highest part of memory for other things, although I would think the Pcode unit tests would never care about the highest addresses.

A comment in the emulation for the RISCV would be good to note that the SP will start at zero and immediately wrap to higher memory. Alternatively setting the SP to a value of zero and commenting it to note that it has been thought about might be good practice. The setting of the SP was removed from all the processors to simplify them.

; // SP starts at 0x0, and will immediately wrap to highest address at end of memory

To set it, you could use: testRunner.setRegister("SP", 0x0L)

emteere commented 5 years ago

I don't see an assigment to FS0 in your pcode for FLD.

Is the sub-constructor frdD somehow exporting a temp variable, and not the actual FS0? I see it is being read in Pcode into Ua0, and then the result assigned into Ua0.

mumbel commented 5 years ago

@emteere Feel a bit silly after reading the comment in the sub-constructor #TODO dest may be bad. Once again really like this test setup and thanks for the help

mumbel commented 5 years ago

looks like 100% for the general 64-bit processor (rv64g) and with compressed extension (rv64g) and doesn't look like any more false positives. I ended up having quite a few once I fixed that TODO.

mumbel commented 5 years ago

Have you tried the Eclipse Sleigh Editor yet?

@emteere wish i had been able to get that built, but never was able to open any files. Briefly looked at it with the 9.1 BETA, will try to use it now as I'm working on RISC-V pcodetest'ing. Looks pretty nice from a quick glance. Really liked the errors/warnings icon/highlighting that it provides

jstaursky commented 4 years ago

@mumbel how did you fix

Plug-in ghidra.xtext.sleigh.ui was unable to load class ghidra.xtext.sleigh.ui.SleighExecutableExtensionFactory ?

mumbel commented 4 years ago

Sorry, honestly don't remember. Very likely I never did figure it out and just loaded from the 9.1.2 release instead of building on my own.

jstaursky commented 4 years ago

Damn okay thanks anyway.

emteere commented 4 years ago

@jstaursky, can you describe the issue with the SleighEditor you are having, possibly opening another ticket?

The build can be tricky, and sometimes Eclipse / XTEXT does not cooperate It is most likely something in the plugin config when you run out of Eclipse, or build the plugin to add to Eclipse