gimli-rs / gimli

A library for reading and writing the DWARF debugging format
https://docs.rs/gimli/
Apache License 2.0

DW_AT_frame_base wrong calculation #675

Closed godzie44 closed 11 months ago

godzie44 commented 1 year ago

First of all, thanks for the awesome library. I ran into a problem while trying to read function argument data from libc. The raw DWARF looks like this:

< 1><0x000010a3>    DW_TAG_subprogram
                      DW_AT_external              yes(1)
                      DW_AT_name                  __clock_nanosleep
                      DW_AT_decl_file             0x00000001 ./time/../sysdeps/unix/sysv/linux/clock_nanosleep.c
                      DW_AT_decl_line             0x0000003c
                      DW_AT_decl_column           0x00000001
                      DW_AT_linkage_name          __GI___clock_nanosleep
                      DW_AT_prototyped            yes(1)
                      DW_AT_type                  <0x0000007c>
                      DW_AT_low_pc                0x000e57a0
                      DW_AT_high_pc               <offset-from-lowpc> 478 <highpc: 0x000e597e>
                      DW_AT_frame_base            len 0x0001: 0x9c: 
                          DW_OP_call_frame_cfa
                      DW_AT_call_all_calls        yes(1)
                      DW_AT_sibling               <0x00001659>
< 2><0x000010c9>      DW_TAG_formal_parameter
                        DW_AT_name                  clock_id
                        DW_AT_decl_file             0x00000001
                        DW_AT_decl_line             0x0000003c
                        DW_AT_decl_column           0x00000025
                        DW_AT_type                  <0x00000101>
                        DW_AT_location              0x000d3a73
      .debug_loclists offset  : 0x000d3a73
      <debug_loclists offset 0x000d3a73 with 8 entries follows>
   [ 0]<start,end            0x000e57a0 0x000e57de>
                            DW_OP_reg5
   [ 1]<start,end            0x000e57de 0x000e57fd>
                            DW_OP_reg5
   [ 2]<start,end            0x000e5832 0x000e583f>
                            DW_OP_reg5
   [ 3]<start,end            0x000e583f 0x000e5841>
                            DW_OP_entry_value 0x00000001 contents 0x55
                            DW_OP_stack_value
   [ 4]<start,end            0x000e5841 0x000e5850>
                            DW_OP_reg5
   [ 5]<start,end            0x000e5850 0x000e5875>
                            DW_OP_fbreg -116
   [ 6]<start,end            0x000e58c7 0x000e58db>
                            DW_OP_reg5
   [ 7]<end-of-list>

I am trying to get the value of the argument clock_id. Step by step:

1) My current pc is 0xE5868, so to get the value of clock_id we need to evaluate this expression: DW_OP_fbreg -116.
2) We need a frame base, which comes from the expression above: DW_AT_frame_base len 0x0001: 0x9c: DW_OP_call_frame_cfa.
3) Next, my program calculates the CFA and calls [Evaluation::resume_with_frame_base].
4) On the next evaluation iteration, [Evaluation::evaluate] returns EvaluationResult::Complete (still OK).
5) [Evaluation::result] returns a vec of Piece (with a single Piece).
6) At this step I get a wrong result: piece.location contains a [Location::Address], but a [Location::Value] is expected.

I'm sure that the result must be a [Location::Value] because I checked the results (in other words, I subtracted 116 from the CFA and got the correct value of the variable).

I looked at the sources (https://github.com/gimli-rs/gimli/blob/master/src/read/op.rs#L1977C14-L1977C14) and have a question: why is the value interpreted as an address in this situation? Maybe it would be better to add a new enum variant for such a case (for example [Location::Raw])?

philipc commented 1 year ago

I think you are misunderstanding something somewhere.

DW_AT_location is a location description, and this usually evaluates to an address or register, not a value. There are some opcodes that can result in a value (such as DW_OP_implicit_value), but they are not being used in this case.

The expression DW_OP_fbreg -116 will always result in a pointer to a location on the stack. The clockid parameter to clock_nanosleep has the type clockid_t, which is an integer type, not a pointer, so it can never be equal to the result of the expression DW_OP_fbreg -116.
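To make the distinction concrete, here is a minimal sketch of what DW_OP_fbreg -116 means for a debugger: the expression yields a stack address, and the parameter's value only appears after reading memory at that address. This does not use gimli; `Memory`, `fbreg_address`, and `read_i32` are hypothetical names for illustration, and the sketch assumes clockid_t is a 4-byte little-endian integer.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for target memory: byte address -> byte value.
type Memory = HashMap<u64, u8>;

/// DW_OP_fbreg N: the result is the address `frame_base + N`, where N is signed.
fn fbreg_address(frame_base: u64, offset: i64) -> u64 {
    // Casting a negative i64 to u64 and adding with wraparound subtracts |N|.
    frame_base.wrapping_add(offset as u64)
}

/// Read a little-endian i32 at `addr` (assumed size and byte order of clockid_t).
/// Returns None if any of the four bytes is not present in `mem`.
fn read_i32(mem: &Memory, addr: u64) -> Option<i32> {
    let mut bytes = [0u8; 4];
    for (i, b) in bytes.iter_mut().enumerate() {
        *b = *mem.get(&addr.wrapping_add(i as u64))?;
    }
    Some(i32::from_le_bytes(bytes))
}
```

With the frame base equal to the CFA, `fbreg_address(cfa, -116)` gives the address of the stack slot, and `read_i32` on that address gives the clock_id value stored there.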

godzie44 commented 1 year ago

I oversimplified that part, sorry. Yes, I meant that the address DW_OP_fbreg -116 points to the place on the stack where the correct clock_id is located. Anyway, the problem is that this expression

                      DW_AT_frame_base            len 0x0001: 0x9c: 
                          DW_OP_call_frame_cfa

is interpreted by gimli's [Evaluation] as "frame base" = *(cfa). But actually "frame base" = cfa.

philipc commented 1 year ago

Evaluating DW_OP_call_frame_cfa will give you a Location::Address. You should pass that address directly to Evaluation::resume_with_frame_base without dereferencing it.

godzie44 commented 1 year ago

Do you mean that I should interpret Piece::location depending on what value I expect to receive? I thought that Piece::location answered the question "how do I get the value" (and that if it is a Location::Address I must dereference it)...

philipc commented 1 year ago

Yes. DWARF expressions can be used to compute a value or specify a location. gimli doesn't distinguish between these cases and treats every expression as a location description. You need to distinguish between these yourself based on the attribute (such as DW_AT_frame_base). This isn't ideal, but it works. For location descriptions, the result at the top of the stack is placed in Location::Address, so you can use this as the computed value for other expressions. We could look at adding something that handles this logic for you.

godzie44 commented 1 year ago

Thanks for the explanation.

"We could look at adding something that handles this logic for you."

That would be great. It looks like this is not very convenient when implementing a "generic" evaluator.

philipc commented 1 year ago

Looking into this further, the standard says that DW_AT_frame_base is a location description, so any change we make still wouldn't affect that. The correct solution here is to simply interpret the location based on context.

From section 3.3.5 of the DWARF 5 standard: A subroutine or entry point entry may also have a DW_AT_frame_base attribute, whose value is a location description that describes the “frame base” for the subroutine or entry point. If the location description is a simple register location description, the given register contains the frame base address. If the location description is a DWARF expression, the result of evaluating that expression is the frame base address. Finally, for a location list, this interpretation applies to each location description contained in the list of location list entries.
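The context-dependent interpretation discussed in this thread can be sketched as follows. This is a simplified model, not gimli's API: `Loc` is a hypothetical two-variant stand-in for gimli's `Location`, and the `read_register` and `read_memory` callbacks are assumed to be supplied by the debugger. The same evaluation result is consumed differently depending on which attribute produced it.

```rust
/// Simplified, hypothetical model of an evaluation result. gimli's real
/// `Location` enum has more variants and different payload types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Loc {
    Address(u64),
    Register(u16),
}

/// Interpret the result of a DW_AT_frame_base expression.
/// Per DWARF 5 section 3.3.5: a register location means the register holds
/// the frame base; otherwise the expression result itself is the frame base.
fn frame_base(loc: Loc, read_register: impl Fn(u16) -> Option<u64>) -> Option<u64> {
    match loc {
        // No dereference: the computed address *is* the frame base.
        Loc::Address(addr) => Some(addr),
        Loc::Register(reg) => read_register(reg),
    }
}

/// Interpret the result of a variable's DW_AT_location expression.
/// Here an address names the storage, so the value must be read from memory.
fn variable_value(
    loc: Loc,
    read_register: impl Fn(u16) -> Option<u64>,
    read_memory: impl Fn(u64) -> Option<u64>,
) -> Option<u64> {
    match loc {
        Loc::Address(addr) => read_memory(addr),
        Loc::Register(reg) => read_register(reg),
    }
}
```

For the case in this issue, the DW_OP_call_frame_cfa result would go through `frame_base` and be passed on unchanged, while the DW_OP_fbreg -116 result would go through `variable_value` and be dereferenced.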