dyninst / dyninst

DyninstAPI: Tools for binary instrumentation, analysis, and modification.
http://www.dyninst.org
GNU Lesser General Public License v2.1

FunctionBase getRanges() method appears to miss the range information for some functions #1060

Open wcohen opened 3 years ago

wcohen commented 3 years ago

Intention: Extract accurate range information for functions and inlined functions in binaries.

Describe the bug: In some cases getRanges() returns no ranges for a function even though the debug info contains range information for it.

To Reproduce: A small program at https://github.com/wcohen/quality_info/tree/wcohen/funcranges calls the getRanges() method recursively on each function and on the inlined functions it contains. The program is in the dyninsttools directory and prints the number of range entries in the vector returned by getRanges(). Some of the vectors are unexpectedly empty. Steps to reproduce:

git clone https://github.com/wcohen/quality_info.git
cd quality_info
git checkout wcohen/funcranges
cd dyninsttools
make
./ranges_sanity ./ranges_sanity |grep " 0" |c++filt

Some of the functions listed are machine code that has no range information, but the output also includes:

expand_inlined_process(Dyninst::SymtabAPI::FunctionBase*) 0

Look at the generated code and debuginfo:

dwarfdump ranges_sanity > ranges_sanity.dwarf
objdump -d ranges_sanity > ranges_sanity.dis

See that there is sane range information in ranges_sanity.dwarf for two elements:

< 1><0x0002c8a1>    DW_TAG_subprogram
                      DW_AT_name                  expand_inlined_process
                      DW_AT_decl_file             0x00000018
                      DW_AT_decl_line             0x0000000e
                      DW_AT_decl_column           0x00000001
                      DW_AT_ranges                0x000002eb

      Offset of rnglists entries: 0x000002eb
      [ 0] start,end             0x00402740 0x00402867
      [ 1] start,end             0x00402310 0x00402320
      [ 2] end of list                                
                      DW_AT_frame_base            len 0x0001: 0x9c: 
                          DW_OP_call_frame_cfa
                      DW_AT_call_all_calls        yes(1)
                      DW_AT_sibling               <0x0002d109>

The rnglists entries match up with expand_inlined_process and the cold code associated with it.

Expected behavior: getRanges() should report two ranges for the expand_inlined_process function rather than none.

wcohen commented 3 years ago

This was using a locally built rpm of the dyninst-11.0.0 release with a couple of minor patches to dyninst. I have kicked off a koji scratch build in case you need to look at the rpm and make sure the problem isn't caused by one of the patches in there:

https://koji.fedoraproject.org/koji/taskinfo?taskID=70634353

hainest commented 3 years ago

@wcohen @sashanicolas @mxz297

The problem is that Dyninst is picking up the .cold version, but not the regular version.

< 1><0x0002d75d>    DW_TAG_subprogram
      DW_AT_name                  expand_inlined_process
      DW_AT_decl_file             0x00000018 .../quality_info/dyninsttools/ranges_sanity.C
      DW_AT_decl_line             0x0000000e
      DW_AT_decl_column           0x00000001
      DW_AT_ranges                0x000008b0

ranges: 3 at .debug_ranges offset 2224 (0x000008b0) (48 bytes)
  [ 0] range entry    0x00003a10 0x00003b77
  [ 1] range entry    0x00003580 0x00003590
  [ 2] range end      0x00000000 0x00000000

[dwarfWalker.C:496] (0x2d746) Asking for sibling
[dwarfWalker.C:363] (0x2d75d) Parsing entry with context size 3, func (N/A), encl (nil), (not sup), mod:ranges_sanity.C, tag: 2e
[dwarfWalker.C:617] (0x2d75d) parseSubprogram entry
[dwarfWalker.C:780] (0x2d75d) Parsing ranges
[dwarfWalker.C:807] Lexical block from 0x3a10 to 0x3b77
[dwarfWalker.C:807] Lexical block from 0x3580 to 0x3590
[dwarfWalker.C:584] (0x2d75d) Lookup by offset 0x3580 identifies _ZL22expand_inlined_processPN7Dyninst9SymtabAPI12FunctionBaseE.cold
[dwarfWalker.C:584] (0x2d75d) Lookup by offset 0x3580 identifies _ZL22expand_inlined_processPN7Dyninst9SymtabAPI12FunctionBaseE.cold

0000000000003580 <_ZL22expand_inlined_processPN7Dyninst9SymtabAPI12FunctionBaseE.cold>:
0000000000003a10 <_ZL22expand_inlined_processPN7Dyninst9SymtabAPI12FunctionBaseE>:

From Will's tool:

_ZL22expand_inlined_processPN7Dyninst9SymtabAPI12FunctionBaseE.cold 2
_ZL22expand_inlined_processPN7Dyninst9SymtabAPI12FunctionBaseE 0
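To make the mismatch concrete, here is a minimal, self-contained sketch of one way to choose which symbol a DIE's ranges belong to when the ranges cover both a hot and a `.cold` part. It is not Dyninst's lookup code. `pickOwner` and `isColdSymbol` are hypothetical names, and the rule "prefer the symbol without a `.cold` component" is an assumed heuristic for illustration only.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: true if the name carries a ".cold" suffix component
// (matches "foo.cold" and "foo.cold.0", but not "foo.coldstart").
bool isColdSymbol(const std::string& name) {
    const std::string tag = ".cold";
    for (std::size_t pos = name.find(tag); pos != std::string::npos;
         pos = name.find(tag, pos + 1)) {
        std::size_t end = pos + tag.size();
        if (end == name.size() || name[end] == '.') return true;
    }
    return false;
}

// One range entry: start and end addresses as printed by dwarfdump.
using Range = std::pair<std::uint64_t, std::uint64_t>;

// Assumed heuristic: look up the symbol starting at each range start and
// prefer the first one that is not a .cold part. A .cold symbol is used only
// as a fallback. Returns an empty string if no range start has a symbol.
std::string pickOwner(const std::map<std::uint64_t, std::string>& symbolsByAddr,
                      const std::vector<Range>& ranges) {
    std::string fallback;
    for (const Range& r : ranges) {
        auto it = symbolsByAddr.find(r.first);
        if (it == symbolsByAddr.end()) continue;  // no symbol starts here
        if (!isColdSymbol(it->second)) return it->second;
        if (fallback.empty()) fallback = it->second;
    }
    return fallback;
}
```

With the two symbols at 0x3580 and 0x3a10 and the two range entries from the dump above, `pickOwner` returns the name without `.cold` regardless of the order in which the ranges are listed, so both ranges would be attributed to the regular function.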

wcohen commented 3 years ago

Is dyninst swapping the debuginfo associated with the function names main and main.cold? It looks like the same could be happening to expand_inlined_process.

The compiler generates labels with various suffixes in addition to .cold (.isra, .constprop, etc.). Dyninst needs to be careful not to mix up which debuginfo goes with which label.
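As a rough illustration of keeping such labels apart, the sketch below splits a symbol name into its base and its dot-separated suffix components, so that `main` and `main.cold` can be recognized as parts of the same source function while still being treated as distinct labels. It assumes that compiler-added suffixes always follow the first `.` in the name, which holds for the names quoted in this thread but not for names that begin with a dot. `SplitSymbol`, `splitSymbol`, `sameSourceFunction`, and `isColdPart` are hypothetical names, not Dyninst API.

```cpp
#include <string>
#include <vector>

// Hypothetical result type: the name up to the first '.', plus each
// dot-separated component after it (e.g. {"isra", "0", "cold"}).
struct SplitSymbol {
    std::string base;
    std::vector<std::string> suffixes;
};

// Split at every '.', assuming everything after the first '.' was appended
// by the compiler (an assumption, see the lead-in).
SplitSymbol splitSymbol(const std::string& name) {
    SplitSymbol out;
    std::size_t dot = name.find('.');
    out.base = name.substr(0, dot);  // whole name when there is no '.'
    while (dot != std::string::npos) {
        std::size_t next = name.find('.', dot + 1);
        std::size_t len = (next == std::string::npos) ? std::string::npos
                                                      : next - dot - 1;
        out.suffixes.push_back(name.substr(dot + 1, len));
        dot = next;
    }
    return out;
}

// Two labels belong to the same source function if their bases match.
bool sameSourceFunction(const std::string& a, const std::string& b) {
    return splitSymbol(a).base == splitSymbol(b).base;
}

// True if any suffix component is exactly "cold".
bool isColdPart(const std::string& name) {
    for (const std::string& s : splitSymbol(name).suffixes) {
        if (s == "cold") return true;
    }
    return false;
}
```

A caller could group symbols by `base` and then decide per group which label the debuginfo for each address range belongs to, instead of letting the last address lookup decide for the whole function.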