Vector35 / binaryninja-api

Public API, examples, documentation and issues for Binary Ninja
https://binary.ninja/
MIT License
900 stars 203 forks source link

IL String Rendering Ignores Types for Read-Only Sections #2969

Open 0bs3n opened 2 years ago

0bs3n commented 2 years ago

Version and Platform (required):

Bug Description: Ascii string rendering in IL ignores type information for data variables in ReadOnlyDataSectionSematics sections, or a segment with read but not write segment flags if no sections are defined. Any reference to an address which contains a valid ascii string, regardless of whether or not that address has been identified to be of type const char[]/char[]/const char /char , or of some other non string type such as an int, or a struct (with or without a char[] as its first member), will result in the rendering of the string in the IL. See image below for an example. This behavior is primarily a problem with structures in a readOnly section which happen to have a char[] (or otherwise valid ascii) for their first field.

It's also clear that binary ninja does know the correct type for that address, as demonstrated in the image: the struct member accesses are displayed as expected.

This appears to be related to a similar issue, where string rendering is completely disabled when the addresses being referenced reside in a read-write section at the time the view is initialized and loaded. I am still tracking that down though, and will file a separate issue.

Steps To Reproduce: Please provide all steps required to reproduce the behavior:

  1. Open a binary and ensure that there is a section containing ascii strings mapped with ReadOnlyDataSectionSemantics (ELF's .rodata is fine)
  2. Locate a reference to an address of one of these strings, such that the IL is rendering the string
  3. Change the type of the string var to something else, for example an int or struct
  4. Observe that the string is still rendered in the IL

Expected Behavior: Strings should not be rendered in IL if the types of the underlying variable at the "string" address has been explicitly set to not be a string. I think the behavior to default to string rendering when it seems plausible is fine and in fact very useful, but it needs to be overrideable in cases where it was incorrect (for example in the case of the struct with a char[] as its first member).

Screenshots: Correct IL, displayed only when world resides in a read/write section 20220215_09h35m14s_grim Incorrect IL, displayed when world resides in a read-only section 20220215_09h35m53s_grim Definition of the data variable world 20220215_09h36m26s_grim Manually redefinition of world to explicitly not have a char[] as it's first member, no change in IL behavior 20220215_09h36m45s_grim

Additional Information: I'd like to share an example in the form of an x86 ELF bndb file, where I've manually set the .data section to have ReadOnlyDataSectionSemantics for the sake of demonastration; see the first few lines of the main function, just after the call to srand.

zu3st commented 2 years ago

I just encountered a similar problem in connection with a referenced function. The called function prologue contains four pushes, all of which were assembled as valid ASCII. The rendering even ignores any explicitly set symbol name.

My specific problem seems to have been introduced in the development branch, it affects both the newest (3.1.3701-dev) and oldest (3.1.3690-dev) available dev version.

The rendering issue is occurring in MLIL, HLIL and Pseudo-C views.

Expected behavior (3.1.3469):

rendering_expected

Faulty behavior (3.1.3690-dev - 3.1.3701-dev):

rendering_faulty

The respective binary is available on GitHub.