The architecture was refactored in the following ways:
Class Regions adds the initial segments as Region objects using the ImageObject class instead of the ImageHeader class. This way, Region objects can hold a pointer to their parent ImageObject and perform operations related to image data internally, which reduces complexity in upper-layer classes such as Analyzer.
Class DisInfo configures the machine/architecture class required for disassembling from the Insn objects passed to it. This way, DisInfo objects do not need to provide interfaces to set or get bitness information.
Class Insn receives a pointer to its parent ImageObject via its constructor. The ImageObject pointer can come from the parent Region or from an externally looked-up ImageObject. Because the life cycle of Insn objects overlaps with Region split-up operations, the only truly safe parent is the ImageObject, which is essentially read-only. Since Insn objects can query bitness information internally from their parent ImageObject, and DisInfo can in turn query Insn objects for it, upper-layer callers can remain oblivious to bitness most of the time. Later on, Insn objects could also introduce CS, DS, etc. register definitions to construct virtual addresses from linear or segmented offset addresses internally (the latter is important for 16-bit code segments), without exposing these details to upper-layer callers.
As some of the classes/structs changed considerably (in behavior or interface), it was a good opportunity to reformat the code base using a defined code formatter profile. I used clang-format; the related formatter configuration file is also added to the change set.
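The ownership relations between ImageObject, Region, Insn and DisInfo described in this list can be sketched roughly as follows. This is only an illustration: apart from the class names and base_address(), every member, enum and constructor signature here (Bitness, Machine, image_object(), machine(), and so on) is an assumption and may not match the actual interfaces in the change set.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enums, introduced only for this sketch.
enum class Bitness { Bits16, Bits32 };
enum class Machine { I8086, I386 };

// Essentially read-only description of a loaded image object.
class ImageObject {
public:
    ImageObject(std::uint32_t base_address, Bitness bitness)
        : base_address_(base_address), bitness_(bitness) {}
    std::uint32_t base_address() const { return base_address_; }
    Bitness bitness() const { return bitness_; }

private:
    std::uint32_t base_address_;
    Bitness bitness_;
};

// A Region keeps a pointer to its parent ImageObject, so image-related
// operations can stay inside the Region.
class Region {
public:
    Region(const ImageObject* image, std::uint32_t address, std::uint32_t size)
        : image_(image), address_(address), size_(size) {
        assert(image_ != nullptr);
    }
    const ImageObject* image_object() const { return image_; }
    std::uint32_t address() const { return address_; }
    std::uint32_t size() const { return size_; }

private:
    const ImageObject* image_;
    std::uint32_t address_;
    std::uint32_t size_;
};

// An Insn points at the ImageObject rather than at a Region, because Regions
// may be split while the Insn is still alive.
class Insn {
public:
    explicit Insn(const ImageObject* image) : image_(image) {
        assert(image_ != nullptr);
    }
    Bitness bitness() const { return image_->bitness(); }

private:
    const ImageObject* image_;
};

// DisInfo derives the machine type from the Insn it is given, so it needs
// no separate setter or getter for bitness.
class DisInfo {
public:
    explicit DisInfo(const Insn& insn)
        : machine_(insn.bitness() == Bitness::Bits16 ? Machine::I8086
                                                     : Machine::I386) {}
    Machine machine() const { return machine_; }

private:
    Machine machine_;
};
```

With this shape, a caller can build an Insn from `region.image_object()` and hand it to DisInfo without ever touching bitness itself.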
With this change set, the tool can disassemble 16-bit code segments, and data references pointing into the 16-bit code segment are also identified, as long as DS is correctly assumed to be the same as the containing ImageObject.base_address(). The printer does not yet replace data references within 16-bit code: this feature is commented out on a single line for now, because GCC cannot handle 16-bit data references ("relocation truncated to fit: R_386_16 against .data" error). Data labels are emitted nevertheless.
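As a minimal sketch of the DS assumption mentioned here: if DS is taken to equal the base address of the containing image object, a 16-bit data offset resolves to base address plus offset, and it counts as a reference into the segment only when it falls inside the image. The ImageSpan struct and the resolve_data_ref16 function are hypothetical names used for illustration; they are not part of the tool.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the address range covered by an ImageObject.
struct ImageSpan {
    std::uint32_t base_address;  // assumed to be what DS points at
    std::uint32_t size;          // number of bytes in the image
};

// Resolve a 16-bit DS-relative offset to a linear address.
// Returns true and writes the result to `linear` only if the offset lies
// inside the image; otherwise returns false and leaves `linear` untouched.
bool resolve_data_ref16(const ImageSpan& image, std::uint16_t offset,
                        std::uint32_t& linear) {
    // Precondition: the image range must not wrap around the address space.
    assert(image.size <= 0xFFFFFFFFu - image.base_address);
    if (offset >= image.size) {
        return false;  // points outside the image: not a known data reference
    }
    linear = image.base_address + offset;
    return true;
}
```

For example, with a base address of 0x10000 and a size of 0x400, offset 0x0123 resolves to 0x10123, while offset 0x0400 is rejected as out of range.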