Open adinn opened 3 years ago
Hi @adinn Looks promising. I will provide some info on how to deal with isolates in DWARF debuginfo.
@olpaw Thanks for any tips you can provide. It's not the isolates per se that are the problem but the narrow (heap-base relative) pointer. I believe a similar problem will arise in an image which omits isolates but uses compressed pointers.
I did try to define a narrow pointer type and configure narrow pointer to pointer conversion using a data_location info element in the DWARF type but gdb did not seem to bite on any of the many different ways I attempted to implement it.
Following feedback from people who have experience of the EE product's debug support I have decided to revise the Java -> C++ mapping so it conforms with the one used by EE. That means the Java names will be used for the layout types and Java types will appear in the DWARF model as pointers to those layout types. So, for example, gdb will see a C++ class named java.lang.String
and the signature of the concat method will be
java.lang.String *concat(java.lang.String *)
I think this is still going to be clear to any Java user who can drive gdb. It also means that method and field names will have the expected form, e.g. Hello::main instead of _Hello.main.
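As a rough illustration of the revised naming scheme described above, the following Python sketch builds the names gdb would show from Java class and method names. The helper names (`gdb_pointer_type`, `gdb_method_name`, `gdb_signature`) are hypothetical and exist only for this example; the real mapping is produced by the debug info generator, not by code like this.

```python
def gdb_pointer_type(java_class: str) -> str:
    """Java reference types appear as pointers to the layout type (assumed format)."""
    return f"{java_class} *"


def gdb_method_name(java_class: str, method: str) -> str:
    """Methods are qualified C++-style, e.g. Hello::main."""
    return f"{java_class}::{method}"


def gdb_signature(return_class: str, method: str, param_classes: list) -> str:
    """Render a method signature using pointer types for every Java reference."""
    params = ", ".join(gdb_pointer_type(p) for p in param_classes)
    return f"{gdb_pointer_type(return_class)}{method}({params})"


print(gdb_method_name("Hello", "main"))
print(gdb_signature("java.lang.String", "concat", ["java.lang.String"]))
```

The second call reproduces the concat signature quoted earlier in this comment.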
I have already modified the prototype implementation accordingly and I am about to update the description text to match.
To represent Java object/array refs we use DW_TAG_pointer_type to DW_TAG_structure_type DWARF types, where DW_TAG_pointer_type has a different DW_AT_byte_size depending on whether or not we have a compressed ref. In addition, the DW_TAG_structure_type has a DW_AT_data_location that holds a DWARF expression (a small DWARF stack-machine program) that can vary depending on what ref type we want to represent.
For regular refs to objects we just have a dwarf expression that masks out bits that are used by the GC:
If we have base-relative refs we have to use:
Refs to DynamicHubs need special handling as they always reside in the image-heap (effectively read-only). They do not have the GC bits. All in all it's delicate dwarf trickery that is needed to make this work. To see the details please look at the dwarf expressions we generate with -g in EE. Also when I implemented this I debugged the dwarf-expression interpreter bits of GDB with GDB. It helped a lot to understand what is actually computed by a given dwarf expression.
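To make the idea of a data-location expression more concrete, here is a toy Python model of a DWARF stack machine that evaluates two illustrative programs: one that masks out low bits and one that adds a base register. The op names mirror real DWARF operations, but the programs, the mask value, and the heap base are assumptions made for this sketch. They are not the expressions EE generates.

```python
def eval_dwarf_expr(program, object_address, registers):
    """Evaluate a tiny subset of DWARF stack operations; return the top of stack."""
    stack = []
    for op, *args in program:
        if op == "DW_OP_push_object_address":
            stack.append(object_address)
        elif op == "DW_OP_constu":
            stack.append(args[0])
        elif op == "DW_OP_bregx":
            reg, offset = args
            stack.append(registers[reg] + offset)
        elif op == "DW_OP_and":
            b, a = stack.pop(), stack.pop()
            stack.append(a & b)
        elif op == "DW_OP_plus":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        else:
            raise ValueError(f"unsupported op: {op}")
    return stack[-1]


# Assumption for illustration: the GC uses the low 3 bits of a reference.
GC_BITS_MASK = 0xFFFF_FFFF_FFFF_FFF8
# Hypothetical heap base, held in DWARF register 14 (r14 on x86_64).
HEAP_BASE = 0x7F00_0000_0000

mask_program = [
    ("DW_OP_push_object_address",),
    ("DW_OP_constu", GC_BITS_MASK),
    ("DW_OP_and",),
]
base_relative_program = [
    ("DW_OP_push_object_address",),
    ("DW_OP_bregx", 14, 0),
    ("DW_OP_plus",),
]

masked = eval_dwarf_expr(mask_program, 0x1005, {})
rebased = eval_dwarf_expr(base_relative_program, 0x40, {14: HEAP_BASE})
print(hex(masked), hex(rebased))
```

Stepping through a model like this by hand is a cheap way to check what a given expression computes before debugging the interpreter inside GDB itself.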
Hi Paul,
Thanks for the advice. I'll try to work that into my current implementation.
Can you explain why you use DWARF structure_type rather than class_type info records? Is that simply to avoid gdb naming methods using classname::methodname syntax? Or is there another reason?
n.b. I have deliberately not dumped and inspected the EE DWARF output for two reasons.
So, much as I appreciate your invitation, until I can formally (i.e. legally) clarify that latter point and be sure not to risk prejudicing the former goal I'm going to have to continue to operate only on openly released information such as public docs or your advice in this channel.
Can you explain why you use DWARF structure_type rather than class_type info records?
The decision to go with DW_TAG_pointer_type to DW_TAG_structure_type was made long before I had anything to do with debuginfo support. I guess back then it was unclear whether class_type was a good fit for describing Java (i.e. not C++) classes, or whether GDB would interpret class_type exclusively in C++ terms. From today's perspective, going with DW_TAG_class_type seems like a good idea.
Upgrade Debug Info Model with Type Info
This issue proposes an upgrade to the debug info feature to implement some of the capabilities proposed in the original debug info feature request (#1917) but not yet implemented. A prototype implementation exists for gdb (PR to follow) and work is under way to implement equivalent functionality for Windows.
Proposed additions
Issue #1917 specified several desirable aspects of debug info support that were (deliberately) omitted from the original implementation. This issue addresses the following missing aspects:
As a corollary to inclusion of information regarding Java object types it should also be possible to embed debug info detailing Java methods and fields in the debug info for the classes to which they belong:
Expected Benefits
This will provide a much richer debugging experience for anyone wishing to debug a GraalVM native image. With type info included in the debug info output it should be possible to perform the following functions:
Examples of proposed usage in gdb
Using gdb on Linux these debugger capabilities are supported by the prototype implementation as follows:
1) Cast values to specific types
2) Describe types using the ptype command
3) Print instances field by field using the print command
4) Traverse the object network using path expressions
5) Refer to static field data by name
Note that this implementation relies on a mapping of Java types to an underlying C++ type model that is similar enough to allow the program to be debugged (see next subsection). It should be possible to provide many of the same capabilities on Windows by basing the PECOFF debug info on a comparable mapping or by relying on support for Java debugging in the target Windows debugger.
Note also that automatic resolution of program names to associated program values is only proposed for static field names (and method names). Adding location info to allow resolution of parameter and local var names to their current live values is deferred until the next phase of debug info enhancement.
Constraints on the generated DWARF info model
The above examples show the need for the Linux implementation to work around the fact that gdb does not currently provide support for debugging Java per se. In the long run a suitable DWARF info model for Java needs to be defined with support added to gdb. Pending such a full solution this problem has been resolved by mapping the Java class base to an equivalent C++ model i.e. it fakes what it can.
So,
Problem when using Isolates
The current prototype relies on object references embedded in object fields actually being stored as pointers. This allows gdb to read the field value directly as the address of the linked object. This is the correct model if -H:-SpawnIsolates is specified on the command line. With the default setting, -H:+SpawnIsolates, field references are actually offsets from the heap register ($r14 on Linux/x86_64). The prototype implementation has not yet identified a means for such base-relative pointers to be described to gdb, allowing them to be transformed from offsets to addresses. This problem is still under investigation.
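The difference between the two settings can be shown with a small simulation. This is only a sketch of the problem: the `memory` dict, the addresses, and the function names are hypothetical and stand in for the debuggee's heap and for what the debugger would need to do.

```python
# Hypothetical heap base, i.e. the value the heap register would hold.
HEAP_BASE = 0x10000

# Simulated debuggee memory: absolute address -> object stored there.
memory = {HEAP_BASE + 0x20: "linked object"}


def follow_direct(field_value):
    """Treat the field value as an absolute address (the -H:-SpawnIsolates model)."""
    return memory.get(field_value)


def follow_base_relative(field_value, heap_base):
    """Treat the field value as an offset from the heap base (the default model)."""
    return memory.get(heap_base + field_value)


# Without isolates the field holds the full address, so a direct read works.
print(follow_direct(HEAP_BASE + 0x20))
# With isolates the field holds only the offset; a direct read finds nothing.
print(follow_direct(0x20))
# Adding the heap base recovers the object.
print(follow_base_relative(0x20, HEAP_BASE))
```

The open question in this issue is how to express the last step in DWARF so that gdb performs it automatically.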