llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

DIStringType does not have a way to represent underlying character type. #95440

Open abidh opened 3 weeks ago

abidh commented 3 weeks ago

While adding support for the character type in flang-new, I noticed that DIStringType has no field to represent the underlying character type of a string, so it can only represent strings of the default character type.

Please note that DWARF5 allows DW_AT_type on DW_TAG_string_type. Section 5.11 says, "A string type entry may have a DW_AT_type attribute describing how each character is encoded and is to be interpreted. The value of this attribute is a reference to a DW_TAG_base_type base type entry. If the attribute is absent, then the character is encoded using the system default."
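
To illustrate the rule quoted above, here is a minimal Python sketch of how a DWARF consumer might choose the character type for a string type entry. The classes `BaseType` and `StringType`, the helper names, and the choice of a 1-byte ASCII character as the "system default" are assumptions made for illustration; they are not LLVM or DWARF library APIs, and a real DW_TAG_string_type carries more attributes than shown.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical, simplified stand-ins for DWARF entries (not a real library API).
@dataclass(frozen=True)
class BaseType:
    name: str       # DW_AT_name
    encoding: str   # DW_AT_encoding, e.g. "DW_ATE_ASCII" or "DW_ATE_UCS"
    byte_size: int  # DW_AT_byte_size of a single character


@dataclass(frozen=True)
class StringType:
    name: str
    char_count: int                       # length in characters (a constant here for simplicity)
    char_type: Optional[BaseType] = None  # DW_AT_type; None models an absent attribute


# Assumption for this sketch: the "system default" is a 1-byte ASCII character.
SYSTEM_DEFAULT_CHAR = BaseType("character", "DW_ATE_ASCII", 1)


def resolve_char_type(s: StringType) -> BaseType:
    """Use DW_AT_type when present, otherwise fall back to the system default."""
    return s.char_type if s.char_type is not None else SYSTEM_DEFAULT_CHAR


def storage_bytes(s: StringType) -> int:
    """Total storage implied by the character count and the resolved character size."""
    return s.char_count * resolve_char_type(s).byte_size


# A default-kind string and a 4-byte (UCS-4) string of the same length.
default_str = StringType("character(len=10)", 10)
wide_str = StringType(
    "character(kind=4, len=10)", 10, BaseType("character(kind=4)", "DW_ATE_UCS", 4)
)
```

Without a DW_AT_type reference, `default_str` and `wide_str` would be indistinguishable to the consumer, which is the limitation described in this issue.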

klausler commented 3 weeks ago

How are wide character strings in C++ encoded?

abidh commented 3 weeks ago

> How are wide character strings in C++ encoded?

C++ represents its strings as arrays of characters, so it has more flexibility in describing the underlying character type. For Fortran strings, DWARF has DW_TAG_string_type. This tag allows DW_AT_type for the underlying character type, but that was missed when initial support for DIStringType was added to LLVM.

It is also possible to use the array-of-characters approach for Fortran. gfortran does this, but I think the DW_TAG_string_type approach will result in more concise DWARF.
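
As a rough illustration of the conciseness argument, the Python sketch below models both representations as small trees of debug entries and counts the entries each one needs. The `Die` class, the builder functions, and the exact attribute sets are simplifying assumptions for this comparison; actual gfortran or flang output may contain additional attributes or entries.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# Hypothetical minimal model of a debugging information entry (DIE).
@dataclass
class Die:
    tag: str
    attrs: Dict[str, object] = field(default_factory=dict)
    children: List["Die"] = field(default_factory=list)


def count_dies(die: Die) -> int:
    """Count this entry plus all of its children (referenced types are not counted)."""
    return 1 + sum(count_dies(child) for child in die.children)


def array_of_chars(char_type: Die, length: int) -> Die:
    """Array approach: an array type entry with a subrange child giving the bounds."""
    bounds = Die("DW_TAG_subrange_type", {"DW_AT_upper_bound": length})
    return Die("DW_TAG_array_type", {"DW_AT_type": char_type}, [bounds])


def string_type(char_type: Die, length: int) -> Die:
    """String approach: a single entry that references the character type directly."""
    size = length * char_type.attrs["DW_AT_byte_size"]
    return Die("DW_TAG_string_type", {"DW_AT_type": char_type, "DW_AT_byte_size": size})


# The shared character base type is needed by both approaches, so it is left out of the counts.
ucs4 = Die(
    "DW_TAG_base_type",
    {"DW_AT_name": "character(kind=4)", "DW_AT_encoding": "DW_ATE_UCS", "DW_AT_byte_size": 4},
)

array_form = array_of_chars(ucs4, 10)
string_form = string_type(ucs4, 10)
```

In this simplified model the array form needs two entries per string type (the array and its subrange), while the string form needs one, with the character base type shared in both cases.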