Open Quuxplusone opened 4 years ago
Bugzilla Link | PR44775 |
Status | CONFIRMED |
Importance | P normal |
Reported by | H.J. Lu (hjl.tools@gmail.com) |
Reported on | 2020-02-04 11:10:17 -0800 |
Last modified on | 2020-02-05 10:41:59 -0800 |
Version | 9.0 |
Hardware | PC Linux |
CC | bd1976llvm@gmail.com, htmldeveloper@gmail.com, i@maskray.me, llvm-bugs@lists.llvm.org, smithp352@googlemail.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also |
cat > a.s <<e
.section .text,"ax",%progbits
.byte 0
.section .text,"awx",%progbits
.byte 1
.section .text,"awx",%progbits
.byte 2
e
% as a.s
a.s: Assembler messages:
a.s:3: Warning: ignoring changed section attributes for .text
a.s:5: Warning: ignoring changed section attributes for .text
I think llvm-mc should emit a warning.
For
.section .foo,"ao",%progbits,yyy
.dc.a 0
.section .foo,"ao",%progbits,xxx
.dc.a 0
gas/config/obj-elf.c:obj_elf_change_section should also emit a warning that
"o"+xxx is ignored.
.section .text,"axG",%progbits,foo,comdat
foo:
.byte 0
.section .text,"axG",%progbits,bar,comdat
bar:
.byte 1
Both GNU as and llvm-mc create two ".text" sections.
Both section name and group name are considered.
static bfd_boolean
get_section (bfd *abfd ATTRIBUTE_UNUSED, asection *sec, void *inf)
{
struct elf_section_match *match = (struct elf_section_match *) inf;
const char *gname = match->group_name;
const char *group_name = elf_group_name (sec);
unsigned int info = elf_section_data (sec)->this_hdr.sh_info;
return (info == match->info
&& ((bfd_section_flags (sec) & SEC_ASSEMBLER_SECTION_ID)
== (match->flags & SEC_ASSEMBLER_SECTION_ID))
&& sec->section_id == match->section_id
&& (group_name == gname
|| (group_name != NULL
&& gname != NULL
&& strcmp (group_name, gname) == 0)));
}
It seems llvm-mc should take sh_link into consideration as well. I will work on
a patch.
(In reply to Fangrui Song from comment #2)
> .section .text,"axG",%progbits,foo,comdat
> foo:
> .byte 0
> .section .text,"axG",%progbits,bar,comdat
> bar:
> .byte 1
>
> Both GNU as and llvm-mc create two ".text" sections.
> Both section name and group name are considered.
>
> static bfd_boolean
> get_section (bfd *abfd ATTRIBUTE_UNUSED, asection *sec, void *inf)
> {
> struct elf_section_match *match = (struct elf_section_match *) inf;
> const char *gname = match->group_name;
> const char *group_name = elf_group_name (sec);
> unsigned int info = elf_section_data (sec)->this_hdr.sh_info;
>
> return (info == match->info
> && ((bfd_section_flags (sec) & SEC_ASSEMBLER_SECTION_ID)
> == (match->flags & SEC_ASSEMBLER_SECTION_ID))
> && sec->section_id == match->section_id
> && (group_name == gname
> || (group_name != NULL
> && gname != NULL
> && strcmp (group_name, gname) == 0)));
> }
>
> It seems llvm-mc should take sh_link into consideration as well. I will work
> on a patch.
My patch gas changed it to
static bfd_boolean
get_section (bfd *abfd ATTRIBUTE_UNUSED, asection *sec, void *inf)
{
struct elf_section_match *match = (struct elf_section_match *) inf;
const char *gname = match->group_name;
const char *group_name = elf_group_name (sec);
const char *linked_to_symbol_name
= sec->map_head.linked_to_symbol_name;
unsigned int info = elf_section_data (sec)->this_hdr.sh_info;
return (info == match->info
&& ((bfd_section_flags (sec) & SEC_ASSEMBLER_SECTION_ID)
== (match->flags & SEC_ASSEMBLER_SECTION_ID))
&& sec->section_id == match->section_id
&& (group_name == gname
|| (group_name != NULL
&& gname != NULL
&& strcmp (group_name, gname) == 0))
&& (linked_to_symbol_name == match->linked_to_symbol_name
|| (linked_to_symbol_name != NULL
&& match->linked_to_symbol_name != NULL
&& strcmp (linked_to_symbol_name,
match->linked_to_symbol_name) == 0)));
}
(In reply to Fangrui Song from comment #1)
>
> For
> .section .foo,"ao",%progbits,yyy
> .dc.a 0
> .section .foo,"ao",%progbits,xxx
> .dc.a 0
>
> gas/config/obj-elf.c:obj_elf_change_section should also emit a warning that
> "o"+xxx is ignored.
It is an error for gas:
[hjl@gnu-cfl-1 tmp]$ cat b.s
.section .foo,"ao",%progbits,yyy
.dc.a 0
.section .foo,"ao",%progbits,xxx
.dc.a 0
[hjl@gnu-cfl-1 tmp]$ as b.s
b.s: Assembler messages:
b.s: Error: undefined linked-to symbol `yyy' on section `.foo'
b.s: Error: undefined linked-to symbol `xxx' on section `.foo'
[hjl@gnu-cfl-1 tmp]$
Speaking too quickly...
SHF_LINK_ORDER + symbol is a bit different from SHF_GROUP+signature name.
A group signature is a string literal. It can easily used as a key
(differentiator). The symbol used by "o" needs resolving. For the example
below, shall we consider the second .section the same section?
.data
foo:
bar:
.section .text,"axo",%progbits,foo
.section .text,"axo",%progbits,bar
My current feeling is that we should use explicit syntax to make it clear that
the sections are different.
.section .text,"axo",%progbits,foo,unique,0
.section .text,"axo",%progbits,bar,unique,1
(In reply to Fangrui Song from comment #5)
> Speaking too quickly...
>
> SHF_LINK_ORDER + symbol is a bit different from SHF_GROUP+signature name.
>
> A group signature is a string literal. It can easily used as a key
> (differentiator). The symbol used by "o" needs resolving. For the example
> below, shall we consider the second .section the same section?
>
>
> .data
> foo:
> bar:
> .section .text,"axo",%progbits,foo
> .section .text,"axo",%progbits,bar
These may be 2 different sections, depending on where foo and bar
are defined. Gas generates 2 different sections. Linker can handle it
correctly.
> My current feeling is that we should use explicit syntax to make it clear
> that the sections are different.
>
> .section .text,"axo",%progbits,foo,unique,0
> .section .text,"axo",%progbits,bar,unique,1
This isn't required.
From https://sourceware.org/ml/binutils/2020-02/msg00028.html
- && strcmp (group_name, gname) == 0)));
+ && strcmp (group_name, gname) == 0))
+ && (linked_to_symbol_name == match->linked_to_symbol_name
+ || (linked_to_symbol_name != NULL
+ && match->linked_to_symbol_name != NULL
+ && strcmp (linked_to_symbol_name,
+ match->linked_to_symbol_name) == 0)));
It seems the patch just considered the symbol (linked_to_symbol) name, not the
referenced section.
> .data
> foo:
> bar:
> .section .text,"axo",%progbits,foo
> .section .text,"axo",%progbits,bar
So in the example, the two .section directives define different sections. I
think the semantic is fine.
> These may be 2 different sections, depending on where foo and bar
are defined. Gas generates 2 different sections.
I hope we can change "may" to "must"...
Different symbols -> different sections, even if their section is the same.
Same symbols -> same section
(In reply to Fangrui Song from comment #7)
> From https://sourceware.org/ml/binutils/2020-02/msg00028.html
>
> - && strcmp (group_name, gname) == 0)));
> + && strcmp (group_name, gname) == 0))
> + && (linked_to_symbol_name == match->linked_to_symbol_name
> + || (linked_to_symbol_name != NULL
> + && match->linked_to_symbol_name != NULL
> + && strcmp (linked_to_symbol_name,
> + match->linked_to_symbol_name) == 0)));
>
> It seems the patch just considered the symbol (linked_to_symbol) name, not
> the referenced section.
The section is resolved at the very end:
if (sec->map_head.linked_to_symbol_name)
{
symbolS *linked_to_sym;
linked_to_sym = symbol_find (sec->map_head.linked_to_symbol_name);
if (!linked_to_sym || !S_IS_DEFINED (linked_to_sym))
as_bad (_("undefined linked-to symbol `%s' on section `%s'"),
sec->map_head.linked_to_symbol_name,
bfd_section_name (sec));
else
elf_linked_to_section (sec) = S_GET_SEGMENT (linked_to_sym);
}
> > .data
> > foo:
> > bar:
> > .section .text,"axo",%progbits,foo
> > .section .text,"axo",%progbits,bar
>
> So in the example, the two .section directives define different sections. I
> think the semantic is fine.
>
> > These may be 2 different sections, depending on where foo and bar
> are defined. Gas generates 2 different sections.
>
> I hope we can change "may" to "must"...
Yes, this is what I implemented.
> Different symbols -> different sections, even if their section is the same.
> Same symbols -> same section
Yes.
> > > .data
> > > foo:
> > > bar:
> > > .section .text,"axo",%progbits,foo
> > > .section .text,"axo",%progbits,bar
> >
> > So in the example, the two .section directives define different sections. I
> > think the semantic is fine.
> >
> > > These may be 2 different sections, depending on where foo and bar
> > are defined. Gas generates 2 different sections.
> >
> > I hope we can change "may" to "must"...
>
> Yes, this is what I implemented.
> > Different symbols -> different sections, even if their section is the same.
> > Same symbols -> same section
>
>
> Yes.
Great! We are on the same page. Created https://reviews.llvm.org/D74006
For the missing diagnostics, created https://reviews.llvm.org/D73999
Though I did not the join the design of SHF_LINK_ORDER "o" "unique" etc,
I find them very useful.
The lack of the features on binutils side always made me a bit worried about
using such metadata sections more heavily.
Thanks a lot for taking the lead!
(Nick sent me an email that he did not forget --gc-sections feature requests
last November but it seems that he forgets it again...)
Hi H.J. Lu and Fangrui Song,
Thanks for your work on the unique sections extension.
I have been trying to use this extension to solve a problem in LLVM when
symbols with incompatible sh_entsizes are explicitly assigned to the same
section name, see https://bugs.llvm.org/show_bug.cgi?id=43457 and
https://reviews.llvm.org/D72194.
My question is why not simply use a new section each time we encounter a
section directive with a new set of section attributes. Using the example code
from comment 1.
cat > a.s <<e
.section .text,"ax",%progbits
.byte 0
.section .text,"awx",%progbits
.byte 1
.section .text,"awx",%progbits
.byte 2
e
GAS emits:
% as a.s
a.s: Assembler messages:
a.s:3: Warning: ignoring changed section attributes for .text
a.s:5: Warning: ignoring changed section attributes for .text
But, why not create 2 .text sections instead? This would simplify the fix for
https://bugs.llvm.org/show_bug.cgi?id=43457. It is also more intuitive
behaviour (IMO).
(In reply to Ben Dunbobbin from comment #10)
> Hi H.J. Lu and Fangrui Song,
>
> Thanks for your work on the unique sections extension.
>
> I have been trying to use this extension to solve a problem in LLVM when
> symbols with incompatible sh_entsizes are explicitly assigned to the same
> section name, see https://bugs.llvm.org/show_bug.cgi?id=43457 and
> https://reviews.llvm.org/D72194.
>
> My question is why not simply use a new section each time we encounter a
> section directive with a new set of section attributes. Using the example
> code from comment 1.
>
> cat > a.s <<e
> .section .text,"ax",%progbits
> .byte 0
> .section .text,"awx",%progbits
> .byte 1
> .section .text,"awx",%progbits
> .byte 2
> e
>
> GAS emits:
>
> % as a.s
> a.s: Assembler messages:
> a.s:3: Warning: ignoring changed section attributes for .text
> a.s:5: Warning: ignoring changed section attributes for .text
>
My comments are specific to .text. .text is a special section and
must be "ax":
http://www.sco.com/developers/gabi/latest/ch4.sheader.html#special_sections
(In reply to H.J. Lu from comment #11)
> (In reply to Ben Dunbobbin from comment #10)
> > Hi H.J. Lu and Fangrui Song,
> >
> My comments are specific to .text. .text is a special section and
> must be "ax":
>
> http://www.sco.com/developers/gabi/latest/ch4.sheader.html#special_sections
Right, sorry shouldn't have chosen a default section, bad example.
New example:
--- asm ----
.section .bob,"ax",%progbits
.byte 0
.section .bob,"awx",%progbits
.byte 1
--- disassembly ---
.bob:
add BYTE PTR [rcx],al
--- GAS output ----
./example.asm: Assembler messages:
./example.asm:3: Warning: ignoring changed section attributes for .bob
Compiler returned: 0
So why not create 2 sections instead here?
(In reply to Ben Dunbobbin from comment #12)
> (In reply to H.J. Lu from comment #11)
> > (In reply to Ben Dunbobbin from comment #10)
> > > Hi H.J. Lu and Fangrui Song,
> > >
> > My comments are specific to .text. .text is a special section and
> > must be "ax":
> >
> > http://www.sco.com/developers/gabi/latest/ch4.sheader.html#special_sections
>
> Right, sorry shouldn't have chosen a default section, bad example.
>
> New example:
>
> --- asm ----
> .section .bob,"ax",%progbits
> .byte 0
> .section .bob,"awx",%progbits
> .byte 1
>
> --- disassembly ---
> .bob:
> add BYTE PTR [rcx],al
>
> --- GAS output ----
> ./example.asm: Assembler messages:
> ./example.asm:3: Warning: ignoring changed section attributes for .bob
> Compiler returned: 0
>
> So why not create 2 sections instead here?
GNU BFD linker places the input sections of the same section name
into the same output section of the same name. I made some changes
a few years back in this area. I will see what I can do. But I need
to get this patch of mine:
https://sourceware.org/ml/binutils/2020-02/msg00038.html
into GNU binutils first.
(In reply to H.J. Lu from comment #13)
> (In reply to Ben Dunbobbin from comment #12)
> > (In reply to H.J. Lu from comment #11)
> > > (In reply to Ben Dunbobbin from comment #10)
> > > > Hi H.J. Lu and Fangrui Song,
>
> GNU BFD linker places the input sections of the same section name
> into the same output section of the same name...
Thanks! So, it's to protect the user... but it's not that useful. If I split
the example up then I don't get a warning:
> --- a.s ----
> .section .bob,"ax",%progbits
> .byte 0
> --- b.s ----
> .section .bob,"awx",%progbits
> .byte 1
So, I think even with the current behavior of GCC it makes sense to generate
multiple sections. I think that the linker is the best place to emit a
compatibility warning when combining sections. However, it would be possible to
emit an assembler warning that multiple output sections have been created with
the same name and the linker may combine these.
Maybe the assembler should not emit multiple sections with the same name under
any circumstances, even for COMDATs or SHF_LINK_ORDER, then it is up to the
user to choose a unique name. That is probably the easiest behavior for users
to understand.
> ... I made some changes
> a few years back in this area. I will see what I can do. But I need
> to get this patch of mine:
>
> https://sourceware.org/ml/binutils/2020-02/msg00038.html
>
> into GNU binutils first.
Of course. Thanks for taking the time to reply. I'm not sure there is anything
to do yet. I just wanted to discuss the design of the assembler with an expert.