Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

perl fails t/re/reg_mesg.t with -Uusedl #21558

Closed Leont closed 3 weeks ago

Leont commented 7 months ago

When compiling with -Uusedl, t/re/reg_mesg.t fails with the following errors.

not ok 2692 - ... and gave expected number (1) of warnings
# Failed test 2692 - ... and gave expected number (1) of warnings at re/reg_mesg.t line 903
#      got "0"
# expected "1"
# Expected warnings not gotten:
#   Variable length positive lookbehind with capturing is experimental in regex; marked by <-- HERE in m/(?<=(p|qq|rrr)) <-- HERE / at
ok 2693 -  /(?<!(p|qq|rrr))/ did not die
not ok 2694 - ... and gave expected number (1) of warnings
# Failed test 2694 - ... and gave expected number (1) of warnings at re/reg_mesg.t line 903
#      got "0"
# expected "1"
# Expected warnings not gotten: 
#   Variable length negative lookbehind with capturing is experimental in regex; marked by <-- HERE in m/(?<!(p|qq|rrr)) <-- HERE / at
ok 2695 -  /(?| (?=(foo)) | (?<=(foo)|p) )/ did not die
not ok 2696 - ... and gave expected number (1) of warnings
# Failed test 2696 - ... and gave expected number (1) of warnings at re/reg_mesg.t line 903
#      got "0"
# expected "1"
# Expected warnings not gotten:
#   Variable length positive lookbehind with capturing is experimental in regex; marked by <-- HERE in m/(?| (?=(foo)) | (?<=(foo)|p) ) <-- HERE / at  
ok 2697 -  /(?| (?=(foo)) | (?<=(foo)|p) )/x did not die
not ok 2698 - ... and gave expected number (1) of warnings
# Failed test 2698 - ... and gave expected number (1) of warnings at re/reg_mesg.t line 903
#      got "0"
# expected "1"
# Expected warnings not gotten: 
#   Variable length positive lookbehind with capturing is experimental in regex; marked by <-- HERE in m/(?| (?=(foo)) | (?<=(foo)|p) ) <-- HERE / at  
ok 2699 -  /(?| (?=(foo)) | (?<!(foo)|p) )/ did not die
not ok 2700 - ... and gave expected number (1) of warnings
# Failed test 2700 - ... and gave expected number (1) of warnings at re/reg_mesg.t line 903
#      got "0"
# expected "1"
# Expected warnings not gotten:
#   Variable length negative lookbehind with capturing is experimental in regex; marked by <-- HERE in m/(?| (?=(foo)) | (?<!(foo)|p) ) <-- HERE / at  
ok 2701 -  /(?| (?=(foo)) | (?<!(foo)|p) )/x did not die
not ok 2702 - ... and gave expected number (1) of warnings
# Failed test 2702 - ... and gave expected number (1) of warnings at re/reg_mesg.t line 903
#      got "0"
# expected "1"
# Expected warnings not gotten:
#   Variable length negative lookbehind with capturing is experimental in regex; marked by <-- HERE in m/(?| (?=(foo)) | (?<!(foo)|p) ) <-- HERE / at  
ok 2703 -  /(?<!(foo|bop(*ACCEPT)|bar)baz)/ did not die
not ok 2704 - ... and gave expected number (1) of warnings
# Failed test 2704 - ... and gave expected number (1) of warnings at re/reg_mesg.t line 903
#      got "0"
# expected "1"
# Expected warnings not gotten: 
#   Variable length negative lookbehind with capturing is experimental in regex; marked by <-- HERE in m/(?<!(foo|bop(*ACCEPT)|bar)baz) <-- HERE / at
ok 2705 -  /(?<=(foo|bop(*ACCEPT)|bar)baz)/ did not die
not ok 2706 - ... and gave expected number (1) of warnings
# Failed test 2706 - ... and gave expected number (1) of warnings at re/reg_mesg.t line 903
#      got "0"
# expected "1"
# Expected warnings not gotten:
#   Variable length positive lookbehind with capturing is experimental in regex; marked by <-- HERE in m/(?<=(foo|bop(*ACCEPT)|bar)baz) <-- HERE / at
Leont commented 7 months ago

My first guess here (based on previous problems) is that the code is expecting a local symbol to override a core symbol, but that doesn't work with static linking.

jkeenan commented 7 months ago

I tried to verify your results by compiling blead on FreeBSD-13 with this configuration:

$ sh ./Configure -des -Dusedevel -Duseithreads -Uusedl && make test_prep

I then called ./perl -Ilib t/re/reg_mesg.t. When I got down to about test 3255, the program spewed lots of non-ASCII characters over the terminal. The program apparently terminated but the screen prompt was no longer visible, though I could clear the screen with Ctrl-L.

I then ran make test_harness. t/re/reg_mesg.t PASSed without incident. However, I noticed a warning being emitted during t/op/sub.t, which I could reproduce as follows:

$ cd t;./perl harness -v op/sub.t; cd -

ok 1 - Is empty
ok 2 - Is still empty
ok 3 - Didnt return anything
ok 4 - Didnt return anything
ok 5 - result of delete(helem) is copied when returned
ok 6 - result of delete(helem) is copied when explicitly returned
ok 7 - result of delete(aelem) is copied when returned
ok 8 - result of delete(aelem) is copied when explicitly returned
ok 9 - result of shift is copied when returned
ok 10 - result of shift is copied when explicitly returned
ok 11 - result of delete(helem) is copied: practical test
ok 12 - sub redefinition sets CvGV
ok 13 - no double free redefining anon stub
ok 14 - recursive calls do not share shared-hash-key TARGs
ok 15 - recursive calls do not share shared-hash-key TARGs (2)
ok 16 - [perl \#78194] \$_[0] == \$_[0] when @_ aliases "$x"
ok 17 - sub (){42} returns a mutable value
ok 18 - sub (){ return 42 } returns a mutable value
ok 19 - my sub (){42} returns a mutable value
ok 20 - my sub (){ return 42 } returns a mutable value
ok 21 - freeing ops does not make sub(){42} immutable
ok 22 - num of elems in @_ after &xsub with nonexistent $_[0]
ok 23 - content of nonexistent $_[0] is modified by &xsub
ok 24 - goto &xsub when @_ does not exist
ok 25 - re.pm not loaded yet
ok 26 - XSUB clobbering sub whose DESTROY assigns to the glob
ok 27 - Pure-Perl sub clobbering sub whose DESTROY assigns to the glob
Subroutine re::regmust redefined at ../lib/re.pm line 95.
ok 28 - check special blocks are cleared on error
ok 29 - stub re-declaration of constant with no prototype
...

This warning does not appear to be emitted in builds without -Uusedl.

I know little about this configuration option, so I'm simply reporting what I see here.

demerphq commented 7 months ago

Thanks Leon and James. I'll try to find some time for this over the weekend

Yves


tonycoz commented 3 months ago

The problem is with a static build some of the objects linked in see a definition of RExC_state_t with DEBUGGING enabled, while others see it with DEBUGGING not defined, and that structure has members only present when DEBUGGING is enabled.

Building a static re (or -Uusedl) perl with ASAN produces a perl that fails ASAN fairly quickly when trying to access beyond the end of the RExC_state object on the stack.

Removing the DEBUGGING conditional fixes the immediate problem and allows the expected warnings for my test cases.

I suspect the real fix is to extend the work done by ext/re/re_top.h to the compilation names, but some feedback from someone with more knowledge of the regexp engine would be useful.

demerphq commented 3 months ago


I'll try to take a look later today.

Yves

jkeenan commented 3 months ago

So far this ticket has discussed test failures. But when, in preparing to bisect, I tried building with -Uusedl (on Linux) at tags corresponding to older production releases, my results were all over the map.

5.32.0 and 5.36.0: Able to build (but with lots of build-time warnings); t/re/reg_mesg.t passes.

5.36.0: Able to build, test passes but spews as described above.

5.38.0: Unable to build; compilation fails at:

regcomp_invlist.c:(.text+0x1360): multiple definition of `Perl_populate_invlist_from_bitmap'; lib/auto/re/re.a(re_comp_invlist.o):re_comp_invlist.c:(.text+0x310): first defined here
collect2: error: ld returned 1 exit status
make: *** [makefile:392: perl] Error 1

5.39.1: Same as 5.38.0.

5.39.2: Once again able to build, but t/re/reg_mesg.t fails as described by @leont when run through harness (non-verbosely).

I bisected the resumption of successful compilation with the following invocation:

perl Porting/bisect.pl \
-Uusedl \
--test-build \
--expect-fail \
--start=v5.39.1 \
--end=v5.39.2

... and the results pointed (as I would have expected) to this commit:

commit ba6e2c38aafc23cf114f3ba0d0ff3baead34328b
Author:     Yves Orton <demerphq@gmail.com>
AuthorDate: Tue Aug 1 23:12:46 2023 +0200
Commit:     Yves Orton <demerphq@gmail.com>
CommitDate: Thu Aug 3 15:25:02 2023 +0200

    regcomp*.c, regexec.c - fixup regex engine build under -Uusedl

I next bisected for the beginning of failed compilation:

perl Porting/bisect.pl \
-Uusedl \
--test-build \
--start=v5.36.0 \
--end=v5.38.0

Bisection pointed to this commit:

commit 85900e28cc250e1c4603f11073b77d0c6b5cff46
Author:     Yves Orton <demerphq@gmail.com>
AuthorDate: Fri Dec 9 11:00:17 2022 +0100
Commit:     Yves Orton <demerphq@gmail.com>
CommitDate: Fri Dec 9 16:19:29 2022 +0100

    regcomp.c - decompose into smaller files

One inference: -Uusedl is a very sensitive configuration option!

demerphq commented 3 months ago


Yes. Since we don't normally build with it, it is not that difficult, especially in the regex engine, to make a change that breaks it. "-Uusedl" means our libraries are linked statically, so symbol collisions become possible that wouldn't normally occur with dynamic linking.

When I reorganized the code in the regex engine I unknowingly broke this build option. :-(

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

demerphq commented 3 months ago


I pushed this:

https://github.com/Perl/perl5/pull/21993

There still seems to be leakage related to the locale code, but reg_mesg.t passed for me.

I took the liberty of pushing it without testing it under a normal build mode; I only tested it under -Uusedl, so it is possible I messed something up. If so, I'll look into it later.

Cheers, Yves


haarg commented 3 weeks ago

This issue seems to be fixed by #21883. Is it closable?

demerphq commented 3 weeks ago

@haarg, wasn't this issue fixed with #21993? Did you typo that?

And yes, I think this can be closed.

jkeenan commented 3 weeks ago

I built with -Uusedl, unthreaded on Linux, threaded on FreeBSD, then ran make test_harness. PASS in both cases. Closing.