Perl / perl5

šŸŖ The Perl programming language
https://dev.perl.org/perl5/
Other
1.97k stars 555 forks source link

[PATCH] prevent multiple evaluations of ERRSV #12608

Closed p5pRT closed 11 years ago

p5pRT commented 11 years ago

Migrated from rt.perl.org#115884 (status was 'resolved')

Searchable as RT115884$

p5pRT commented 11 years ago

From @bulk88

See attached patch.

p5pRT commented 11 years ago

From @bulk88

0001-prevent-multiple-evaluations-of-ERRSV.patch ```diff From 491744cbba5b9e5c58852548bdd50376ad1854eb Mon Sep 17 00:00:00 2001 From: Daniel Dragan Date: Thu, 22 Nov 2012 23:37:29 -0500 Subject: [PATCH] prevent multiple evaluations of ERRSV Remove a large amount of machine code (~4KB for me) from funcs that use ERRSV making Perl faster and smaller by preventing multiple evaluation. ERRSV is a macro that contains GvSVn which eventually conditionally calls Perl_gv_add_by_type. If a SvTRUE or any other multiple evaluation macro is used on ERRSV, the expansion will, in asm have dozens of calls to Perl_gv_add_by_type one for each test/deref of the SV in SvTRUE. A less severe problem exists when multiple funcs (sv_set*) in a row call, each with ERRSV as an arg. Its recalculated then, Perl_gv_add_by_type and all. I think ERRSV macro got the func call in commit f5fa9033b8, Perl RT #70862. Prior to that commit it would be pure derefs I think. Saving the SV* is still better than looking into interp->gv->gp to get the SV * after each func call. I received no responses to http://www.nntp.perl.org/group/perl.perl5.porters/2012/11/msg195724.html explaining when the SV is replaced in PL_errgv, so took a conservative view and assumed callbacks (with Perl stack/ENTER/LEAVE/eval_*/call_*) can change it. I also assume ERRSV will never return null, this allows a more efficiently version of SvTRUE to be used. In Perl_newATTRSUB_flags a wasteful copy to C stack operation with the string was removed, and a croak_notcontext to remove push instructions to the stack. I was not sure about the interaction between ERRSV and message sv, I didn't change it to a more efficient (instruction wise, speed, idk) format string combining of the not safe string and ERRSV in the croak call. If such an optimization is done, a compiler potentially will put the not safe string on the first, unconditionally, then check PL_in_eval, and then jump to the croak call site, or eval ERRSV, push the SV on the C stack then push the format string "%"SVf"%s". The C stack allocated const char array came from commit e1ec3a884f . In Perl_eval_pv, croak_on_error was checked first to not eval ERRSV unless necessery. I was not sure about the side effects of using a more efficient croak_sv instead of Perl_croak (null chars, utf8, etc) so I left a comment. nocontext used to save an push instruction on implicit sys perl. In S_doeval, don't open a new block to avoid large whitespace changes. The NULL assignment should optimize away unless accidental usage of errsv in the future happens through a code change. There might be a bug here from commit ecad31f018 since previous a char * was derefed to check for null char, but ERRSV will never be null, so "Unknown error\n" branch will never be taken. For pp_sys.c, in pp_die a new block was opened to not eval ERRSV if "well-formed exception supplied". The else if else if else blocks all used ERRSV, so a "SV * errsv = NULL;" and a eval in the conditional with comma op thing wouldn't work (maybe it would, see toke.c comments later in this message). pp_warn, I have no comments. In S_compile_runtime_code, a croak_sv question comes up same as in Perl_eval_pv. In S_new_constant, a eval in the conditional is done to avoid evaling ERRSV if PL_in_eval short circuits. Same thing in Perl_yyerror_pvn. Perl__core_swash_init I have no comments. In the future, a SvEMPTYSTRING macro should be considered (not fully thought out by me) to replace the SvTRUEs with something smaller and faster when dealing with ERRSV. _nomg is another thing to think about. In S_init_main_stash there is an opportunity to prevent an extra ERRSV between "sv_grow(ERRSV, 240);" and "CLEAR_ERRSV();" that was too complicated for me to optimize. before perl517.dll .text 0xc2f77 .rdata 0x212dc .data 0x3948 after perl517.dll .text 0xc20d7 .rdata 0x212dc .data 0x3948 Numbers are from VC 2003 x86 32 bit. --- mg.c | 37 ++++++++++++++++++--------------- op.c | 9 +++---- perl.c | 8 +++++- pp_ctl.c | 16 +++++++++----- pp_sys.c | 66 ++++++++++++++++++++++++++++++++---------------------------- regcomp.c | 10 ++++++-- toke.c | 12 ++++++---- utf8.c | 14 +++++++++--- 8 files changed, 99 insertions(+), 73 deletions(-) diff --git a/mg.c b/mg.c index 761bf73..2d063db 100644 --- a/mg.c +++ b/mg.c @@ -3139,8 +3139,10 @@ Perl_sighandler(int sig) call_sv(MUTABLE_SV(cv), G_DISCARD|G_EVAL); POPSTACK; - if (SvTRUE(ERRSV)) { - SvREFCNT_dec(errsv_save); + { + SV * const errsv = ERRSV; + if (SvTRUE_NN(errsv)) { + SvREFCNT_dec(errsv_save); #ifndef PERL_MICRO /* Handler "died", for example to get out of a restart-able read(). * Before we re-do that on its behalf re-enable the signal which was @@ -3148,25 +3150,26 @@ Perl_sighandler(int sig) */ #ifdef HAS_SIGPROCMASK #if defined(HAS_SIGACTION) && defined(SA_SIGINFO) - if (sip || uap) + if (sip || uap) #endif - { - sigset_t set; - sigemptyset(&set); - sigaddset(&set,sig); - sigprocmask(SIG_UNBLOCK, &set, NULL); - } + { + sigset_t set; + sigemptyset(&set); + sigaddset(&set,sig); + sigprocmask(SIG_UNBLOCK, &set, NULL); + } #else - /* Not clear if this will work */ - (void)rsignal(sig, SIG_IGN); - (void)rsignal(sig, PL_csighandlerp); + /* Not clear if this will work */ + (void)rsignal(sig, SIG_IGN); + (void)rsignal(sig, PL_csighandlerp); #endif #endif /* !PERL_MICRO */ - die_sv(ERRSV); - } - else { - sv_setsv(ERRSV, errsv_save); - SvREFCNT_dec(errsv_save); + die_sv(errsv); + } + else { + sv_setsv(errsv, errsv_save); + SvREFCNT_dec(errsv_save); + } } cleanup: diff --git a/op.c b/op.c index 23f7aff..1b4cf8d 100644 --- a/op.c +++ b/op.c @@ -7379,14 +7379,13 @@ Perl_newATTRSUB_flags(pTHX_ I32 floor, OP *o, OP *proto, OP *attrs, const char *s = strrchr(name, ':'); s = s ? s+1 : name; if (strEQ(s, "BEGIN")) { - const char not_safe[] = - "BEGIN not safe after errors--compilation aborted"; if (PL_in_eval & EVAL_KEEPERR) - Perl_croak(aTHX_ not_safe); + Perl_croak_nocontext("BEGIN not safe after errors--compilation aborted"); else { + SV * const errsv = ERRSV; /* force display of errors found but not reported */ - sv_catpv(ERRSV, not_safe); - Perl_croak(aTHX_ "%"SVf, SVfARG(ERRSV)); + sv_catpvs(errsv, "BEGIN not safe after errors--compilation aborted"); + Perl_croak_nocontext("%"SVf, SVfARG(errsv)); } } } diff --git a/perl.c b/perl.c index 63de43e..bd7a6e5 100644 --- a/perl.c +++ b/perl.c @@ -2904,8 +2904,12 @@ Perl_eval_pv(pTHX_ const char *p, I32 croak_on_error) sv = POPs; PUTBACK; - if (croak_on_error && SvTRUE(ERRSV)) { - Perl_croak(aTHX_ "%s", SvPVx_nolen_const(ERRSV)); + /* just check empty string or undef? */ + if (croak_on_error) { + SV * const errsv = ERRSV; + if(SvTRUE_NN(errsv)) + /* replace with croak_sv? */ + Perl_croak_nocontext("%s", SvPV_nolen_const(errsv)); } return sv; diff --git a/pp_ctl.c b/pp_ctl.c index 24eac16..c9e4ac4 100644 --- a/pp_ctl.c +++ b/pp_ctl.c @@ -3445,6 +3445,7 @@ S_doeval(pTHX_ int gimme, CV* outside, U32 seq, HV *hh) PERL_CONTEXT *cx; I32 optype; /* Used by POPEVAL. */ SV *namesv; + SV *errsv = NULL; cx = NULL; namesv = NULL; @@ -3467,6 +3468,7 @@ S_doeval(pTHX_ int gimme, CV* outside, U32 seq, HV *hh) LEAVE_with_name("eval"); /* pp_entereval knows about this LEAVE. */ } + errsv = ERRSV; if (in_require) { if (!cx) { /* If cx is still NULL, it means that we didn't go in the @@ -3480,13 +3482,13 @@ S_doeval(pTHX_ int gimme, CV* outside, U32 seq, HV *hh) SvUTF8(namesv) ? -(I32)SvCUR(namesv) : (I32)SvCUR(namesv), &PL_sv_undef, 0); Perl_croak(aTHX_ "%"SVf"Compilation failed in require", - SVfARG(ERRSV - ? ERRSV + SVfARG(errsv + ? errsv : newSVpvs_flags("Unknown error\n", SVs_TEMP))); } else { - if (!*(SvPVx_nolen_const(ERRSV))) { - sv_setpvs(ERRSV, "Compilation error"); + if (!*(SvPV_nolen_const(errsv))) { + sv_setpvs(errsv, "Compilation error"); } } if (gimme != G_ARRAY) PUSHs(&PL_sv_undef); @@ -5367,8 +5369,10 @@ S_run_user_filter(pTHX_ int idx, SV *buf_sv, int maxlen) if (SvOK(out)) { status = SvIV(out); } - else if (SvTRUE(ERRSV)) { - err = newSVsv(ERRSV); + else { + SV * const errsv = ERRSV; + if (SvTRUE_NN(errsv)) + err = newSVsv(errsv); } } diff --git a/pp_sys.c b/pp_sys.c index 5945e23..06699d9 100644 --- a/pp_sys.c +++ b/pp_sys.c @@ -445,17 +445,18 @@ PP(pp_warn) /* well-formed exception supplied */ } else { - SvGETMAGIC(ERRSV); - if (SvROK(ERRSV)) { - if (SvGMAGICAL(ERRSV)) { + SV * const errsv = ERRSV; + SvGETMAGIC(errsv); + if (SvROK(errsv)) { + if (SvGMAGICAL(errsv)) { exsv = sv_newmortal(); - sv_setsv_nomg(exsv, ERRSV); + sv_setsv_nomg(exsv, errsv); } - else exsv = ERRSV; + else exsv = errsv; } - else if (SvPOKp(ERRSV) ? SvCUR(ERRSV) : SvNIOKp(ERRSV)) { + else if (SvPOKp(errsv) ? SvCUR(errsv) : SvNIOKp(errsv)) { exsv = sv_newmortal(); - sv_setsv_nomg(exsv, ERRSV); + sv_setsv_nomg(exsv, errsv); sv_catpvs(exsv, "\t...caught"); } else { @@ -489,32 +490,35 @@ PP(pp_die) if (SvROK(exsv) || (SvPV_const(exsv, len), len)) { /* well-formed exception supplied */ } - else if (SvROK(ERRSV)) { - exsv = ERRSV; - if (sv_isobject(exsv)) { - HV * const stash = SvSTASH(SvRV(exsv)); - GV * const gv = gv_fetchmethod(stash, "PROPAGATE"); - if (gv) { - SV * const file = sv_2mortal(newSVpv(CopFILE(PL_curcop),0)); - SV * const line = sv_2mortal(newSVuv(CopLINE(PL_curcop))); - EXTEND(SP, 3); - PUSHMARK(SP); - PUSHs(exsv); - PUSHs(file); - PUSHs(line); - PUTBACK; - call_sv(MUTABLE_SV(GvCV(gv)), - G_SCALAR|G_EVAL|G_KEEPERR); - exsv = sv_mortalcopy(*PL_stack_sp--); + else { + SV * const errsv = ERRSV; + if (SvROK(errsv)) { + exsv = errsv; + if (sv_isobject(exsv)) { + HV * const stash = SvSTASH(SvRV(exsv)); + GV * const gv = gv_fetchmethod(stash, "PROPAGATE"); + if (gv) { + SV * const file = sv_2mortal(newSVpv(CopFILE(PL_curcop),0)); + SV * const line = sv_2mortal(newSVuv(CopLINE(PL_curcop))); + EXTEND(SP, 3); + PUSHMARK(SP); + PUSHs(exsv); + PUSHs(file); + PUSHs(line); + PUTBACK; + call_sv(MUTABLE_SV(GvCV(gv)), + G_SCALAR|G_EVAL|G_KEEPERR); + exsv = sv_mortalcopy(*PL_stack_sp--); + } } } - } - else if (SvPV_const(ERRSV, len), len) { - exsv = sv_mortalcopy(ERRSV); - sv_catpvs(exsv, "\t...propagated"); - } - else { - exsv = newSVpvs_flags("Died", SVs_TEMP); + else if (SvPV_const(errsv, len), len) { + exsv = sv_mortalcopy(errsv); + sv_catpvs(exsv, "\t...propagated"); + } + else { + exsv = newSVpvs_flags("Died", SVs_TEMP); + } } return die_sv(exsv); } diff --git a/regcomp.c b/regcomp.c index 8b7c84c..24186e0 100644 --- a/regcomp.c +++ b/regcomp.c @@ -5095,10 +5095,14 @@ S_compile_runtime_code(pTHX_ RExC_state_t * const pRExC_state, SPAGAIN; qr_ref = POPs; PUTBACK; - if (SvTRUE(ERRSV)) { - Safefree(pRExC_state->code_blocks); - Perl_croak(aTHX_ "%s", SvPVx_nolen_const(ERRSV)); + SV * const errsv = ERRSV; + if (SvTRUE_NN(errsv)) + { + Safefree(pRExC_state->code_blocks); + /* use croak_sv ? */ + Perl_croak_nocontext("%s", SvPV_nolen_const(errsv)); + } } assert(SvROK(qr_ref)); qr = SvRV(qr_ref); diff --git a/toke.c b/toke.c index a7c9ca5..902f83c 100644 --- a/toke.c +++ b/toke.c @@ -9019,6 +9019,7 @@ S_new_constant(pTHX_ const char *s, STRLEN len, const char *key, STRLEN keylen, dVAR; dSP; HV * table = GvHV(PL_hintgv); /* ^H */ SV *res; + SV *errsv = NULL; SV **cvp; SV *cv, *typesv; const char *why1 = "", *why2 = "", *why3 = ""; @@ -9112,11 +9113,11 @@ now_ok: SPAGAIN ; /* Check the eval first */ - if (!PL_in_eval && SvTRUE(ERRSV)) { + if (!PL_in_eval && ((errsv = ERRSV), SvTRUE_NN(errsv))) { STRLEN errlen; const char * errstr; - sv_catpvs(ERRSV, "Propagated"); - errstr = SvPV_const(ERRSV, errlen); + sv_catpvs(errsv, "Propagated"); + errstr = SvPV_const(errsv, errlen); yyerror_pvn(errstr, errlen, 0); /* Duplicates the message inside eval */ (void)POPs; res = SvREFCNT_inc_simple(sv); @@ -11264,9 +11265,10 @@ Perl_yyerror_pvn(pTHX_ const char *const s, STRLEN len, U32 flags) else qerror(msg); if (PL_error_count >= 10) { - if (PL_in_eval && SvCUR(ERRSV)) + SV * errsv; + if (PL_in_eval && ((errsv = ERRSV), SvCUR(errsv))) Perl_croak(aTHX_ "%"SVf"%s has too many errors.\n", - SVfARG(ERRSV), OutCopFILE(PL_curcop)); + SVfARG(errsv), OutCopFILE(PL_curcop)); else Perl_croak(aTHX_ "%s has too many errors.\n", OutCopFILE(PL_curcop)); diff --git a/utf8.c b/utf8.c index 5621317..b380cd2 100644 --- a/utf8.c +++ b/utf8.c @@ -2863,8 +2863,11 @@ Perl__core_swash_init(pTHX_ const char* pkg, const char* name, SV *listsv, I32 m #endif Perl_load_module(aTHX_ PERL_LOADMOD_NOIMPORT, newSVpvn(pkg,pkg_len), NULL); - if (!SvTRUE(ERRSV)) - sv_setsv(ERRSV, errsv_save); + { + SV * const errsv = ERRSV; + if (!SvTRUE_NN(errsv)) + sv_setsv(errsv, errsv_save); + } LEAVE; } SPAGAIN; @@ -2887,8 +2890,11 @@ Perl__core_swash_init(pTHX_ const char* pkg, const char* name, SV *listsv, I32 m retval = *PL_stack_sp--; SvREFCNT_inc(retval); } - if (!SvTRUE(ERRSV)) - sv_setsv(ERRSV, errsv_save); + { + SV * const errsv = ERRSV; + if (!SvTRUE_NN(errsv)) + sv_setsv(errsv, errsv_save); + } LEAVE; POPSTACK; if (IN_PERL_COMPILETIME) { -- 1.7.9.msysgit.0 ```
p5pRT commented 11 years ago

From @bulk88

On Thu Nov 22 20​:48​:59 2012\, bulk88 wrote​:

See attached patch.

No response. Ill close it myself.

applied as eed484f95050ad51c720521f68c6341a14bf5638

-- bulk88 ~ bulk88 at hotmail.com

p5pRT commented 11 years ago

@bulk88 - Status changed from 'new' to 'resolved'

p5pRT commented 11 years ago

From @cpansprout

On Thu Nov 22 20​:48​:59 2012\, bulk88 wrote​:

See attached patch.

Thank you. Applied as eed484f95.

I received no responses to http​://www.nntp.perl.org/group/perl.perl5.porters/2012/11/msg195724.html explaining when the SV is replaced in PL_errgv\,

I didnā€™t answer because I didnā€™t know whether there are things to worry about other than the obvious $@​=... and *@​=... assignment that can occur when calling perl code.

so took a conservative view and assumed callbacks (with Perl stack/ENTER/LEAVE/eval_*/call_*) can change it.

Thatā€™s what I would do\, too.

I also assume ERRSV will never return null\, this allows a more efficiently version of SvTRUE to be used.

In Perl_newATTRSUB_flags a wasteful copy to C stack operation with the string was removed\, and a croak_notcontext to remove push instructions to the stack. I was not sure about the interaction between ERRSV and message sv\, I didn't change it to a more efficient (instruction wise\, speed\, idk) format string combining of the not safe string and ERRSV in the croak call. If such an optimization is done\, a compiler potentially will put the not safe string on the first\, unconditionally\, then check PL_in_eval\, and then jump to the croak call site\, or eval ERRSV\, push the SV on the C stack then push the format string "%"SVf"%s". The C stack allocated const char array came from commit e1ec3a884f .

In Perl_eval_pv\, croak_on_error was checked first to not eval ERRSV unless necessery. I was not sure about the side effects of using a more efficient croak_sv instead of Perl_croak (null chars\, utf8\, etc) so I left a comment.

Using croak_sv would probably actually fix bugs in those cases\, but it would need to be examined more carefully first. I mean I canā€™t tell you off the top of my head whether it would be a safe substitute.

In the future\, a SvEMPTYSTRING macro should be considered (not fully thought out by me) to replace the SvTRUEs with something smaller and faster when dealing with ERRSV.

$@​ can be set to something other than a string easily enough. Iā€™m not sure that would be feasible.

--

Father Chrysostomos

p5pRT commented 11 years ago

From @bulk88

On Fri Nov 23 17​:58​:59 2012\, sprout wrote​:

On Thu Nov 22 20​:48​:59 2012\, bulk88 wrote​:

I was not sure about the side effects of using a more efficient croak_sv instead of Perl_croak (null chars\, utf8\, etc) so I left a comment.

Using croak_sv would probably actually fix bugs in those cases\, but it would need to be examined more carefully first. I mean I canā€™t tell you off the top of my head whether it would be a safe substitute.

The change would probably be script level visible somehow. I guess suddenly blessed and magic and utf8 tieds and all kinds of other things could get through the eval to the caller\, and possibly die in the caller outside the eval when the caller reads $@​ or something.

In the future\, a SvEMPTYSTRING macro should be considered (not fully thought out by me) to replace the SvTRUEs with something smaller and faster when dealing with ERRSV.

$@​ can be set to something other than a string easily enough. Iā€™m not sure that would be feasible.

In some places SvTRUEs are done to check whether to restore a copy of the original ERRSV back to ERRSV or to execute "we had an exception" branches (eval_pv for example). The problem is\, the strict definition of "we had no exception" is POK\, CUR=0\, non NULL PV with pv[0] == '\0'\, in ERRSV\, according to perlvar. So technically undef\, and IV 0 and NV 0 and PV CUR 1\, p[0] == '0' are all "we had an exception" ("failed() if defined($@​) && $@​ ne '';") \, yet are treated as "we didnt have an exception". Is that a bug or a feature?

A SvEMPTYSTRING would just check POK and CUR==0 and be done (I'm not sure whether to call magic or not)\, not check IV and NV and string "0". -- bulk88 ~ bulk88 at hotmail.com