Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.93k stars 551 forks source link

SvUV() macro 100% of time calls Perl_sv_2uv_flags() because SvUOK_nog/Perl_sv_setuv/SVf_IVisUV #22653

Open bulk88 opened 8 hours ago

bulk88 commented 8 hours ago

Description I was stepping debugging some XS code, and noticed, if you create a SvUV, with sv_setuv(), then later use SvUV() macro or eqv inline function, %99.999 of the time, Perl_sv_2uv_flags() will execute.

This "SvUV" created with sv_setuv(), or newSVuv()

SV = IV(0x6d2de0) at 0x6d2df0
  REFCNT = 2
  FLAGS = (IOK,pIOK)
  IV = 5369402920

the IV is really a C pointer/opaque pointer from a C library. 100% of the time, SvUV() macro, will execute Perl_sv_2uv_flags(). Because of

SV *
Perl_newSVuv(pTHX_ const UV u)
{
    SV *sv;

    /* Inlining ONLY the small relevant subset of sv_setuv here
     * for performance. Makes a significant difference. */

    /* Using ivs is more efficient than using uvs - see sv_setuv */
    if (u <= (UV)IV_MAX) {
        return newSViv((IV)u);
    }

    new_SV(sv);

    /* We're starting from SVt_FIRST, so provided that's
     * actual 0, we don't have to unset any SV type flags
     * to promote to SVt_IV. */
    STATIC_ASSERT_STMT(SVt_FIRST == 0);

    SET_SVANY_FOR_BODYLESS_IV(sv);
    SvFLAGS(sv) |= SVt_IV;
    (void)SvIOK_on(sv);
    (void)SvIsUV_on(sv);

    SvUV_set(sv, u);
    SvTAINT(sv);

    return sv;
}

that

    if (u <= (UV)IV_MAX) {

turns off SVf_IVisUV in final SvUV, then the SvUOK_nog() in SvUV(), says "NOT AN INTEGER" and calls the HEAVY getter Perl_sv_2uv_flags() each time. Since these are pointers, I think on 99% OSes, highest bit in an address, or negative addresses, are kernel space, and prohibited for user mode. Win32 32b and maybe some Solaris'es use and alloc high bit/negative pointer addresses, but I dont think P5 is compatible with such CPUs/C compilers/build configs.

I traced

+#define SvUOK_nog(sv)      ((SvFLAGS(sv) & (SVf_IOK|SVf_IVisUV|SVs_GMG)) == (SVf_IOK|SVf_IVisUV))
@@ -1568,9 +1587,9 @@ Like sv_utf8_upgrade, but doesn't do magic on C<sv>.
 */

 /* Let us hope that bitmaps for UV and IV are the same */
-#define SvIV(sv) (SvIOK(sv) ? SvIVX(sv) : sv_2iv(sv))
-#define SvUV(sv) (SvIOK(sv) ? SvUVX(sv) : sv_2uv(sv))
-#define SvNV(sv) (SvNOK(sv) ? SvNVX(sv) : sv_2nv(sv))
+#define SvIV(sv) (SvIOK_nog(sv) ? SvIVX(sv) : sv_2iv(sv))
+#define SvUV(sv) (SvUOK_nog(sv) ? SvUVX(sv) : sv_2uv(sv))
+#define SvNV(sv) (SvNOK_nog(sv) ? SvNVX(sv) : sv_2nv(sv))

to

Revision: 4bac9ae47b5ad7845a24e26b0e95609805de688a Author: Chip Salzenberg chip@pobox.com Date: 6/22/2012 6:18:18 PM Message: Magic flags harmonization.

    /* Using ivs is more efficient than using uvs - see sv_setuv */
    if (u <= (UV)IV_MAX) {
        return newSViv((IV)u);
    }

traced to

Revision: 4d55d9ac67bf8bc58f3fc9563b082459c6a3c22b Author: Steffen Müller smueller@cpan.org Date: 11/28/2014 11:34:00 AM Message: Repeat newSViv optimization for newSVuv

and that quotes

Revision: bd30fe8921c88e4677c2279b442a56a11ae037b4 Author: Eric Herman eric@freesa.org Date: 11/28/2014 9:08:04 AM Message: Speed up newSViv()

see also

Revision: 013abb9bfbbb8091fa79597e81e7208ef1fe2dde Author: Nicholas Clark nick@ccl4.org Date: 2/1/2012 4:58:14 PM Message: Update, correct and clarify the comment in Perl_sv_setuv().

See the correspondence on ticket #36459 for more details.

AKA https://github.com/Perl/perl5/issues/8002

So questions, why is #define SvUOK_nog(sv) ((SvFLAGS(sv) & (SVf_IOK|SVf_IVisUV|SVs_GMG)) == (SVf_IOK|SVf_IVisUV)) used? isnt the byte size and bitfields for IV and UV 100% identical all CPUs OSes?

Doesnt SVf_IVisUV only matter for NV FP double conversion and stringify?

Does SV MAGIC somehow pass "want UV" vs "want IV" flags to the CPAN XS vtable or CPAN PP tied SCALARs?

Do we throw truncate/overflow console warnings for SvIV ptrs (SVf_IOK yes SVf_IVisUV no), passed to SvUV() getter macro, with the SVIV being "-1"? I can't find such code (XSUB level, not PP ops) in sv.c, but I will ask anyways.

This is bad enough a fix should go into maint releases.

PS, the problem in found in blead Perl 5 core XS and PP and runtime code. Because SvUV() is being used in some places to fetch pointer type integers.

Dirty fix, CPAN author side, always use setiv() SvIV() SvIVX() or ONLY use SvUVX() in XS code to store/get/read/set pointers and pointer like data, hell, just NEVER use SvUV() macro, since ONLY "(UV)-1" and "(UV)-123456" ever use the "fast path" in SvUV().

There needs to be a P5P fix for this. Vs manual hunting and patching of CPAN XS and core.

Steps to Reproduce

sv_setuv(); sv_dump(); SvUV(); and use a C step debugger inside SvUV(); statement

Expected behavior

Do not do a func call inside SvUV(); for SVIV flagged SV heads.

Perl configuration NA, blead perl 5.41 and many other versions

ericherman commented 3 hours ago

How much does:

    /* Using ivs is more efficient than using uvs - see sv_setuv */
    if (u <= (UV)IV_MAX) {
        return newSViv((IV)u);
    }

actually save?

Even though 4d55d9ac67bf8bc58f3fc9563b082459c6a3c22b states that this is "Pretty much the same change as bd30fe8921c88e4677c2279b442a56a11ae037b4", but that particular short-cut is in addition to that.

I wonder if it would be better to simply remove that.

ericherman commented 3 hours ago

I strongly suspect @tsee tested this with a micro-benchmark, but I do not know what benchmark was used.