Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.95k stars 554 forks source link

bodies_by_type[SVt_PVNV]: handle __float128 NV alignment on 32-bit #22609

Open tonycoz opened 1 month ago

tonycoz commented 1 month ago

On i686 systems with -msse __float128 requires 16 byte alignment, but for XPVNV bodies the hack used by new_body_allocated() to avoid allocating the unused xmg_stash and xmg_u fields means that the base of the XPVNV body ends up mis-aligned on 32-bit systems.

One 64-bit systems the combined size of those fields is 16-bytes so the modified pointer is still properly aligned.

Fixes #22577

perldelta something like the following in Selected bug fixes:

Builds with C<-msse> and quadmath on 32-bit x86 systems would crash with a misaligned access early in the build. [GH #22577]


jkeenan commented 1 month ago

@tonycoz, you've checked the box "This set of changes requires a perldelta entry, and it is included." But there are no changes to pod/perldelta.pod in the p.r. Can you clarify?

tonycoz commented 4 weeks ago

@tonycoz, you've checked the box "This set of changes requires a perldelta entry, and it is included." But there are no changes to pod/perldelta.pod in the p.r. Can you clarify?

See the last paragraph of the OP.

tonycoz commented 4 weeks ago

Significantly expanded the commit message to better explain how this stuff works and why it didn't for the failing builds.

bulk88 commented 6 days ago

struct XPVNV

    HV*     xmg_stash; 0X0
    union _xmgu xmg_u; 0X4
    STRLEN  xpv_cur; 0X8
    union { 
        STRLEN  xpvlenu_len;    0XC
    } xpv_len_u
    union _xivu xiv_u ; 0x10
    union _xnvu xnv_u; 0x14 BADDDDD

old code was doing

    STRLEN  xpv_cur; 0X0
    union { 
        STRLEN  xpvlenu_len;    0X4
    } xpv_len_u
    union _xivu xiv_u ; 0x8
    union _xnvu xnv_u; 0xC  BADDDDDD

so correct me if I am wrong, but the pool needs to start at

union _xmgu xmg_u;

and ghost field

    HV*     xmg_stash; 

???? im assuming 32b IV 32b ptr with 128b NV in the counts above

Or very invasive, idk pro cons didnt think deep abt it, make that 16BYTE monster field 1, so its order is BIGNV CUR LEN IV CLASS MG

ghost fielding at start and under allocing/ghost fielding at end, should still be possible, cuz IV and PV are top 2 freq cnts in Perl core right?

tonycoz commented 5 days ago

xmg_stash and xmg_u need to stay at the top of the structure so it's at the same offset for every SV variant, or if we move the NV to the top, it needs to be at the top for every SV variant, including those that don't use it.

We could do the same I'm not inclined to try to optimize too hard for __float128.

All the abbreviations "idk pro cons abt" made this more difficult to follow.

bulk88 commented 5 days ago

same here, I wouldnt optimize very hard for __float128, its an exotic build option, and no cpu in the world currently implements the type in hardware, so __float128 is always faked with func calls to libc for every operator.

bulk88 commented 4 days ago

I dont see anything very wrong with this current commit, I recommend applying it to blead.

Only concern is I see the possibility of sneaking in a 32b wide ghost field at the start SV head, while this current commit doesnt do that. If thinking about the logistics of that 32b wide ghost field, is more than 1 minute of thinking, forget the ghost field idea and apply that patch to blead.

Ummm, are the broken builds with 32b pointer with __float128 NV, do those broken builds, have IV 64b or IV 32b?

My struct offset math examples above assume 32b IV 32b pointer, and 128b NV.

Research on float128 build option, best case 100x longer runtime on a benchmark with Intel C, typical case 1000x longer runtime on a benchmark with GCC. The terms performance and optimization are non-existent with that build option. Any Configure/Makefile script should show a lung cancer warning from a cigarette carton if a user picks that build flag. Supporting the build flag is good, having that build flag be default on, anywhere, absolutely never. A limited amount of users, who know why they want float128, it is correct to support them and their build flag.

https://stackoverflow.com/questions/27771354/performance-benchmark-of-float128-type