Closed GoogleCodeExporter closed 9 years ago
Simplified:
perlcc -r -e 'use constant ASCII => eval { require Encode;
Encode::find_encoding('ASCII'); } || 0; ASCII->encode("www.google.com")
if(ASCII)'; ./a.out
zsh: segmentation fault ./a.out
Original comment by todd.e.rinaldo
on 19 Mar 2014 at 2:14
I can't seem to remove anything else and make it break.
Original comment by todd.e.rinaldo
on 19 Mar 2014 at 2:25
Original comment by todd.e.rinaldo
on 19 Mar 2014 at 4:42
work in branch `init2-encode-issue305` based on the assumption that the XS
stash for Encode::XS holds an AMT magic (tiehash), which is initialized to
early, because Xs are loaded later. So defer stash init of AMT magic to init2.
Original comment by reini.urban
on 20 Mar 2014 at 7:02
AMT magic initialization was not the problem. The real problem is much worse.
NET::DNS::Domain dois
use constant ASCII => eval {
require Encode;
Encode::find_encoding('ASCII'); # return encoding object
} || 0;
which stores the compile-time pointer to the encoding object in the binary, but
when
re-initializing Encode the pointer is invalid. We cannot store pointers to
dynamic XS objects generally.
I'm not even sure how to detect such ptrs generally, checking the memory ranges
against ranges
of shared libraries.
{ Nullhv, {0}, 0, 0, {140387109050112UL}, {0.00} }, /* xpvmg_list[0] */
140387109050112 needs to re-initialized after Encode::XS was loaded.
Original comment by reini.urban
on 20 Mar 2014 at 8:32
The problem is not solvable. Not even a XS symbol needs to be involved.
If the pointer is just a ptr to a hash, it will break when being compiled.
Encoding::getEncoding
exists $Encoding{$name} and return $Encoding{$name};
In our case it comes from Encode_XSEncoding(&ascii_encoding)
HV *stash = gv_stashpv("Encode::XS", TRUE);
SV *sv = sv_bless(newRV_noinc(newSViv(PTR2IV(enc))), stash);
and we could special-case such XS modules. But I guess most XS modules store
their objects this way.
Original comment by reini.urban
on 20 Mar 2014 at 8:54
In our case the perlcc fix would be:
init2:
#include <dlfcn.h>
void *handle = dlopen(sv_list[5032].sv_u.svu_pv, RTLD_NOW|RTLD_NOLOAD);
void *ascii_encoding = dlsym(handle, "ascii_encoding");
SvIV_set(&sv_list[1], PTR2IV(ascii_encoding));
The better fix is of course to fix NET::DNS::Domain to defer use constant ...
to run-time by doing it in an INIT{} block.
Original comment by reini.urban
on 20 Mar 2014 at 9:12
See https://rt.cpan.org/Public/Bug/Display.html?id=94069
and
https://github.com/rurban/distroprefs/commit/b38336e84df3cb0183a0da189ef2636b7a0
474c4 for automated patches for Net-DNS 0.67-0.74
Original comment by reini.urban
on 20 Mar 2014 at 9:56
I decided to try to run-time patch/remap those pointers. See branch
init2-encode-issue305
Original comment by reini.urban
on 22 Mar 2014 at 6:07
So far mandatory remapping is needed on Net-DNS 0.67-0.74
and optionally for correctness on 4 Encode ptrs: ascii, iso8859_1, null,
ascii_ctrl
Not sure yet about more XS modules.
Original comment by reini.urban
on 22 Mar 2014 at 9:09
TODO:
Working with the Net:::DNS maintainers to fix it in Net:::DNS (not yet
successful, my 1st patch is wrong, theirs also)
and testing init2-encode-issue305 on non d_dlopen systems.
Original comment by reini.urban
on 25 Mar 2014 at 3:32
I have the fix and will supply it to them
Original comment by todd.e.rinaldo
on 25 Mar 2014 at 4:12
I have the fix and will supply it to them
Original comment by todd.e.rinaldo
on 25 Mar 2014 at 4:12
Toddr: Your fix does not help the performance problem they had.
They want ASCII/UTF8 to be evaluated at compile-time, so on ASCII/UTF8 systems
the relevant if checks will be constant folded.
My first fix did the same.
BEGIN { sub ... } would be the equivalent to use constant,
because they need to be stored as CONSTSUB, not as CV.
However I still see the segv because the Encode ptrs are stored and used.
Still working on it.
Original comment by reini.urban
on 26 Mar 2014 at 3:28
See https://rt.cpan.org/Public/Bug/Display.html?id=94221 and the new branch
encode-name which detects new PVs fort the symbol names attached to such objects
Original comment by reini.urban
on 26 Mar 2014 at 7:50
Todd: Dick came up with a good Net::DNS patch, we can use.
See https://rt.cpan.org/Public/Bug/Display.html?id=94069
Original comment by reini.urban
on 27 Mar 2014 at 5:50
Merged branches
* init2-encode-issue305 (heuristics to detect Encode::XS) and
* encode-name: detect new Encode 2.58 with my patch:
https://rt.cpan.org/Public/Bug/Display.html?id=94221
The external encodings now also store their name (for dlsym) along the IV ptr.
As SVp_POK PVX (private only)
Original comment by reini.urban
on 31 Mar 2014 at 2:27
Encode-2.59 and 2.60 broke issue 305 again. Will try to work around it.
Original comment by reini.urban
on 9 May 2014 at 8:03
Work in branch encode_new
Original comment by reini.urban
on 11 May 2014 at 4:05
Fixed with commit 3b2217eeb86f9e4dab8853aed123f14c75dd22fb
Author: Reini Urban <rurban@cpanel.net>
Date: Wed May 14 20:05:03 2014 -0500
C: better foreign Encoding support (#305)
special-case the Encode bootstrap XS methods, which
were not saved without.
Original comment by reini.urban
on 16 May 2014 at 7:52
Had to add another 32bit fix with
commit db347307388cfaf46258e776da451ddd5b789130
Author: Reini Urban <rurban@cpanel.net>
Date: Tue May 20 15:38:41 2014 -0500
C encode_new: improve patch_dlsym heuristic
32bit systems have much lower shared object pointer values. around the image_base
Original comment by reini.urban
on 20 May 2014 at 9:15
Original issue reported on code.google.com by
todd.e.rinaldo
on 19 Mar 2014 at 1:43