Open p5pRT opened 12 years ago
Running the sympa mailing list software via its fastcgi module wwsympa.fcgi,
when I perform various functions like subscribe/unsubscribe that cause an e-mail to be generated, I get a segfault in libperl.so.
Another person has reported this under openSUSE, same version of perl. It formerly worked in perl 5.12.
E-mail thread: https://listes.cru.fr/sympa/arc/sympa-users/2012-01/msg00005.html
I am using Fedora 16, stock distribution perl, modules installed via CPAN.
I enabled crash dumps, which show that in all of the different cases it's crashing on the same line, mg.c:232.
I can send full crash dumps if needed, as well as the result of -d:Trace. Here is a snippet of the relevant bt:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/perl -U -d:Trace /home/sympa/bin/wwsympa.fcgi'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007fe01a823821 in Perl_mg_get (my_perl=0xbd3010, sv=0x61ac888) at mg.c:232
232 if (!(mg->mg_flags & MGf_GSKIP) && vtbl && vtbl->svt_get) {
(gdb) bt
#0 0x00007fe01a823821 in Perl_mg_get (my_perl=0xbd3010, sv=0x61ac888) at mg.c:232
#1 0x00007fe01a8467e1 in Perl_sv_setsv_flags (my_perl=0xbd3010, dstr=0x68ebe08, sstr=0x61ac888, flags=18) at sv.c:4097
#2 0x00007fe01a84754b in Perl_newSVsv (my_perl=0xbd3010, old=\
(gdb) print mg
$1 = (MAGIC *) 0xbd6d28
(gdb) print mg->mg_flags
$2 = 0 '\000'
(gdb) print vtbl
$3 = (const MGVTBL * const) 0x2200000c00000003
(gdb) print vtbl->svt_get
Cannot access memory at address 0x2200000c00000003
(gdb) print mg->mg_virtual
$4 = (MGVTBL *) 0x2200000c00000003
Alignment bug? Stepping on freed memory? Thanks for any help! I can provide any support. This is 100% reproducible for me.
- Erik
On Tue Jan 03 11:43:51 2012\, erik@thekrib.com wrote:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/perl -U -d:Trace /home/sympa/bin/wwsympa.fcgi'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007fe01a823821 in Perl_mg_get (my_perl=0xbd3010, sv=0x61ac888) at mg.c:232
232 if (!(mg->mg_flags & MGf_GSKIP) && vtbl && vtbl->svt_get) {
(gdb) bt
#0 0x00007fe01a823821 in Perl_mg_get (my_perl=0xbd3010, sv=0x61ac888) at mg.c:232
#1 0x00007fe01a8467e1 in Perl_sv_setsv_flags (my_perl=0xbd3010, dstr=0x68ebe08, sstr=0x61ac888, flags=18) at sv.c:4097
#2 0x00007fe01a84754b in Perl_newSVsv (my_perl=0xbd3010, old=\ ) at sv.c:8733
#3 0x00007fe01a7fd3d6 in S_pad_findlex (my_perl=0xbd3010, name=\ , cv=\ , seq=\ , warn=1, out_capture=0x0, [...]
(gdb) print mg
$1 = (MAGIC *) 0xbd6d28
(gdb) print mg->mg_flags
$2 = 0 '\000'
(gdb) print vtbl
$3 = (const MGVTBL * const) 0x2200000c00000003
(gdb) print vtbl->svt_get
Cannot access memory at address 0x2200000c00000003
(gdb) print mg->mg_virtual
$4 = (MGVTBL *) 0x2200000c00000003
I don't really know where to begin. But you could try
call Perl_warn(my_perl, "here")
in gdb to see which piece of Perl code it's crashing on.
Is there any chance you could do a binary search to find out which change was responsible?
--
Father Chrysostomos
The RT System itself - Status changed from 'new' to 'open'
On Tue\, 3 Jan 2012\, Father Chrysostomos via RT wrote:
I don't really know where to begin. But you could try
call Perl_warn(my_perl, "here")
in gdb to see which piece of Perl code it's crashing on.
Is there any chance you could do a binary search to find out which change was responsible?
Thank you for responding so quickly!
Because this is run out of a cgi script, I'm relying on core dumps as the process segfaults. So the process is gone. I tried attaching, and didn't get anything from call Perl_warn(my_perl, "here").
But I did some more digging in one of the core dumps. This caught my eye...
(gdb) frame 0
#0 0x00007fbff7a7f821 in Perl_mg_get (my_perl=0x1d4f010, sv=0x5e8d328) at mg.c:232
232 if (!(mg->mg_flags & MGf_GSKIP) && vtbl && vtbl->svt_get) {
(gdb) print mg
$3 = (MAGIC *) 0x1d52bd8
(gdb) print *mg
$4 = {mg_moremagic = 0x1d57620, mg_virtual = 0x2200000c00000003, mg_private = 39744, mg_type = -42 '\326', mg_flags = 1 '\001', mg_len = 0, mg_obj = 0x1d651a0, mg_ptr = 0x800900000001 <Address 0x800900000001 out of bounds>}
**** These look to be bogus values for mg structure -- in particular, mg_virtual, which is unlike all other pointers I've seen. ****
mg is obtained as sv->sv_any->xmg_magic
(gdb) print sv
$5 = (SV *) 0x5e8d328
(gdb) print *sv
$6 = {sv_any = 0x58ef6a0, sv_refcnt = 2, sv_flags = 1074021383, sv_u = {svu_pv = 0x5fd8450 "%alias", svu_iv = 100500560, svu_uv = 100500560, svu_rv = 0x5fd8450, svu_array = 0x5fd8450, svu_hash = 0x5fd8450, svu_gp = 0x5fd8450, svu_fp = 0x5fd8450}}
sv_flags in hex is 0x40044407, which includes the SVpad_NAME and SVpad_OUR flags.
(gdb) print sv->sv_any
$7 = (void *) 0x58ef6a0
(gdb) print (XPVMG *)0x58ef6a0
$8 = (XPVMG *) 0x58ef6a0
(gdb) print *((XPVMG *)0x58ef6a0)
$9 = {xmg_stash = 0x0, xmg_u = {xmg_magic = 0x1d52bd8, xmg_ourstash = 0x1d52bd8}, xpv_cur = 6, xpv_len = 16, xiv_u = {xivu_iv = 0, xivu_uv = 0, xivu_i32 = 0, xivu_namehek = 0x0}, xnv_u = {xnv_nv = -nan(0xfffff00009f5a), xgv_stash = 0xffffffff00009f5a, xpad_cop_seq = {xlow = 40794, xhigh = 4294967295}, xbm_s = {xbm_previous = 40794, xbm_flags = 255 '\377', xbm_rare = 255 '\377'}}}
sv->sv_any looks good.
So based on xmg_u being a union, I tried to look at xmg_u as xmg_ourstash
(gdb) print *((HV *)0x1d52bd8)
$11 = {sv_any = 0x1d57620, sv_refcnt = 3, sv_flags = 570425356, sv_u = {svu_pv = 0x1d69b40 "", svu_iv = 30841664, svu_uv = 30841664, svu_rv = 0x1d69b40, svu_array = 0x1d69b40, svu_hash = 0x1d69b40, svu_gp = 0x1d69b40, svu_fp = 0x1d69b40}}
The code in Perl_mg_get() is walking that link to xmg_magic without checking the flags first?
- Erik
-- Erik Olson Sent from my spiffy new Linux box
On Wed Jan 04 08:10:13 2012\, erik@thekrib.com wrote:
But I did some more digging in one of the core dumps. This caught my eye...
(gdb) frame 0
#0 0x00007fbff7a7f821 in Perl_mg_get (my_perl=0x1d4f010, sv=0x5e8d328) at mg.c:232
232 if (!(mg->mg_flags & MGf_GSKIP) && vtbl && vtbl->svt_get) {
(gdb) print mg
$3 = (MAGIC *) 0x1d52bd8
(gdb) print *mg
$4 = {mg_moremagic = 0x1d57620, mg_virtual = 0x2200000c00000003, mg_private = 39744, mg_type = -42 '\326', mg_flags = 1 '\001', mg_len = 0, mg_obj = 0x1d651a0, mg_ptr = 0x800900000001 <Address 0x800900000001 out of bounds>}
**** These look to be bogus values for mg structure -- in particular, mg_virtual, which is unlike all other pointers I've seen. ****
mg is obtained as sv->sv_any->xmg_magic
That entire magic structure is garbage. There is no magic type '\326'. All magic types apart from '\0' are ASCII characters.
(gdb) print sv
$5 = (SV *) 0x5e8d328
(gdb) print *sv
$6 = {sv_any = 0x58ef6a0, sv_refcnt = 2, sv_flags = 1074021383, sv_u = {svu_pv = 0x5fd8450 "%alias", svu_iv = 100500560, svu_uv = 100500560, svu_rv = 0x5fd8450, svu_array = 0x5fd8450, svu_hash = 0x5fd8450, svu_gp = 0x5fd8450, svu_fp = 0x5fd8450}}
sv_flags in hex is 0x40044407, which includes the SVpad_NAME and SVpad_OUR flags.
SVt_PVMG, SVf_POK, SVp_POK, SVpad_OUR, SVpad_NAME
And the call is coming from S_pad_findlex. Was pad.c:1035 the line number that got cut off? (That's the only instance of newSVsv in pad.c.)
But S_pad_findlex is trying to create a new SV from an existing one (newSVsv). The existing sv seems to have a corrupt magic pointer, since the call is coming from here in sv_setsv_flags:
case SVt_PVMG:
    if (SvGMAGICAL(sstr) && (flags & SV_GMAGIC)) {
-->     mg_get(sstr);
        if (SvTYPE(sstr) != stype)
            stype = SvTYPE(sstr);
    }
(gdb) print sv->sv_any
$7 = (void *) 0x58ef6a0
(gdb) print (XPVMG *)0x58ef6a0
$8 = (XPVMG *) 0x58ef6a0
(gdb) print *((XPVMG *)0x58ef6a0)
$9 = {xmg_stash = 0x0, xmg_u = {xmg_magic = 0x1d52bd8, xmg_ourstash = 0x1d52bd8}, xpv_cur = 6, xpv_len = 16, xiv_u = {xivu_iv = 0, xivu_uv = 0, xivu_i32 = 0, xivu_namehek = 0x0}, xnv_u = {xnv_nv = -nan(0xfffff00009f5a), xgv_stash = 0xffffffff00009f5a, xpad_cop_seq = {xlow = 40794, xhigh = 4294967295}, xbm_s = {xbm_previous = 40794, xbm_flags = 255 '\377', xbm_rare = 255 '\377'}}}
sv->sv_any looks good.
So based on xmg_u being a union, I tried to look at xmg_u as xmg_ourstash
I've just learnt something new. sv.h has this:
union _xmgu {
    MAGIC* xmg_magic; /* linked list of magicalness */
    HV* xmg_ourstash; /* Stash for our (when SvPAD_OUR is true) */
};
(gdb) print *((HV *)0x1d52bd8)
$11 = {sv_any = 0x1d57620, sv_refcnt = 3, sv_flags = 570425356, sv_u = {svu_pv = 0x1d69b40 "", svu_iv = 30841664, svu_uv = 30841664, svu_rv = 0x1d69b40, svu_array = 0x1d69b40, svu_hash = 0x1d69b40, svu_gp = 0x1d69b40, svu_fp = 0x1d69b40}}
The code in Perl_mg_get() is walking that link to xmg_magic without checking the flags first?
mg_get seems to be designed that way. But the calling code checks SvGMAGICAL first; i.e., SvFLAGS(sv) & 0x00200000.
0x00200000 # SVs_GMG
0x40044407 # SvFLAGS(sv)
Maybe this is a compiler bug. I really can't tell.
Are you in a position to recompile perl? You could try SvGMAGICAL(sstr) ? 1 : 0 or SvGMAGICAL(sstr) != 0 or !!SvGMAGICAL(sstr).
Or what happens if you reinstall perl unmodified?
(When it comes to this compilation stuff, I'm just waving my hands around. I really don't understand it that much.)
Also: Is it possible for you to run the CGI script yourself from the command line? Getting it down to something reproducible is usually extremely helpful, as you can imagine.
--
Father Chrysostomos
Update of information on this bug.
I have spent the last two evenings trying to make progress. It is very frustrating.
I was able to repro the problems on a virtual machine with an all-new (clean) installation of the OS (Fedora 16, 64-bit). I also confirmed the problem does NOT happen with the previous release of the OS (Fedora 15, 64-bit, with updates), which uses perl 5.12, installed on another VM.
On Wed, 4 Jan 2012, Father Chrysostomos via RT wrote:
And the call is coming from S_pad_findlex. Was pad.c:1035 the line number that got cut off? (That's the only instance of newSVsv in pad.c.)
Yes, line 1035 definitely.
Are you in a position to recompile perl?
Yes, I have broken through the barrier of DBD::mysql not building (patched it here https://rt.cpan.org/Public/Bug/Display.html?id=68112 and now everything builds).
I built it with -DDEBUGGING turned on, and very interestingly, it asserts on first invocation of the fastcgi script via httpd. (i.e., this is even BEFORE I attempt to send it the command that page-faults the script.)
perl: mg.c:227: Perl_mg_get: Assertion `!(((_svmagic)->sv_flags & (0x40000000|0x00040000)) == (0x40000000|0x00040000))' failed.
I believe this corresponds to the second assert in this macro:
# define SvMAGIC(sv) \
    (*({ const SV *const _svmagic = (const SV *)(sv); \
        assert(SvTYPE(_svmagic) >= SVt_PVMG); \
        if(SvTYPE(_svmagic) == SVt_PVMG) \
            assert(!SvPAD_OUR(_svmagic)); \    <----- This one
        &(((XPVMG*) MUTABLE_PTR(SvANY(_svmagic)))->xmg_u.xmg_magic); \
    }))
Is this the opposite case of the one that page-faults the machine?
Unfortunately, because of that assert, I cannot attempt to run the command that segfaults the machine.
I ran this with perl tracing on, and here are the last few lines before the assert:
[Thu Jan 05 20:31:05 2012] [warn] [client ::1] mod_fcgid: stderr: >> /opt/perl/lib/5.14.2/Exporter.pm:45: my $args = @_ or @_ = @$exports;
[Thu Jan 05 20:31:05 2012] [warn] [client ::1] mod_fcgid: stderr: >> /opt/perl/lib/5.14.2/Exporter.pm:47: local $_;
[Thu Jan 05 20:31:05 2012] [warn] [client ::1] mod_fcgid: stderr: >> /opt/perl/lib/5.14.2/Exporter.pm:48: if ($args and not %$export_cache) {
[Thu Jan 05 20:31:05 2012] [warn] [client ::1] mod_fcgid: stderr: >> /opt/perl/lib/5.14.2/Exporter.pm:52: my $heavy;
[Thu Jan 05 20:31:05 2012] [warn] [client ::1] mod_fcgid: stderr: >> /opt/perl/lib/5.14.2/Exporter.pm:55: if ($args or $fail) {
[Thu Jan 05 20:31:05 2012] [warn] [client ::1] mod_fcgid: stderr: >> /opt/perl/lib/5.14.2/Exporter.pm:58: foreach (@_);
[Thu Jan 05 20:31:05 2012] [warn] [client ::1] mod_fcgid: stderr: >> /opt/perl/lib/5.14.2/Exporter.pm:63: return export $pkg, $callpkg, ($args ? @_ : ()) if $heavy;
Full Devel::trace at http://test.thekrib.com/trace-107480-debug.txt.gz
You could try SvGMAGICAL(sstr) ? 1 : 0 or SvGMAGICAL(sstr) != 0 or !!SvGMAGICAL(sstr).
Tried these, to no avail. It still gets through. I don't think this is a compiler bug. I think the SvGMAGICAL flag *is* set when it enters that function, and somewhere in the function called on line 221:
save_magic(mgs_ix\, sv);
the flag value is reset along with who-knows-what.
Also: Is it possible for you to run the CGI script yourself from the command line? Getting it down to something reproducible is usually extremely helpful, as you can imagine.
Not possible, due to its interaction, but I can use gdb to attach to the process while it's still behaving, and then crash it into the debugger rather than dumping a core. I will see if I can add some breakpoints higher up and catch it in the act of processing the bogus sv.
I am at a disadvantage here because I'm not that familiar with even the perl language. I can use my C skills to tell what's happening, but can't understand WHY. :)
Thanks for any insight.
- Erik
-- Erik Olson Sent from my spiffy new Linux box
On Sat Jan 07 04:20:22 2012, erik@thekrib.com wrote:
Update of information on this bug.
I built it with -DDEBUGGING turned on, and very interestingly, it asserts on first invocation of the fastcgi script via httpd. (i.e., this is even BEFORE I attempt to send it the command that page-faults the script.)
perl: mg.c:227: Perl_mg_get: Assertion `!(((_svmagic)->sv_flags & (0x40000000|0x00040000)) == (0x40000000|0x00040000))' failed.
I believe this corresponds to the second assert in this macro:
# define SvMAGIC(sv) \
    (*({ const SV *const _svmagic = (const SV *)(sv); \
        assert(SvTYPE(_svmagic) >= SVt_PVMG); \
        if(SvTYPE(_svmagic) == SVt_PVMG) \
            assert(!SvPAD_OUR(_svmagic)); \    <----- This one
        &(((XPVMG*) MUTABLE_PTR(SvANY(_svmagic)))->xmg_u.xmg_magic); \
    }))
Is this the opposite case of the one that page-faults the machine?
It sounds like exactly the same problem to me.
Full Devel::trace at http://test.thekrib.com/trace-107480-debug.txt.gz
Due to local firewall settings, I cannot access that URL. Could you e-mail it to sprout at cpan dot org? (Don't worry about size limits.)
I am at a disadvantage here because I'm not that familiar with even the perl language. I can use my C skills to tell what's happening, but can't understand WHY. :)
I'm in the opposite position. I'm a Perl and JavaScript programmer who learnt C by hacking on the perl internals. I'm still learning it.
--
Father Chrysostomos
On Fri, Jan 06, 2012 at 10:33:48PM -0800, Erik Olson wrote:
Tried these, to no avail. It still gets through. I don't think this is a compiler bug. I think the SvGMAGICAL flag *is* set when it enters that function, and somewhere in the function called on line 221:
save_magic(mgs_ix, sv);
the flag value is reset along with who-knows-what.
Yes, this is normal behaviour. I think the problem is that an SV in the pad that holds the name of a lexical variable is expected never to have magic attached to it, which is why the xmg_magic and xmg_ourstash fields are in fact a union. So something naughty somewhere is adding magic to a name SV.
The diff below to 5.14.2 adds an assertion that should catch that happening. Can you apply it and retest? Thanks.
"I do not resent criticism, even when, for the sake of emphasis, it parts for the time with reality". -- Winston Churchill, House of Commons, 22nd Jan 1941.
On Sun Jan 08 14:14:19 2012, sprout wrote:
On Sat Jan 07 04:20:22 2012, erik@thekrib.com wrote:
Update of information on this bug.
I built it with -DDEBUGGING turned on, and very interestingly, it asserts on first invocation of the fastcgi script via httpd. (i.e., this is even BEFORE I attempt to send it the command that page-faults the script.)
perl: mg.c:227: Perl_mg_get: Assertion `!(((_svmagic)->sv_flags & (0x40000000|0x00040000)) == (0x40000000|0x00040000))' failed.
Unfortunately, because of that assert, I cannot attempt to run the command that segfaults the machine.
Can you also provide a full gdb backtrace from that assertion?
And also, back to my earlier question: Since the problem begins (i.e., something goes screwy) even before you sent the script the fatal command, is it possible to run it on the command line (with the DEBUGGING build), and trace what's happening to the SV in question, and where?
Full Devel::trace at http://test.thekrib.com/trace-107480-debug.txt.gz
Due to local firewall settings, I cannot access that URL. Could you e-mail it to sprout at cpan dot org? (Don't worry about size limits.)
I've had a look at it, but I really don't know what to look for. I wish I could be of more help.
--
Father Chrysostomos
gzipped perl trace rejected by mailing list, will send separately if you want. Core dump backtrace hopefully small enough.
---------- Forwarded message ----------
From: Erik Olson <erik@thekrib.com>
To: Dave Mitchell <davem@iabyn.com>
Cc: Father Chrysostomos via RT <perlbug-followup@perl.org>
Date: Mon, 9 Jan 2012 22:09:04
Subject: Re: [perl #107480] segmentation fault in mg.c on Fedora 16-x64
On Mon, 9 Jan 2012, Dave Mitchell wrote:
The diff below to 5.14.2 adds an assertion that should catch that happening. Can you apply it and retest? Thanks.
--- sv.h.orig 2012-01-08 23:43:14.352614694 +0000
+++ sv.h 2012-01-08 23:52:00.791663213 +0000
@@ -1214,6 +1214,8 @@
 ((sv)->sv_u.svu_rv = (val)); } STMT_END
 #define SvMAGIC_set(sv, val) \
 STMT_START { assert(SvTYPE(sv) >= SVt_PVMG); \
+ assert(!(SvFLAGS(sv) & SVpad_NAME) || \
+ (SvTYPE(sv) != SVt_PVNV && SvTYPE(sv) != SVt_PVMG)); \
 (((XPVMG*)SvANY(sv))->xmg_u.xmg_magic = (val)); } STMT_END
 #define SvSTASH_set(sv, val) \
 STMT_START { assert(SvTYPE(sv) >= SVt_PVMG); \
Yes, I was able to get it to assert on this as I fired up the cgi script for the first time, mg.c:589
SvMAGIC_set(sv, moremagic);
Like the previous assert alluded to on mg.c:227, this happens before I even run the offending command that crashes the system.
Full bt of core dump enclosed. Also full Devel::Trace of the perl statements enclosed (though I wonder if the last statements actually get logged when it chokes...).
I can keep running tests like this, as much as it takes.
I have worked around the problem locally by recompiling perl 5.12.x, so time constraint for me is lifted. But I don't think anyone on the sympa team is investigating this.
- Erik
-- Erik Olson Sent from my spiffy new Linux box
#0 0x00000032ec036285 in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
resultvar = 0
pid = \
On Mon, 9 Jan 2012, Father Chrysostomos via RT wrote:
Can you also provide a full gdb backtrace from that assertion?
Sent a backtrace from the new assertion. Let's see if that helps.
And also, back to my earlier question: Since the problem begins (i.e., something goes screwy) even before you sent the script the fatal command, is it possible to run it on the command line (with the DEBUGGING build), and trace what's happening to the SV in question, and where?
Nope, when I run it straight from the commandline, it exits out "normally" without hitting the assert, most likely because it knows it's not being launched by apache's fastcgi module.
-- Erik Olson Sent from my spiffy new Linux box
On Mon, Jan 09, 2012 at 10:09:04PM -0800, Erik Olson wrote:
Yes, I was able to get it to assert on this as I fired up the cgi script for the first time, mg.c:589
SvMAGIC_set(sv, moremagic);
Like the previous assert alluded to on mg.c:227, this happens before I even run the offending command that crashes the system.
Full bt of core dump enclosed. Also full Devel::Trace of the perl statements enclosed (though I wonder if the last statements actually get logged when it chokes...).
The assertion failing there indicates that the assertion code I gave you was wrong :-( Could you try the revised one below instead, thanks. Also, the output of the trace indicated that the fastcgi environment was running under -t, so it might be that you'll be able to reproduce the failure from the command line by running with perl -t ...
No matter how many dust sheets you use\, you will get paint on the carpet.
On Wed Jan 11 10:02:29 2012, davem wrote:
The assertion failing there indicates that the assertion code I gave you was wrong :-( Could you try the revised one below instead, thanks. Also, the output of the trace indicated that the fastcgi environment was running under -t, so it might be that you'll be able to reproduce the failure from the command line by running with perl -t ...
If the script won't run without a CGI environment, it might be sufficient to set REQUEST_METHOD:
$ REQUEST_METHOD=GET perl5.14.2 -t wwsympa.fcgi
--- sv.h.orig 2012-01-08 23:43:14.352614694 +0000
+++ sv.h 2012-01-11 17:41:01.957571210 +0000
@@ -1214,6 +1214,8 @@
 ((sv)->sv_u.svu_rv = (val)); } STMT_END
 #define SvMAGIC_set(sv, val) \
 STMT_START { assert(SvTYPE(sv) >= SVt_PVMG); \
+ assert(!SvPAD_OUR(sv) || \
+ (SvTYPE(sv) != SVt_PVNV && SvTYPE(sv) != SVt_PVMG)); \
 (((XPVMG*)SvANY(sv))->xmg_u.xmg_magic = (val)); } STMT_END
 #define SvSTASH_set(sv, val) \
 STMT_START { assert(SvTYPE(sv) >= SVt_PVMG); \
--
Father Chrysostomos
On Wed, 11 Jan 2012, Dave Mitchell wrote:
The assertion failing there indicates that the assertion code I gave you was wrong :-( Could you try the revised one below instead, thanks. Also, the output of the trace indicated that the fastcgi environment was running under -t, so it might be that you'll be able to reproduce the failure from the command line by running with perl -t ...
Indeed, this DOES now allow me to run it on the commandline -- thanks!
Unfortunately, with this new assert added, I'm back to the "original" assert I was getting on first invocation of debug perl: Perl_mg_get, mg.c:227
Anything I can dig up here?
Oh\, full backtrace:
(gdb) bt
#0 0x00000032ec036285 in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00000032ec037b9b in __GI_abort () at abort.c:91
#2 0x00000032ec02ee9e in __assert_fail_base (fmt=\
(gdb) bt full
#0 0x00000032ec036285 in __GI_raise (sig=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
resultvar = 0
pid = \
-- Erik Olson Sent from my spiffy new Linux box
On Wed, Jan 11, 2012 at 04:18:57PM -0800, Erik Olson wrote:
Unfortunately, with this new assert added, I'm back to the "original" assert I was getting on first invocation of debug perl: Perl_mg_get, mg.c:227
Oh :-(
Indeed, this DOES now allow me to run it on the commandline -- thanks!
In that case, would it be possible to provide a stripped down test script (with as few external dependencies as possible) that reproduces the issue?
-- "You may not work around any technical limitations in the software" -- Windows Vista license
On Fri Jan 13 04:05:38 2012, davem wrote:
In that case, would it be possible to provide a stripped down test script (with as few external dependencies as possible) that reproduces the issue?
But he doesn't know Perl. :-(
I have not been able to reproduce this problem myself.
--
Father Chrysostomos
On Fri, 13 Jan 2012, Father Chrysostomos via RT wrote:
But he doesn't know Perl. :-(
I have not been able to reproduce this problem myself.
But I do know how to hack. So I will see whether, using the 10,000-line wwsympa script source plus the Devel::Trace output leading up to the problem, I can pare things down a bit. It may take a while...
- Erik
-- Erik Olson Sent from my spiffy new Linux box
Hello, FYI, I'm seeing the same problem, also with the sympa software, on a NetBSD 5.1/i386 host running perl 5.14.2. So it really looks like a perl bug, not something related to the OS or compiler.
Here is what gdb tells me from the core dump:
Core was generated by `perl'.
Program terminated with signal 11\, Segmentation fault.
#0 0xbb734960 in Perl_mg_get ()
from /usr/pkg/lib/perl5/5.14.0/i386-netbsd-thread-multi/CORE/libperl.so
(gdb) where
#0 0xbb734960 in Perl_mg_get ()
from /usr/pkg/lib/perl5/5.14.0/i386-netbsd-thread-multi/CORE/libperl.so
#1 0xbb752b84 in Perl_sv_setsv_flags ()
from /usr/pkg/lib/perl5/5.14.0/i386-netbsd-thread-multi/CORE/libperl.so
#2 0xbb75d388 in Perl_newSVsv ()
from /usr/pkg/lib/perl5/5.14.0/i386-netbsd-thread-multi/CORE/libperl.so
#3 0xbb708e9c in S_pad_findlex ()
from /usr/pkg/lib/perl5/5.14.0/i386-netbsd-thread-multi/CORE/libperl.so
#4 0xbb70950a in Perl_pad_findmy ()
from /usr/pkg/lib/perl5/5.14.0/i386-netbsd-thread-multi/CORE/libperl.so
#5 0xbb6fbe42 in Perl_yylex ()
from /usr/pkg/lib/perl5/5.14.0/i386-netbsd-thread-multi/CORE/libperl.so
#6 0xbb705517 in Perl_yyparse ()
from /usr/pkg/lib/perl5/5.14.0/i386-netbsd-thread-multi/CORE/libperl.so
#7 0xbb777c0a in S_doeval ()
from /usr/pkg/lib/perl5/5.14.0/i386-netbsd-thread-multi/CORE/libperl.so
#8 0xbb779bc8 in Perl_pp_require ()
from /usr/pkg/lib/perl5/5.14.0/i386-netbsd-thread-multi/CORE/libperl.so
#9 0xbb742a63 in Perl_runops_standard ()
from /usr/pkg/lib/perl5/5.14.0/i386-netbsd-thread-multi/CORE/libperl.so
#10 0xbb6df7ff in perl_run ()
from /usr/pkg/lib/perl5/5.14.0/i386-netbsd-thread-multi/CORE/libperl.so
#11 0x08048db8 in main ()
On Fri Feb 10 07:04:57 2012\, bouyer wrote:
Hello\, FYI\, I'm seeing the same problem\, also with the sympa software\, on a NetBSD 5.1/i386 host running perl 5.14.2. So it looks really like a perl bug\, not something related to the OS or compiler.
Are you able to reproduce it from the command line? If so\, would you be able to try reducing the problem (by deleting big chunks of sympa without eliminating the crash)?
--
Father Chrysostomos
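The reduction suggested above (delete big chunks, keep the deletion only if the crash survives) can be automated. What follows is a rough sketch, not a tested tool: it assumes the crash reproduces by running a single script from the command line, and the names `still_crashes` and `reduce_lines` are hypothetical.

```python
import os
import signal
import subprocess
import tempfile

def still_crashes(lines, perl="perl"):
    """Return True if perl is killed by SIGSEGV when run on these lines."""
    with tempfile.NamedTemporaryFile("w", suffix=".pl", delete=False) as fh:
        fh.writelines(lines)
        path = fh.name
    try:
        proc = subprocess.run(
            [perl, path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=60,
        )
        # subprocess reports death by signal N as return code -N
        return proc.returncode == -signal.SIGSEGV
    except subprocess.TimeoutExpired:
        return False
    finally:
        os.unlink(path)

def reduce_lines(lines, crashes):
    """Greedily delete chunks of lines for as long as crashes(lines) holds."""
    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            if candidate and crashes(candidate):
                lines = candidate  # chunk was not needed, keep it deleted
            else:
                i += chunk         # chunk is needed, move past it
        chunk //= 2
    return lines
```

A deletion that leaves a syntax error makes perl exit normally instead of segfaulting, so the predicate rejects it and the chunk is kept. Since the backtrace shows the crash happening under a require, the same loop may also have to be applied to the modules the script loads.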
Different person -- bouyer is not me.
"Real" work has swamped me\, so I can't yet devote a chunk of time to chasing this one. I will get back to it eventually...
On Mon\, 13 Feb 2012\, Father Chrysostomos via RT wrote:
On Fri Feb 10 07:04:57 2012\, bouyer wrote:
Hello\, FYI\, I'm seeing the same problem\, also with the sympa software\, on a NetBSD 5.1/i386 host running perl 5.14.2. So it looks really like a perl bug\, not something related to the OS or compiler.
Are you able to reproduce it from the command line? If so\, would you be able to try reducing the problem (by deleting big chunks of sympa without eliminating the crash)?
-- Erik Olson Sent from my spiffy new Linux box
Migrated from rt.perl.org#107480 (status was 'open')
Searchable as RT107480$