Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.95k stars 554 forks source link

Problem killing a pseudo-forked child on Win32 #8889

Open p5pRT opened 17 years ago

p5pRT commented 17 years ago

Migrated from rt.perl.org#42869 (status was 'open')

Searchable as RT42869$

p5pRT commented 17 years ago

From @steve-m-hay

Created by steveh@Mugwump.uk.radan.com

The test program below doesn't work under bleadperl (currently at patchlevel 31123) on Win32. I believe the failure first occurred at patchlevel 29569.

Under perl-5.8.7 it behaves as expected:

Parent 4748 forked child -1980 OK
Parent 4748 killing child -1980...
Parent 4748 exiting...
Child -1980 caught INT: exiting...

but under bleadperl (built in debug mode) the kill fails:

Parent 4404 forked child -2208 OK
Parent 4404 killing child -2208...
Parent 4404 failed to kill child -2208: Invalid argument
Parent 4404 exiting...
Child -2208 exiting...

and under bleadperl built in release mode it crashes too:

Parent 4748 forked child -5624 OK
Parent 4748 killing child -5624...
Parent 4748 failed to kill child -5624: Invalid argument
Parent 4748 exiting...
Child -5624 exiting...
Free to wrong pool 2358a0 not 18347b0 during global destruction.

This problem has been seen and discussed on perl5-porters in the past. I'm just logging it here so that is doesn't get forgotten. See the threads starting here:

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2006-12/msg00342.html http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-01/msg00043.html

my $child;

BEGIN {
$child = fork;
die "Fork failed" unless defined $child;
if (!$child) {
$SIG{INT} = sub {
print "Child $$ caught INT: exiting...\n";
exit;
};
sleep 2;
print "Child $$ exiting...\n";
exit;
}
}

print "Parent $$ forked child $child OK\n";
print "Parent $$ killing child $child...\n";
kill 'INT', $child
or print "Parent $$ failed to kill child $child: $!\n";
print "Parent $$ exiting...\n";
exit;
Perl Info ``` Flags: category=core severity=medium Site configuration information for perl 5.9.5: Configured by steveh at Thu May 3 12:32:58 2007. Summary of my perl5 (revision 5 version 9 subversion 5) configuration: Platform: osname=MSWin32, osvers=5.1, archname=MSWin32-x86-multi-thread uname='' config_args='undef' hint=recommended, useposix=true, d_sigaction=undef useithreads=define, usemultiplicity=define useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef use64bitint=undef, use64bitall=undef, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cl', ccflags ='-nologo -GF -W3 -MD -Zi -DNDEBUG -O1 -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -DPERL_MSVCRT_READFIX', optimize='-MD -Zi -DNDEBUG -O1', cppflags='-DWIN32' ccversion='12.00.8804', gccversion='', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=10 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='__int64', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='link', ldflags ='-nologo -nodefaultlib -debug -opt:ref,icf -libpath:"c:\perl\lib\CORE" -machine:x86' libpth=C:\PROGRA~1\MICROS~2\VC98\lib libs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib msvcrt.lib perllibs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib msvcrt.lib libc=msvcrt.lib, so=dll, useshrplib=yes, libperl=perl59.lib gnulibc_version='' Dynamic Linking: dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' ' cccdlflags=' ', lddlflags='-dll -nologo -nodefaultlib -debug -opt:ref,icf -libpath:"c:\perl\lib\CORE" -machine:x86' Locally applied patches: DEVEL @INC for perl 5.9.5: C:/perl/lib C:/perl/site/lib . Environment for perl 5.9.5: HOME (unset) LANG (unset) LANGUAGE (unset) LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\perl\bin PERL5LIB=C:\Radview\radcontrol;C:\Radview\radview\cgi-bin\radview PERL_BADLANG (unset) SHELL (unset) -- ```
p5pRT commented 17 years ago

From @iabyn

[I've changed the subject line to match the new bug report filed regarding this old thread]

On Tue\, Jan 16\, 2007 at 09​:04​:23AM +0000\, Steve Hay wrote​:

Dave Mitchell wrote​:

On Mon\, Jan 15\, 2007 at 06​:21​:15PM +0000\, Steve Hay wrote​:

win32_kill(int -3816\, int 2) line 1378 PerlProcKill(IPerlProc * 0x002366f0\, int -3816\, int 2) line 1608 + 13 bytes Perl_apply(interpreter * 0x00234014\, long 286\, sv * * 0x0023cfbc\, sv * * 0x0023cfbc) line 1767 + 30 bytes Perl_pp_chown(interpreter * 0x00234014) line 3484 + 33 bytes Perl_runops_debug(interpreter * 0x00234014) line 1894 + 13 bytes S_run_body(interpreter * 0x00234014\, long 1) line 2402 + 13 bytes perl_run(interpreter * 0x00234014) line 2322 + 13 bytes RunPerl(int 2\, char * * 0x00233fb8\, char * * 0x00235598) line 266 + 12 bytes main(int 2\, char * * 0x00233fb8\, char * * 0x00233028) line 23 + 18 bytes PERL! mainCRTStartup + 227 bytes KERNEL32! 7c816fd7()

Thanks. That function (win32_kill) has a number of pathways which all return 0\, and a default pathway which returns EINVAL. Are you in a position to be able to put a few debugging prints in that function to see what path it's taking to failure?

You can virtually work it out from my previous mail\, actually​: 'pid' is \< 0\, 'child' is >= 0 (in fact\, it is == 0)\, and 'sig' is 2. The only other that you need to know is that hwnd is == INVALID_HANDLE_VALUE. Thus\, we end up with errno = EINVAL and return -1.

Looking at the chunk of code where it decides to set errno = EINVAL in win32_kill() in win32/win32.c\, I see that it has changed quite substantially between 5.8.7 and bleed (mainly by change #26379)​:

5.8.7​:

switch(sig) {
...
default: /* For now be backwards compatible with perl5.6 */
case 9:
if (TerminateProcess(hProcess, sig)) {
remove_dead_process(child);
return 0;
}
break;

blead:

HWND hwnd = w32_pseudo_child_message_hwnds[child];
switch (sig) {
...
default: {
int count = 0;
/* pseudo-process has not yet properly initialized if hwnd isn't set */
while (hwnd == INVALID_HANDLE_VALUE && count < 5) {
/* Yield and wait for the other thread to send us its message_hwnd */
Sleep(0);
win32_async_check(aTHX);
++count;
}
if (hwnd != INVALID_HANDLE_VALUE) {
/* We fake signals to pseudo-processes using Win32
* message queue. In Win9X the pids are negative already. */
if ((hwnd != NULL && PostMessage(hwnd, WM_USER_KILL, sig, 0)) ||
PostThreadMessage(IsWin95() ? pid : -pid, WM_USER_KILL, sig, 0))
{
/* It might be us ... */
PERL_ASYNC_CHECK();
return 0;
}
}
break;
}

This seems to relate to testing whether the thread you're sending a signal to has fully initialised itself yet. Since the test code in question is very likely to be in that situation (fork\, then the parent *immediately* sends an INT to the child)\, this looks interesting.

I then see that the loop which waits for the thread to initialise *will never succeed*\, since it tests hwnd each time round the loop\, but never updates that variable. Presumably it needs a

  hwnd = w32_pseudo_child_message_hwnds[child];

within the loop.

Whether this would fix the problem of kill() returning "Invalid argument"\, I don't know. I also don't know whether the "free to wrong pool" error in the non-debug build is an independent issue.

Dave

-- "The GPL violates the U.S. Constitution\, together with copyright\, antitrust and export control laws"   -- SCO smoking crack again.

p5pRT commented 17 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 17 years ago

From @nwc10

On Thu\, May 03\, 2007 at 10​:39​:23PM +0100\, Dave Mitchell wrote​:

Looking at the chunk of code where it decides to set errno = EINVAL in win32_kill() in win32/win32.c\, I see that it has changed quite substantially between 5.8.7 and bleed (mainly by change #26379)​:

IIRC that change is binary incompatible\, because it adds a member​:

@@ -388,10 +408,11 @@
child_tab * children;
#ifdef USE_ITHREADS
DWORD pseudo_id;
- child_tab * pseudo_children;
+ pseudo_child_tab * pseudo_children;
#endif
void * internal_host;
struct thread_intern thr_intern;
+ HWND message_hwnd;
UINT timerid;
unsigned poll_count;
Sighandler_t sigtable[SIG_SIZE];

in the middle of a structure in a public header file.

Nicholas Clark

p5pRT commented 17 years ago

From @steve-m-hay

Dave Mitchell wrote​:

[I've changed the subject line to match the new bug report filed regarding this old thread]

I then see that the loop which waits for the thread to initialise *will never succeed*\, since it tests hwnd each time round the loop\, but never updates that variable. Presumably it needs a

hwnd = w32\_pseudo\_child\_message\_hwnds\[child\];

within the loop.

Whether this would fix the problem of kill() returning "Invalid argument"\, I don't know. I also don't know whether the "free to wrong pool" error in the non-debug build is an independent issue.

It does indeed fix the kill() problem (and all the tests still pass)\, so I've applied that as #31188. Thanks!

The "Free to wrong pool" error remains\, though\, so I'll leave the bug report open.

--

p5pRT commented 17 years ago

From @jandubois

On Thu\, 10 May 2007\, Steve Hay wrote​:

Dave Mitchell wrote​:

[I've changed the subject line to match the new bug report filed regarding this old thread]

I then see that the loop which waits for the thread to initialise *will never succeed*\, since it tests hwnd each time round the loop\, but never updates that variable. Presumably it needs a

hwnd = w32\_pseudo\_child\_message\_hwnds\[child\];

within the loop.

Whether this would fix the problem of kill() returning "Invalid argument"\, I don't know. I also don't know whether the "free to wrong pool" error in the non-debug build is an independent issue.

It does indeed fix the kill() problem (and all the tests still pass)\, so I've applied that as #31188. Thanks!

The "Free to wrong pool" error remains\, though\, so I'll leave the bug report open.

The error happens because the memory freed by line 823 in perl.c has not been allocated from the shared heap​:

  CopFILE_free(&PL_compiling);

This happens in the child "process"\, not the main one.

Currently this looks like an optimizer bug to me​: When I compile with -Od instead of -O1 the problem goes away. And the various values displayed by the debugger when using the -O1 binary don't make much sense to me. :(

Cheers\, -Jan

p5pRT commented 17 years ago

From @jandubois

On Thu\, 10 May 2007\, Steve Hay wrote​:

It does indeed fix the kill() problem (and all the tests still pass)\, so I've applied that as #31188. Thanks!

The "Free to wrong pool" error remains\, though\, so I'll leave the bug report open.

I just looked at this again and realized that you are calling fork() inside a BEGIN block. I thought creating new threads from inside BEGIN blocks was not supported. Or has this been fixed and is supposed to work now in bleadperl?

Cheers\, -Jan

p5pRT commented 17 years ago

From @jdhedden

Steve Hay wrote​:

It does indeed fix the kill() problem (and all the tests still pass)\, so I've applied that as #31188. Thanks!

The "Free to wrong pool" error remains\, though\, so I'll leave the bug report open.

Jan Dubois wrote​:

I just looked at this again and realized that you are calling fork() inside a BEGIN block. I thought creating new threads from inside BEGIN blocks was not supported. Or has this been fixed and is supposed to work now in bleadperl?

Creating threads in BEGIN blocks does work\, although in some cases it produces innocuous scalar leaked messages.

p5pRT commented 17 years ago

From @crucially

On May 11\, 2007\, at 6​:50 PM\, Jerry D. Hedden wrote​:

Steve Hay wrote​:

It does indeed fix the kill() problem (and all the tests still
pass)\, so I've applied that as #31188. Thanks!

The "Free to wrong pool" error remains\, though\, so I'll leave the bug report open.

Jan Dubois wrote​:

I just looked at this again and realized that you are calling fork() inside a BEGIN block. I thought creating new threads from inside BEGIN blocks was not supported. Or has this been fixed and is
supposed to work now in bleadperl?

Creating threads in BEGIN blocks does work\, although in some cases it produces innocuous scalar leaked messages.

Does it reliably work in windows which has different memory pools for
different threads?

Artur

p5pRT commented 17 years ago

From @steve-m-hay

Jan Dubois wrote​:

On Thu\, 10 May 2007\, Steve Hay wrote​:

It does indeed fix the kill() problem (and all the tests still pass)\, so I've applied that as #31188. Thanks!

The "Free to wrong pool" error remains\, though\, so I'll leave the bug report open.

I just looked at this again and realized that you are calling fork() inside a BEGIN block. I thought creating new threads from inside BEGIN blocks was not supported. Or has this been fixed and is supposed to work now in bleadperl?

I don't know what's supposed to work and what isn't\, but the warning in perlfork currently just says that the forked copy will not continue after the BEGIN block (which kind of implies that it does work apart from that limitation)​:

http​://search.cpan.org/~rgarcia/perl-5.9.4/pod/perlfork.pod#CAVEATS_AND_LIMITATIONS

Of course\, in my particular test script\, the child isn't even trying to continue after the BEGIN block\, so either it should work or else the docs should be updated.

--

p5pRT commented 17 years ago

From @jandubois

On Sat\, 05 May 2007\, Nicholas Clark wrote​:

On Thu\, May 03\, 2007 at 10​:39​:23PM +0100\, Dave Mitchell wrote​:

Looking at the chunk of code where it decides to set errno = EINVAL in win32_kill() in win32/win32.c\, I see that it has changed quite substantially between 5.8.7 and bleed (mainly by change #26379)​:

IIRC that change is binary incompatible\, because it adds a member​:

That is indeed true. I don't think any code outside the core should access the individual members of PL_sys_intern\, so the changes to that structure don't really matter.

A bit more worrying is that any changes to PL_sys_intern will also change the layout of the variables in intrpvar.h\, as PL_sys_intern is included directly as a structure\, and not as a pointer to an additional struct.

However\, this is only a problem when you compile without MULTIPLICITY; otherwise all variables will be accessed by non PERL_CORE code through functions like this​:

  #define PL_sys_intern (*Perl_Isys_intern_ptr(aTHX))

and the in-memory layout of the variables doesn't matter.

So I guess my question is​: should we make PL_sys_intern a pointer to allow for easier maintenance? With the current layout it is impossible to change the PL_sys_intern struct in the maintenance branch if you care about the non-MULTIPLICITY case.

Cheers\, -Jan

@​@​ -388\,10 +408\,11 @​@​ child_tab * children; #ifdef USE_ITHREADS DWORD pseudo_id; - child_tab * pseudo_children; + pseudo_child_tab * pseudo_children; #endif void * internal_host; struct thread_intern thr_intern; + HWND message_hwnd; UINT timerid; unsigned poll_count; Sighandler_t sigtable[SIG_SIZE];

in the middle of a structure in a public header file.

Nicholas Clark

p5pRT commented 17 years ago

From @jandubois

On Tue\, 15 May 2007\, Jan Dubois wrote​:

However\, this is only a problem when you compile without MULTIPLICITY; otherwise all variables will be accessed by non PERL_CORE code through functions like this​:

\#define PL\_sys\_intern           \(\*Perl\_Isys\_intern\_ptr\(aTHX\)\)

and the in-memory layout of the variables doesn't matter.

However\, when not using MULTIPLICITY all symbols are resolved by the loader at runtime and not by the compiler as struct offsets. Doesn't that mean that the old rule to only add new elements to intrpvar.h no longer applies? Or do I just need more sleep/coffee?

Cheers\, -Jan

p5pRT commented 11 years ago

From @bulk88

On Thu May 10 18​:53​:53 2007\, jdb wrote​:

The error happens because the memory freed by line 823 in perl.c has not been allocated from the shared heap​:

CopFILE\_free\(&PL\_compiling\);

This happens in the child "process"\, not the main one.

Currently this looks like an optimizer bug to me​: When I compile with -Od instead of -O1 the problem goes away. And the various values displayed by the debugger when using the -O1 binary don't make much sense to me. :(

Cheers\, -Jan

In 5.17.6 running\, in -Od win32 perl:

#!/usr/bin/perl -w
#use Data::Dumper;
use warnings;
#use strict;
#use Devel::Peek qw(Dump);
#use B;

my $child;

BEGIN {
$child = fork;
die "Fork failed" unless defined $child;
if (!$child) {
$SIG{INT} = sub {
print "Child $$ caught INT: exiting...\n";
exit;
};
sleep 2;
print "Child $$ exiting...\n";
exit;
}
}

print "Parent $$ forked child $child OK\n";
print "Parent $$ killing child $child...\n";
kill 'INT', $child
or print "Parent $$ failed to kill child $child: $!\n";
print "Parent $$ exiting...\n";
exit;

results in the same crash

if (destruct_level == 0) {

DEBUG_P(debprofdump());

#if defined(PERLIO_LAYERS)
/* No more IO - including error messages ! */
PerlIO_cleanup(aTHX);
#endif

>>>>>>>>>>>>>>>>> CopFILE_free(&PL_compiling);

/* The exit() function will do everything that needs doing. */
return STATUS_EXIT;
}

Child thread\, (crashed)

Child thread, (crashed)
____________________________________________________________________
perl517.dll!VMem::Free(void * pMem=0x0096496c) Line 202 + 0x3 C
perl517.dll!CPerlHost::FreeShared(void * ptr=0x0096496c) Line 105 C
perl517.dll!PerlMemSharedFree(IPerlMem * piPerl=0x0090ac70, void *
ptr=0x0096496c) Line 364 C
> perl517.dll!perl_destruct(interpreter * my_perl=0x00934c0c) Line
827 + 0x20 C
perl517.dll!win32_start_child(void * arg=0x00934c0c) Line 1795 + 0x9 C
kernel32.dll!_BaseThreadStart@8() + 0x37
____________________________________________________________________

parent thread
____________________________________________________________________
ntdll.dll!_KiFastSystemCallRet@0()
ntdll.dll!_ZwWaitForMultipleObjects@20() + 0xc
kernel32.dll!_WaitForMultipleObjectsEx@20() - 0x48
kernel32.dll!_WaitForMultipleObjects@16() + 0x18
> perl517.dll!win32_wait_for_children(interpreter * my_perl=0x00345ecc)
Line 1224 C
perl517.dll!perl_destruct(interpreter * my_perl=0x00345ecc) Line
535 + 0x9 C
perl517.dll!RunPerl(int argc=2, char * * argv=0x00342478, char * *
env=0x003451f0) Line 274 + 0x9 C
perl.exe!mainCRTStartup() Line 398 + 0xe C
kernel32.dll!_BaseProcessStart@4() + 0x23

I think this (fork in a BEGIN) is the last remaining issue in this thread since 5.17 is printing "Child -**** caught INT​: exiting..." to console now. The interp struct layout stuff is meaningless now since the interp struct member abstraction/ABI compatibility functions stuff was removed in 5.14.

Console output upto the crash on 5.17.6

Parent 2528 forked child -2392 OK
Parent 2528 killing child -2392...
Parent 2528 exiting...
Child -2392 caught INT: exiting...
Free to wrong pool 345d90 not 90a048 at C:\Documents and
Settings\Owner\Desktop\
42869.pl line 22 during global destruction.

-- bulk88 ~ bulk88 at hotmail.com

toddr commented 4 years ago

This thread has stalled and it looks like it is no longer related to the ticket topic. Should we close and open a new issue? IMO, fork during BEGIN just... sounds like a bad idea. new threads on windows during BEGIN sounds insane

jdhedden commented 4 years ago

Close it

On Wed, Feb 12, 2020, 09:36 Todd Rinaldo notifications@github.com wrote:

This thread has stalled and it looks like it is no longer related to the ticket topic. Should we close an open a new issue? IMO, fork during BEGIN just... sounds like a bad idea. new threads on windows during BEGIN sounds insane

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/8889?email_source=notifications&email_token=ABL4FW42RKDEZKAD3NFFAWLRCQCN5A5CNFSM4KT26EZ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOELQ7SQY#issuecomment-585234755, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABL4FWYMB6NY3ZPYDUPE63DRCQCN5ANCNFSM4KT26EZQ .