Closed feamsr00 closed 5 months ago
Killing the whole process would likely indicate a segfault or some other similar issue.
Seems like you're right! The code above produced an Process exited. Exit code - 0xc0000005
[an Access Violation, but no dump] and the code below...
<<<main test.php>>>
<?php
namespace JobTest;
use parallel\Runtime as Thread_Runtime;
use parallel\Channel as Thread_Channel;
use Closure;
const DEFAULT_CHANNEL_CAPACITY = 1024;
const THREAD_ENVIRONMENT_STUB = __DIR__.DIRECTORY_SEPARATOR.'ThreadBooter.php';
require_once __DIR__.DIRECTORY_SEPARATOR.'ReciprocalPropertyTestClasses.php';
class Agent
{
protected $hThreadRuntime = False;
public function __construct()
{
if (!$this->hThreadRuntime)
{
$this->hThreadRuntime=new Thread_Runtime(THREAD_ENVIRONMENT_STUB);
}
$this->cExecute = function ($oObj)
{
error_reporting(E_ALL);
ini_set('error_log', __DIR__.DIRECTORY_SEPARATOR.'thread-engine.log');
error_log("init");
};
}
public function run($oObj)
{
$oResult = $this->hThreadRuntime->run($this->cExecute, [$oObj]);
return $oResult;
}
}
$oObj1 = new \Obj1;
$oObj2 = new \Obj2;
$oObj1->set($oObj2);
$oAgent = new Agent;
print "Run";
$hThreadFuture = $oAgent->run($oObj1);
var_dump($hThreadFuture->value());
print "Not reachable";
<<<ReciprocalPropertyTestClasses.php>>>
<?php
class Obj1
{
public function set(Obj2 $obj)
{
$obj->set($this);
$this->obj = $obj;
}
}
class Obj2
{
public function set(Obj1 $obj)
{
$this->obj = $obj;
}
}
produced a stack overflow exception:
***********************
* EXCEPTION DETAILS *
***********************
DetailID = 1
Count: 1
Exception #: 0XC00000FD
Stack:
Thread 0 - System ID 15484
Entry point php!sapi_cli_single_write+a000
Create time 7/20/2019 1:20:49 AM
Time spent in user mode 0 Days 00:00:00.531
Time spent in kernel mode 0 Days 00:00:00.359
php_parallel!get_module+0x26d0
php_parallel!get_module+0x2765
php_parallel!get_module+0x2765
php_parallel!get_module+0x2765
php_parallel!get_module+0x2765
php_parallel!get_module+0x2765
<///snip///>
Thread 1 - System ID 16084
Entry point ntdll!TppWorkerThread
Create time 7/20/2019 1:20:49 AM
Time spent in user mode 0 Days 00:00:00.015
Time spent in kernel mode 0 Days 00:00:00.046
Instruction Address
[0x7ff80b4430d4] ntdll!NtWaitForWorkViaWorkerFactory+14
[0x7ff80b3f01f4] ntdll!TppWorkerThread+274
[0x7ff80a147974] kernel32!BaseThreadInitThunk+14
[0x7ff80b40a271] ntdll!RtlUserThreadStart+21
Thread 2 - System ID 14464
Entry point ntdll!TppWorkerThread
Create time 7/20/2019 1:20:49 AM
Time spent in user mode 0 Days 00:00:00.015
Time spent in kernel mode 0 Days 00:00:00.000
Instruction Address
[0x7ff80b4430d4] ntdll!NtWaitForWorkViaWorkerFactory+14
[0x7ff80b3f01f4] ntdll!TppWorkerThread+274
[0x7ff80a147974] kernel32!BaseThreadInitThunk+14
[0x7ff80b40a271] ntdll!RtlUserThreadStart+21
Thread 3 - System ID 14112
Entry point ntdll!TppWorkerThread
Create time 7/20/2019 1:20:49 AM
Time spent in user mode 0 Days 00:00:00.000
Time spent in kernel mode 0 Days 00:00:00.000
Instruction Address
[0x7ff80b4430d4] ntdll!NtWaitForWorkViaWorkerFactory+14
[0x7ff80b3f01f4] ntdll!TppWorkerThread+274
[0x7ff80a147974] kernel32!BaseThreadInitThunk+14
[0x7ff80b40a271] ntdll!RtlUserThreadStart+21
Thread 4 - System ID 9396
Entry point ucrtbase!thread_start
Create time 7/20/2019 1:20:49 AM
Time spent in user mode 0 Days 00:00:00.000
Time spent in kernel mode 0 Days 00:00:00.015
This thread is not fully resolved and may or may not be a problem. Further analysis of these threads may be required.
Instruction Address
[0x7ff80b4402b4] ntdll!NtWaitForMultipleObjects+14
[0x7ff8080dcc7e] KERNELBASE!WaitForMultipleObjectsEx+fe
[0x7ff8080dcb6e] KERNELBASE!WaitForMultipleObjects+e
[0x7fffeb286757] pthreadVC2!ptw32_cancelable_wait+67
[0x7fffeb288cf4] pthreadVC2!sem_timedwait+f4
[0x7fffeb286b78] pthreadVC2!ptw32_cond_timedwait+208
[0x7fffe59396cd] php_parallel!get_module+85bd
[0x7fffe593b3e6] php_parallel!get_module+a2d6
[0x7fffeb2883cc] pthreadVC2!ptw32_threadStart+17c
[0x7ff8076b1ffa] ucrtbase!thread_start<unsigned int (__cdecl*)(void * __ptr64)>+3a
[0x7ff80a147974] kernel32!BaseThreadInitThunk+14
[0x7ff80b40a271] ntdll!RtlUserThreadStart+21
Let me know if I should open a second issue.
I've not needed to collect stacks from php before so I've followed (with limited success) the steps here: https://bugs.php.net/bugs-generating-backtrace-win32.php. Whats the next step?
I tried the preview version of windbg and I think I was able to tease out the trace for the access violation crash....
[0x0] php7ts!zend_hash_find + 0xa
[0x1] php_parallel!get_module + 0x2079
[0x2] php_parallel!get_module + 0x9873
[0x3] php_parallel!get_module + 0x955f
[0x4] php_xdebug_2_7_2_7_3_vc15_x86_64 + 0x71f2
[0x5] php7ts!destroy_op_array + 0x7da
[0x6] php7ts!execute_ex + 0x5f
[0x7] php_pthreads!get_module + 0x42
[0x8] php_xdebug_2_7_2_7_3_vc15_x86_64 + 0x6a20
[0x9] php7ts!libiconv_set_relocation_prefix + 0x3136
[0xa] php7ts!execute_ex + 0x5f
[0xb] php_pthreads!get_module + 0x42
[0xc] php_xdebug_2_7_2_7_3_vc15_x86_64 + 0x6a20
[0xd] php7ts!zend_execute + 0x1a8
[0xe] php7ts!zend_execute_scripts + 0xb9
[0xf] php7ts!php_execute_script + 0x261
[0x10] php!sapi_cli_single_write + 0x1320
[0x11] php!sapi_cli_single_write + 0x2298
[0x12] php!sapi_cli_single_write + 0x9f88
[0x13] KERNEL32!BaseThreadInitThunk + 0x14
[0x14] ntdll!RtlUserThreadStart + 0x21
I think I got a disassembly of faulting instruction... (x64aa)
php7ts!zend_hash_find:
00007ffb`46b364a0 48896c2418 mov qword ptr [rsp+18h], rbp
00007ffb`46b364a5 57 push rdi
00007ffb`46b364a6 4883ec20 sub rsp, 20h
00007ffb`46b364aa 4c8b4208 mov r8, qword ptr [rdx+8] ds:00000000`00000018=????????????????
00007ffb`46b364ae 488bfa mov rdi, rdx
00007ffb`46b364b1 4c8bc9 mov r9, rcx
00007ffb`46b364b4 4d85c0 test r8, r8
00007ffb`46b364b7 0f85cb000000 jne php7ts!zend_hash_find+0xe8 (00007ffb`46b36588)
Is this helpful at all?
According to the documentation, it may not be possible to pass objects as argument https://www.php.net/manual/en/parallel-runtime.run.php
I need a minimal, complete, simple, reproducing test case to investigate.
I've got the same issue with amphp's parallel when using the parallel extension: Sudden SIGSEGV on parallel\Runtime::run(). I'm trying to build something to reproduce.
This seems to be some complex edge case, maybe in combination with a specific extension and current state. I'm not quite sure how to build something to reproduce this behavior. I'll let you know if I can dig up something when I got some spare time. If someone finds a way in the meantime it would be great 👍
I'll reinvestigate soon, I had to re-engineer my project to allow process models. In the meantime, has anyone with parallel installed managed to copy & paste the code I left in the 3rd comment at the top of this issue? It goes in 2 files and depends only on parallel...
Run that with the debug version of php and I see no reason why you shouldn't get the same trace (unless of course it is an environmental issue but I tried that on 2 workstations and a server and they all dumped...)
Try extension with #139 PR, at current moment can't test your scripts. Also, didnt test it on windows
EDIT: Seems like the author already knew about this, see #64
I'll reinvestigate soon, I had to re-engineer my project to allow process models. In the meantime, has anyone with parallel installed managed to copy & paste the code I left in the 3rd comment at the top of this issue? It goes in 2 files and depends only on parallel...
Run that with the debug version of php and I see no reason why you shouldn't get the same trace (unless of course it is an environmental issue but I tried that on 2 workstations and a server and they all dumped...)
I like this test case a lot. I haven't investigated thoroughly, but I highly suspect it has to do with how objects are copied over. Pure speculation for now...
obj1 points to obj2 which has a pointer to obj1
i.e.,
obj1['obj']->obj2 and obj2['obj'] -> obj1
So when a pass-by-value is attempted, you hit a stack overflow if you naively clone objects. E.g., var_dump on obj1 and you'll see PHP knows what it's getting itself into.
I think what needs to happen is to use a reference-table/mapping when doing the copy.
E.g.,
int *Obj1Ptr = &Obj1;
Obj1New = (shallow clone) Obj1
InternalMap[Obj1Ptr] = &Obj1New;
Obj2New = (shallow clone) Obj2
if (InternalMap[&Obj2->Obj]) {
Obj2New->Obj = InternalMap[&Obj2->Obj];
}
I know user land doesn't get to work with this stuff, but I'm not sure where there is handled in the extension.
I'm probably way over my head here, but I also don't see how #139 resolves this
This causes overflow, see #64
[0x1] php_parallel!get_module + 0x2079 […] [0x7] php_pthreads!get_module + 0x42
Running parallel and pthreads simultaneously might not be the best idea.
[0x1] php_parallel!get_module + 0x2079 […] [0x7] php_pthreads!get_module + 0x42
Running parallel and pthreads simultaneously might not be the best idea.
they didnt confilict in internals if you dont start using them both.
Fixed in 67413b193ecda80a32335a76eb25454a4d54bb32
Hello,
Ok, so I've seen some odd error states show up when running some recent thread tests. Under certain conditions we were faced with a script that died in the middle of execution with no indication of what killed it, or for.
This seems to fall over right inside ::run(). This blows the whole process away, not just the threads. Is Parallel swallowing engine errors? Is it possible for Parallel to full emit any engine error encountered? Is this related to #72 ?
PS Crazy thought, this isn't related to the object typehint for cExecute... is it?