openswoole / ext-openswoole

Programmatic server for PHP with async IO, coroutines and fibers
https://openswoole.com
Apache License 2.0

malloc: *** error for object 0x7f9c5ff9a940: pointer being freed was not allocated #15

Closed apinstein closed 2 years ago

apinstein commented 2 years ago

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a simple script for reproducing the error.

I ran into an issue in a real application of mine, and managed to reduce it to a reproducible script:

https://github.com/apinstein/swoole-utils/blob/main/examples/cause-swoole-malloc.php

You should be able to just download that repo, run composer install, and then run php examples/cause-swoole-malloc.php

  2. What did you expect to see?

No crash :)

  3. What did you see instead?

php(93377,0x10b853e00) malloc: *** error for object 0x7f9c5ff9a940: pointer being freed was not allocated
php(93377,0x10b853e00) malloc: *** set a breakpoint in malloc_error_break to debug
zsh: abort php examples/crash.php

Occurs after about 10 minutes.

  4. What version of Swoole are you using (show your php --ri swoole)?
[ git@main ]:☹ 1> php --ri swoole

swoole

Swoole => enabled
Author => Swoole Team <team@swoole.com>
Version => 4.7.1
Built => Aug 27 2021 10:56:54
coroutine => enabled with boost asm context
kqueue => enabled
rwlock => enabled
openssl => OpenSSL 1.1.1l  24 Aug 2021
dtls => enabled
http2 => enabled
pcre => enabled
zlib => 1.2.11
brotli => E16777225/D16777225
async_redis => enabled

Directive => Local Value => Master Value
swoole.enable_coroutine => On => On
swoole.enable_library => On => On
swoole.enable_preemptive_scheduler => Off => Off
swoole.display_errors => On => On
swoole.use_shortname => On => On
swoole.unixsock_buffer_size => 262144 => 262144

  5. What is your machine environment used (show your uname -a & php -v & gcc -v)?
[ git@main ]:☺ > uname -a      
Darwin Alans-MacBook-Pro-2.local 20.6.0 Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:31 PDT 2021; root:xnu-7195.141.2~5/RELEASE_X86_64 x86_64

[ git@main ]:☺ > php -v       
PHP 8.0.10 (cli) (built: Aug 27 2021 10:07:52) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.10, Copyright (c) Zend Technologies
    with Zend OPcache v8.0.10, Copyright (c), by Zend Technologies

# This is using a prebuilt PHP binary from MacPorts
Build System => Darwin bigsurx.internal.macports.net 20.6.0 Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:31 PDT 2021; root:xnu-7195.141.2~5/RELEASE_X86_64 x86_64

[]:☺ > gcc -v
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 12.0.5 (clang-1205.0.22.11)
Target: x86_64-apple-darwin20.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

apinstein commented 2 years ago

Was able to get it running in VSCode w/lldb and put a breakpoint in malloc_error_break...

I can repro this on-demand now.

[image]

I also thought it might be related to my using Co::sleep(), which apparently isn't recommended if SWOOLE_HOOK_SLEEP is enabled. It seemed plausible that using Co::sleep() instead of native sleep meant that this block of code could be skipped in some circumstances:

From ext-src/swoole_runtime.cc:

```cpp
    if (Coroutine::get_current()) {
        RETURN_LONG(System::sleep((double) num) < 0 ? num : 0);
    } else {
        RETURN_LONG(php_sleep(num));
    }
```

But the bug still reproduces when only native sleep functions are used.

My program routinely fails after 30-120 minutes of runtime. In fact, it has not run longer than that time frame for many months due to this error.

Any ideas appreciated, happy to try to run it down further.

doubaokun commented 2 years ago

@apinstein is it possible to provide a single file which can be used to reproduce the issue?

apinstein commented 2 years ago

I wish I did. At one point I had a script that would cause it but it is not reproducing frequently.

Do you have any suspicions about what the bug might be from the breakpoint and other debug info? That might help me develop a reduced & reproducible test case to share.


On Oct 23, 2021, at 3:02 PM, Bruce Dou wrote:

> @apinstein is it possible to provide a single file which can be used to reproduce the issue?


doubaokun commented 2 years ago

@apinstein it could be caused by misusing locks with coroutines, by variables being modified unintentionally for some other reason, or by bugs in an underlying layer. If you can provide a single, minimal PHP file, it will help identify the issue.

doubaokun commented 2 years ago

I had a chance to run your script on Ubuntu with the OpenSwoole master branch and couldn't reproduce this issue. Feel free to reopen this if you can provide a simple script that reproduces it.

apinstein commented 2 years ago

Ok thank you


On Jan 13, 2022, at 5:00 PM, Bruce Dou wrote:

> Had a chance to run your script on Ubuntu & OpenSwoole master branch and can't reproduce this issue. Feel free to open this if you can provide a simple and easy script to reproduce.
