Closed stephadz closed 6 years ago
We are encountering the same behaviour with nginx + php-fpm 7.0. Backtrace:
#0 zend_mm_alloc_small (bin_num=<optimized out>, size=32, heap=0x7fcb0f000040) at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_alloc.c:1318
#1 zend_mm_alloc_heap (size=32, heap=0x7fcb0f000040) at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_alloc.c:1389
#2 _emalloc (size=size@entry=32) at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_alloc.c:2477
#3 0x00005590d8299a61 in zend_string_alloc (persistent=<optimized out>, len=4) at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_string.h:121
#4 zend_string_init (persistent=<optimized out>, len=4, str=0x5590d8351dee "file") at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_string.h:157
#5 _zend_hash_str_update (ht=ht@entry=0x7fcaf7b917e0, str=str@entry=0x5590d8351dee "file", len=len@entry=4, pData=pData@entry=0x7ffdbcf72bb0)
at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_hash.c:665
#6 0x00005590d828eeef in zend_symtable_str_update (pData=0x7ffdbcf72bb0, len=4, str=0x5590d8351dee "file", ht=0x7fcaf7b917e0)
at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_hash.h:431
#7 add_assoc_str_ex (arg=arg@entry=0x7ffdbcf72c30, key=key@entry=0x5590d8351dee "file", key_len=key_len@entry=4, str=str@entry=0x7fcaffe55f30)
at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_API.c:1379
#8 0x00005590d82a6adf in zend_fetch_debug_backtrace (return_value=return_value@entry=0x7ffdbcf72cb0, skip_last=skip_last@entry=0, options=options@entry=0,
limit=limit@entry=0) at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_builtin_functions.c:2570
#9 0x00005590d82ac3eb in zend_default_exception_new_ex (class_type=0x7fcaf7acf1c0, skip_top_traces=0) at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_exceptions.c:212
#10 0x00005590d828e9fb in _object_and_properties_init (arg=arg@entry=0x7ffdbcf72d30, class_type=class_type@entry=0x7fcaf7acf1c0, properties=properties@entry=0x0)
at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_API.c:1306
#11 0x00005590d828eb07 in _object_init_ex (arg=arg@entry=0x7ffdbcf72d30, class_type=class_type@entry=0x7fcaf7acf1c0)
at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_API.c:1314
#12 0x00005590d831e567 in ZEND_NEW_SPEC_CONST_HANDLER () at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_vm_execute.h:3378
#13 0x00005590d82cca5b in execute_ex (ex=<optimized out>) at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_vm_execute.h:414
#14 0x00005590d827ca93 in zend_call_function (fci=fci@entry=0x7ffdbcf72ef0, fci_cache=0x7fcaff7c4750, fci_cache@entry=0x7ffdbcf72ec0)
at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_execute_API.c:867
#15 0x00005590d81c7f2b in zif_call_user_func_array (execute_data=0x7fcb0f015580, return_value=0x7fcb0f015560)
at /build/php7.0-82TUuU/php7.0-7.0.31/ext/standard/basic_functions.c:4809
#16 0x00005590d831145d in ZEND_DO_FCALL_BY_NAME_SPEC_HANDLER () at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_vm_execute.h:714
#17 0x00005590d82cca5b in execute_ex (ex=<optimized out>) at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_vm_execute.h:414
#18 0x00005590d8321087 in zend_execute (op_array=0x7fcb0f065380, op_array@entry=0x7fcaff902f78, return_value=return_value@entry=0x7fcb0f0154d0)
at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend_vm_execute.h:458
#19 0x00005590d828c043 in zend_execute_scripts (type=type@entry=8, retval=0x7fcb0f0154d0, retval@entry=0x0, file_count=file_count@entry=3)
at /build/php7.0-82TUuU/php7.0-7.0.31/Zend/zend.c:1445
#20 0x00005590d822af40 in php_execute_script (primary_file=0x7ffdbcf755b0) at /build/php7.0-82TUuU/php7.0-7.0.31/main/main.c:2516
#21 0x00005590d810f434 in main (argc=<optimized out>, argv=<optimized out>) at /build/php7.0-82TUuU/php7.0-7.0.31/sapi/fpm/fpm/fpm_main.c:1967
Downgrading to 1.4.4 resolves this, but since 1.5.x tries to add support for MongoDB 4.0 (which we are using) it sounds a bit risky to downgrade.
I have phpunit tests failing with segfaults and "zend_mm_heap corrupted" errors on every docker container combination of PHP 5.6.35, 5.6.37 and mongodb 1.5.0-1.5.2. Looks like a memory leak but I'll see if I can retrieve a backtrace.
We had a similar issue upgrading mongodb driver to 1.5
I'll try to reproduce it and open a new issue.
I can reliably reproduce the crash using mongodb-1.5.2 (pecl install mongodb-1.5.2
) with our phpunit tests. The backtrace wasn't consistent and just seems to suggest it crashes freeing memory. Valgrind gave me the following output with USE_ZEND_ALLOC=0
:
==332== Invalid write of size 8
==332== at 0xF071C47: php_phongo_field_path_pop (bson.c:164)
==332== by 0xF072143: php_phongo_bson_visit_document (bson.c:956)
==332== by 0xF01609F: bson_iter_visit_all (bson-iter.c:2144)
==332== by 0xF071FF4: php_phongo_bson_visit_document (bson.c:872)
==332== by 0xF01609F: bson_iter_visit_all (bson-iter.c:2144)
==332== by 0xF071FF4: php_phongo_bson_visit_document (bson.c:872)
==332== by 0xF01609F: bson_iter_visit_all (bson-iter.c:2144)
==332== by 0xF071DEC: php_phongo_bson_visit_array (bson.c:989)
==332== by 0xF01609F: bson_iter_visit_all (bson-iter.c:2144)
==332== by 0xF071FF4: php_phongo_bson_visit_document (bson.c:872)
==332== by 0xF01609F: bson_iter_visit_all (bson-iter.c:2144)
==332== by 0xF071FF4: php_phongo_bson_visit_document (bson.c:872)
==332== Address 0xabccda0 is 0 bytes after a block of size 64 alloc'd
==332== at 0x4C28C20: malloc (vg_replace_malloc.c:296)
==332== by 0x4C2AFCF: realloc (vg_replace_malloc.c:692)
==332== by 0x5357A8: ??? (in /usr/local/bin/php)
==332== by 0xF070A35: php_phongo_field_path_ensure_allocation.part.2 (bson.c:124)
==332== by 0xF0710A4: php_phongo_field_path_ensure_allocation (bson.c:143)
==332== by 0xF0710A4: php_phongo_field_path_write_item_at_current_level (bson.c:136)
==332== by 0xF071C1F: php_phongo_field_path_push (bson.c:154)
==332== by 0xF071F6F: php_phongo_bson_visit_document (bson.c:858)
==332== by 0xF01609F: bson_iter_visit_all (bson-iter.c:2144)
==332== by 0xF07225B: php_phongo_bson_to_zval_ex (bson.c:1153)
==332== by 0xF0816DC: php_phongo_cursor_iterator_rewind (Cursor.c:166)
==332== by 0x43B179: spl_iterator_apply (in /usr/local/bin/php)
==332== by 0xF08176A: zim_Cursor_toArray (Cursor.c:309)
@taisph Is there a way how we can try these unit tests too? It looks like it has to do with a specific number of elements in a BSON document read from the server.
@derickr Unfortunately, I can't share the code as it contains a great deal of business logic. I've been trying to isolate the parts necessary to trigger the problem but I haven't had much luck with it yet.
I've tried running the same code with a recompiled PHP 5.6.37 with --enable-debug and get the output below although that doesn't really tell us anything new.
[Tue Sep 4 08:23:08 2018] Script: '/app/vendor/phpunit/phpunit/phpunit'
---------------------------------------
/tmp/pear/temp/mongodb/src/bson.c(110) : Block 0x555557935c08 status:
Beginning: OK (allocated on /tmp/pear/temp/mongodb/src/bson.c:124, 64 bytes)
Start: OK
End: Overflown (magic=0x00000000 instead of 0x59DF37C1)
At least 4 bytes overflown
---------------------------------------
[Tue Sep 4 08:23:08 2018] Script: '/app/vendor/phpunit/phpunit/phpunit'
---------------------------------------
/tmp/pear/temp/mongodb/src/bson.c(113) : Block 0x5555579445f8 status:
Beginning: OK (allocated on /tmp/pear/temp/mongodb/src/bson.c:125, 32 bytes)
Start: OK
End: Overflown (magic=0x00000000 instead of 0x59DF37C1)
At least 4 bytes overflown
---------------------------------------
[Tue Sep 4 08:23:08 2018] Script: '/app/vendor/phpunit/phpunit/phpunit'
---------------------------------------
/tmp/pear/temp/mongodb/src/bson.c(110) : Block 0x555557958398 status:
Beginning: OK (allocated on /tmp/pear/temp/mongodb/src/bson.c:124, 64 bytes)
Start: OK
End: Overflown (magic=0x00000000 instead of 0x59DF37C1)
At least 4 bytes overflown
---------------------------------------
[Tue Sep 4 08:23:08 2018] Script: '/app/vendor/phpunit/phpunit/phpunit'
---------------------------------------
/tmp/pear/temp/mongodb/src/bson.c(113) : Block 0x555557935ff0 status:
Beginning: OK (allocated on /tmp/pear/temp/mongodb/src/bson.c:125, 32 bytes)
Start: OK
End: Overflown (magic=0x00000000 instead of 0x59DF37C1)
At least 4 bytes overflown
---------------------------------------
@taisph Generally, the output of valgrind together with USE_ZEND_ALLOC=0
is a lot more useful that PHP's built-in memory issue detector. As you get a problem report with valgrind, narrowing it down to a single test should be reasonable easy. In order to get enough context information it helps if you add --num-callers=32
to the valgrind
invocation, as I think the valgrind trace doesn't show the whole story right now. It just makes it show more stack frames than the default of 12.
You can use PHPUnit's --debug
method to narrow it down to the right test case. I know the general reason where this bug occurs, but in order to narrow it down and fix it, I'm going to need the exact document and which typemap is being used with it. The smallest case I can envision is:
<?php
$jsonDoc = "{ ... }"; // the document that you're testing with, in JSON form
$bsonDoc = MongoDB\BSON\fromJSON($jsonDoc);
$typeMap = [ … ]; // typemap that you're using;
var_dump(MongoDB\BSON\toPHP($bsonDoc, $typeMap));
?>
I'll continue to think of other ways on how to find the problem, but as always, having a local reproducible case is the best.
@derickr I'm already running valgrind with num-callers=30 and I know exactly which test triggers the problem, but I have not been able to trigger the error reproducing the mongo-related operations isolated.
I did manage to capture a full backtrace which I am poking around in right now.
First two frames:
#0 0x000000000f0e5903 in php_phongo_field_path_pop (field_path=0x1231af60) at /tmp/pear/temp/mongodb/src/bson.c:164
No locals.
#1 0x000000000f0e71bc in php_phongo_bson_visit_document (iter=0xffeffb880, key=0x1229a756 "StreetCode", v_document=0xffeffb680, data=0xffeffb810) at /tmp/pear/temp/mongodb/src/bson.c:956
state = {zchild = 0x1231e5b0, map = {document_type = PHONGO_TYPEMAP_NONE, document = 0x0, array_type = PHONGO_TYPEMAP_NONE, array = 0x0, root_type = PHONGO_TYPEMAP_NONE, root = 0x0, field_paths = {map = 0x0, allocated_size = 0, size = 0}}, odm = 0x0, is_visiting_array = false, field_path = 0x0}
retval = 0x1231e2f0
child = {raw = 0x0, len = 0, off = 4, type = 4, key = 5, d1 = 0, d2 = 0, d3 = 0, d4 = 0, next_off = 0, err_off = 0, value = {value_type = 4278171216, padding = 15, value = {v_oid = {bytes = "\030\347\006\017\000\000\000\000P\266\377\376"}, v_int64 = 252110616, v_int32 = 252110616, v_int8 = 24 '\030', v_double = 1.245591943174745e-315, v_bool = 24, v_datetime = 252110616, v_timestamp = {timestamp = 252110616, increment = 0}, v_utf8 = {str = 0xf06e718 <bson_utf8_validate+140> "\017\266E\362\204\300u\n\270", len = 4278171216}, v_doc = {data = 0xf06e718 <bson_utf8_validate+140> "\017\266E\362\204\300u\n\270", data_len = 4278171216}, v_binary = {data = 0xf06e718 <bson_utf8_validate+140> "\017\266E\362\204\300u\n\270", data_len = 4278171216, subtype = 15}, v_regex = {regex = 0xf06e718 <bson_utf8_validate+140> "\017\266E\362\204\300u\n\270", options = 0xffeffb650 "\340\267\377\376\017"}, v_dbpointer = {collection = 0xf06e718 <bson_utf8_validate+140> "\017\266E\362\204\300u\n\270", collection_len = 4278171216, oid = {bytes = "\017\000\000\000\005\000\000\000\000\000\000"}}, v_code = {code = 0xf06e718 <bson_utf8_validate+140> "\017\266E\362\204\300u\n\270", code_len = 4278171216}, v_codewscope = {code = 0xf06e718 <bson_utf8_validate+140> "\017\266E\362\204\300u\n\270", scope_data = 0xffeffb650 "\340\267\377\376\017", code_len = 5, scope_len = 0}, v_symbol = {symbol = 0xf06e718 <bson_utf8_validate+140> "\017\266E\362\204\300u\n\270", len = 4278171216}, v_decimal128 = {low = 252110616, high = 68702680656}}}}
parent_state = 0xffeffb810
Code at mongodb/src/bson.c:164
is:
field_path->elements[field_path->size] = NULL;
Inspecting:
(gdb) frame
#0 0x000000000f0e5903 in php_phongo_field_path_pop (field_path=0x1231af60) at /tmp/pear/temp/mongodb/src/bson.c:164
164 in /tmp/pear/temp/mongodb/src/bson.c
(gdb) p *field_path
$46 = {elements = 0x12337a50, element_types = 0x12332630, allocated_size = 8, size = 8, ref_count = 8, owns_elements = false}
(gdb) p (*field_path)->elements[8]
$47 = 0x0
I'm glad you can reliably reproduce it. Now we "just" need to step so I can too :-)
First some questions:
--filter
)?It is possible to dump the BSON document from within GDB. But for that, you need the GIT repo of the PHP driver cloned. When you are in GDB, in the same place as you are in your latest comment, go up the stack until you hit the php_phongo_bson_to_zval_ex
function. When there, run the following:
source /path/to/cloned/git/repo/src/libmongoc/.gdbinit
printbson b
This should dump the whole BSON document to stdout — you might have to use printbson *b
instead.
If you prefer, we can take this to IRC or Google Hangouts this afternoon for a more hands-on debugging approach.
We use a simple typemap. And it happens regardless of running the test directly or the full suites. For this I am running it directly by specifying the php-file as an argument to phpunit.
The printbson output is below.
The data in the document is from an anonymized test case provided by the remote API vendor so don't worry about it looking real.
(gdb) printbson b
ALLOC [0x144f8030 + 0] (len=2139)
{
'lastErrorObject' : {
'updatedExisting' : false,
'n' : NumberInt("1"),
'upserted' : ObjectID("5B8E4ABE8AF3ED00255D5B00")
},
'value' : {
'_id' : ObjectID("5B8E4ABE8AF3ED00255D5B00"),
'message_id' : "Sess_Id-20150904-133640.0843",
'document_type' : "NotifyMasterDataConsumer",
'status' : null,
'created_date' : UTCDateTime(1441373760000),
'business_process' : "E03",
'ean' : "571313188812345000",
'message_reference' : "ENDK_ATS-2015-09-04T13-36-40.0874",
'session_id' : null,
'delivery_id' : null,
'payload' : {
'HeaderEnergyDocument' : {
'Identification' : "MsgId-DatahubP-20150904-133640.0843",
'DocumentType' : "E21",
'Creation' : "2015-09-04T13:36:00.00Z",
'SenderEnergyParty' : {
'Identification' : "5790001330569"
},
'RecipientEnergyParty' : {
'Identification' : "5790002263057"
}
},
'ProcessEnergyContext' : {
'EnergyBusinessProcess' : "E03",
'EnergyBusinessProcessRole' : "DDQ",
'EnergyIndustryClassification' : "23"
},
'PayloadMasterDataMeteringPointPartyEvent' : {
'Identification' : "Sess_Id-20150904-133640.0843",
'Occurrence' : "2015-09-18T22:00:00.00Z",
'MeteringPointPartyDetailMeteringPointPartyCharacteristic' : {
'ElectricalHeating' : "false",
'WebAccessCode' : "dadj37898912",
'ConsumerCategory' : "119",
'AdministrativePartyMPAdministrativeParty' : [
'0' : {
'Name1' : "Briger Jensen",
'Name2' : "Thea Jessen",
'MPRelationType' : "D01",
'AdministrativePartyAddressLocationAddress' : {
'StreetName' : "Vestergade",
'StreetCode' : {
} 'BuildingNumber' : "23",
'FloorIdentification' : {
} 'RoomIdentification' : {
} 'CitySubDivisionName' : {
} 'Postcode' : "5800",
'CityName' : "Nyborg",
'MunicipalityCode' : {
} 'CountryName' : "DK"
},
'SameAsMPAddress' : "false",
'Phone' : "7672390198",
'Mobile' : "40231375",
'Email' : "b_og_t_je@5800vestergade.dk"
},
'1' : {
'Name1' : {
} 'Name2' : {
} 'MPRelationType' : "D03",
'SameAsMPAddress' : "true"
},
'2' : {
'Name1' : {
} 'Name2' : {
} 'MPRelationType' : "D02",
'SameAsMPAddress' : "true"
}
],
'FirstConsumerConsumerParty' : {
'Name' : "Hans Jensen"
},
'SecondConsumerConsumerParty' : {
'Name' : "Lise Pedersen"
},
'HasBalanceSupplier' : "true",
'SupplyStart' : "2015-09-18T22:00:00.00Z"
},
'MeteringPointDomainLocation' : {
'Identification' : "571313188812345000"
}
}
},
'datahub_message_id' : "MsgId-DatahubP-20150904-133640.0843",
'datahub_session_id' : "Sess_Id-20150904-133640.0843",
'additional_information' : null,
'comment' : null
},
'ok' : 1.000000
}
As far as I understand it, it seems to be the empty object at "StreetCode" that confuses it.
(gdb) p (*field_path)->elements[0]
$49 = 0x1229a2c3 "value"
(gdb) p (*field_path)->elements[1]
$50 = 0x1229a3de "payload"
(gdb) p (*field_path)->elements[2]
$51 = 0x1229a573 "PayloadMasterDataMeteringPointPartyEvent"
(gdb) p (*field_path)->elements[3]
$52 = 0x1229a5fa "MeteringPointPartyDetailMeteringPointPartyCharacteristic"
(gdb) p (*field_path)->elements[4]
$53 = 0x1229a68f "AdministrativePartyMPAdministrativeParty"
(gdb) p (*field_path)->elements[5]
$54 = 0x1229a6bd "0"
(gdb) p (*field_path)->elements[6]
$55 = 0x1229a70c "AdministrativePartyAddressLocationAddress"
(gdb) p (*field_path)->elements[7]
$56 = 0x1229a756 "StreetCode"
(gdb) p (*field_path)->elements[8]
$57 = 0x0
@taisph — I've been able to reproduce this with the example data. It doesn't crash on its own, but valgrind does give the warning, which I guess in some cases leads to a crash. I've created a Jira ticket @ https://jira.mongodb.org/browse/PHPC-1266 to track the work on this.
The full (smallest) case I could manage was:
<?php
$a = <<<ENDJSON
{
"value" : {
"payload" : {
"PayloadMasterDataMeteringPointPartyEvent" : {
"MeteringPointPartyDetailMeteringPointPartyCharacteristic" : {
"AdministrativePartyMPAdministrativeParty" : [
{
"AdministrativePartyAddressLocationAddress" : {
"StreetCode" : {
}
}
}
]
}
}
}
}
}
ENDJSON;
$bson = MongoDB\BSON\fromJSON($a);
var_dump(MongoDB\BSON\toPHP($bson));
And run with:
USE_ZEND_ALLOC=0 valgrind php test.php
This has now been fixed through #924
10 days after what I would call a major bug we have no new release? How much longer do we need to wait for 1.5.3?
As it's only supported with the MongoDB 4.0 according to MongoDB Compatibility matrix: https://docs.mongodb.com/ecosystem/drivers/php/#compatibility
Yes, we can take the fix and patch it and etc, but it's not really as nice as just changing the version number, using patched version requires extra work.
So please, release 1.5.3 or at least some 1.5.2.1 or something that has this bug fixed.
How much longer do we need to wait for 1.5.3?
We're expecting a 1.5.3 release on Wednesday, September 19.
Description
After upgrading mongodb driver from 1.4.4 to 1.5.0 cause segmentation fault on apache docker (1% of requests)
[core:notice] [pid 1] AH00052: child pid 20 exit signal Segmentation fault (11)
The downgrade to 1.4.4 resolve the issue.
Environment
Dockerfile
Test Script
Sorry but we're currently unable to retrieve backtrace or more logs..... We will continue to investigate, and we will post more information later.
Debug Log