CESNET / libyang

YANG data modeling language library
BSD 3-Clause "New" or "Revised" License
364 stars 291 forks source link

lyd_print_fd accesses previously unlinked and deallocated branches #238

Closed milanlenco closed 7 years ago

milanlenco commented 7 years ago

In sysrepo, to validate data tree with cross module dependencies, we temporarily append copies of data of referenced modules using lyd_merge, which, immediately after the validation is finished, get pruned away using this function so that we obtain back the original data tree. However, after the recent changes in libyang there seems to be an issue with printing of such temporarily-expanded trees as reported by valgrind:

32008 ==28783== Invalid read of size 2
32009 ==28783==    at 0x5D2AE34: xml_print_leaf (printer_xml.c:277)
32010 ==28783==    by 0x5D2B81A: xml_print_node (printer_xml.c:462)
32011 ==28783==    by 0x5D2BAE4: xml_print_data (printer_xml.c:522)
32012 ==28783==    by 0x5D0B3FE: lyd_print_ (printer.c:307)
32013 ==28783==    by 0x5D0B518: lyd_print_fd (printer.c:345)
32014 ==28783==    by 0x4D67A2: dm_commit_write_files (data_manager.c:3215)
32015 ==28783==    by 0x4B5183: rp_dt_commit (rp_dt_edit.c:840)
32016 ==28783==    by 0x47D583: rp_commit_req_process (request_processor.c:1493)
32017 ==28783==    by 0x48F8E6: rp_req_dispatch (request_processor.c:3192)
32018 ==28783==    by 0x490CFE: rp_msg_dispatch (request_processor.c:3372)
32019 ==28783==    by 0x491680: rp_worker_thread_execute (request_processor.c:3472)
32020 ==28783==    by 0x54596F9: start_thread (pthread_create.c:333)
32021 ==28783==  Address 0xaad1780 is 64 bytes inside a block of size 72 free'd
32022 ==28783==    at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
32023 ==28783==    by 0x5D06F8D: lyd_free (tree_data.c:4347)
32024 ==28783==    by 0x5D06DD6: lyd_free (tree_data.c:4304)
32025 ==28783==    by 0x4C680E: dm_remove_added_data_trees (data_manager.c:1546)
32026 ==28783==    by 0x4CEF16: dm_validate_session_data_trees (data_manager.c:2350)
32027 ==28783==    by 0x4B4D09: rp_dt_commit (rp_dt_edit.c:818)
32028 ==28783==    by 0x47D583: rp_commit_req_process (request_processor.c:1493)
32029 ==28783==    by 0x48F8E6: rp_req_dispatch (request_processor.c:3192)
32030 ==28783==    by 0x490CFE: rp_msg_dispatch (request_processor.c:3372)
32031 ==28783==    by 0x491680: rp_worker_thread_execute (request_processor.c:3472)
32032 ==28783==    by 0x54596F9: start_thread (pthread_create.c:333)
32033 ==28783==    by 0x606BB5C: clone (clone.S:109)
32034 ==28783==  Block was alloc'd at
32035 ==28783==    at 0xexternal trees4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
32036 ==28783==    by 0x5D0623D: lyd_dup (tree_data.c:4010)
32037 ==28783==    by 0x5D017D2: lyd_merge (tree_data.c:1994)
32038 ==28783==    by 0x4C62FC: dm_append_data_tree (data_manager.c:1505)
32039 ==28783==    by 0x4C6DDA: dm_load_dependant_data (data_manager.c:1583)
32040 ==28783==    by 0x4CEB15: dm_validate_session_data_trees (data_manager.c:2336)
32041 ==28783==    by 0x4B4D09: rp_dt_commit (rp_dt_edit.c:818)
32042 ==28783==    by 0x47D583: rp_commit_req_process (request_processor.c:1493)
32043 ==28783==    by 0x48F8E6: rp_req_dispatch (request_processor.c:3192)
32044 ==28783==    by 0x490CFE: rp_msg_dispatch (request_processor.c:3372)
32045 ==28783==    by 0x491680: rp_worker_thread_execute (request_processor.c:3472)
32046 ==28783==    by 0x54596F9: start_thread (pthread_create.c:333)

It seems that there are some pointers left that point from the original tree to the temporarily added branches from referenced modules. I was thinking, that maybe it would be better if you have provided us with a function that would do something like our dm_remove_added_data_trees, so that you can guard it with your unit tests and keep it in-sync with whatever changes will get introduced to libyang in the future (that will change how pointers are handled etc.).

michalvasko commented 7 years ago

Hi, well, this is exactly the case of printing non-validated data with leafrefs, so validate the tree that is printed to solve the issue. Anyway, I will try to account for this situation in printers so that no invalid memory is accessed and I will remove the invalid link when setting the node's invalid flag, but this behavior was expected.

Regards, Michal

michalvasko commented 7 years ago

Hi, it should be possible to print such data without invalid reads.

milanlenco commented 7 years ago

Hi, thanks for helping us to find a working solution for both sides.

The next problem as we continue with fixing tests is that lyd_print_fd ends in a never-ending cycle:

(gdb) thread apply all bt

Thread 3 (Thread 0x7f1edcf74700 (LWP 14837)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000000000430a16 in watchdog_routine (arg=0x0) at /home/mlenco/dev/sysrepo/tests/helpers/system_helper.c:386
#2  0x00007f1ede3b86fa in start_thread (arg=0x7f1edcf74700) at pthread_create.c:333
#3  0x00007f1edd7f8b5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 2 (Thread 0x7f1edb771700 (LWP 15099)):
#0  lyd_leaf_type (leaf=0x7f1ebc010de0) at /home/mlenco/dev/libyang/src/tree_data.c:5908
#1  0x00007f1eddb5c27e in xml_print_leaf (out=0x7f1edb770b70, level=1, node=0x7f1ebc010de0, toplevel=1, options=3) at /home/mlenco/dev/libyang/src/printer_xml.c:282
#2  0x00007f1eddb5cc48 in xml_print_node (out=0x7f1edb770b70, level=1, node=0x7f1ebc010de0, toplevel=1, options=3) at /home/mlenco/dev/libyang/src/printer_xml.c:462
#3  0x00007f1eddb5cf12 in xml_print_data (out=0x7f1edb770b70, root=0x7f1ebc010de0, options=3) at /home/mlenco/dev/libyang/src/printer_xml.c:522
#4  0x00007f1eddb3c82c in lyd_print_ (out=0x7f1edb770b70, root=0x7f1ebc010de0, format=LYD_XML, options=3) at /home/mlenco/dev/libyang/src/printer.c:307
#5  0x00007f1eddb3c946 in lyd_print_fd (fd=55, root=0x7f1ebc010de0, format=LYD_XML, options=3) at /home/mlenco/dev/libyang/src/printer.c:345
#6  0x00000000004d67a3 in dm_commit_write_files (session=0x7f1ed40013b0, c_ctx=0x7f1ebc010d30) at /home/mlenco/dev/sysrepo/src/data_manager.c:3216
#7  0x00000000004b5184 in rp_dt_commit (rp_ctx=0x1090d80, session=0x7f1ed40011a0, c_ctx=0x0, errors=0x7f1edb770cc0, err_cnt=0x7f1edb770cc8) at /home/mlenco/dev/sysrepo/src/rp_dt_edit.c:840
#8  0x000000000047d584 in rp_commit_req_process (rp_ctx=0x1090d80, session=0x7f1ed40011a0, msg=0x7f1ecc0036b8, skip_msg_cleanup=0x7f1edb770d63) at /home/mlenco/dev/sysrepo/src/request_processor.c:1493
#9  0x000000000048f8e7 in rp_req_dispatch (rp_ctx=0x1090d80, session=0x7f1ed40011a0, msg=0x7f1ecc0036b8, skip_msg_cleanup=0x7f1edb770d63) at /home/mlenco/dev/sysrepo/src/request_processor.c:3192
#10 0x0000000000490cff in rp_msg_dispatch (rp_ctx=0x1090d80, session=0x7f1ed40011a0, msg=0x7f1ecc0036b8) at /home/mlenco/dev/sysrepo/src/request_processor.c:3372
#11 0x0000000000491681 in rp_worker_thread_execute (rp_ctx_p=0x1090d80) at /home/mlenco/dev/sysrepo/src/request_processor.c:3472
#12 0x00007f1ede3b86fa in start_thread (arg=0x7f1edb771700) at pthread_create.c:333
#13 0x00007f1edd7f8b5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 1 (Thread 0x7f1edede6880 (LWP 14836)):
#0  0x00007f1ede3b99cd in pthread_join (threadid=139770507761408, thread_return=0x0) at pthread_join.c:90
#1  0x00000000004930a2 in rp_cleanup (rp_ctx=0x1090d80) at /home/mlenco/dev/sysrepo/src/request_processor.c:3750
#2  0x0000000000568a97 in cm_cleanup (cm_ctx=0x10c1500) at /home/mlenco/dev/sysrepo/src/connection_manager.c:1872
#3  0x000000000051f763 in sr_disconnect (conn_ctx=0x10c2ba0) at /home/mlenco/dev/sysrepo/src/clientlib/client_library.c:474
#4  0x0000000000416a30 in sysrepo_teardown (state=0x10889f0) at /home/mlenco/dev/sysrepo/tests/cl_test.c:76
#5  0x00007f1ede7dc1f9 in cmocka_run_one_test_or_fixture () from /usr/local/lib/libcmocka.so.0
#6  0x00007f1ede7dc529 in cmocka_run_one_tests () from /usr/local/lib/libcmocka.so.0
#7  0x00007f1ede7dc850 in _cmocka_run_group_tests () from /usr/local/lib/libcmocka.so.0
#8  0x000000000042d4d6 in main () at /home/mlenco/dev/sysrepo/tests/cl_test.c:5544
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f1edb771700 (LWP 15099))]
#0  lyd_leaf_type (leaf=0x7f1ebc010de0) at /home/mlenco/dev/libyang/src/tree_data.c:5908
5908        return type;
(gdb) n
5909    }
(gdb) 
xml_print_leaf (out=0x7f1edb770b70, level=1, node=0x7f1ebc010de0, toplevel=1, options=3) at /home/mlenco/dev/libyang/src/printer_xml.c:283
283                 if (!type) {
(gdb) 
288                 datatype = type->base;
(gdb) 
207         switch (datatype) {
(gdb) 
276             iter = (struct lyd_node_leaf_list *)leaf->value.leafref;
(gdb) 
277             while (iter && (iter->value_type == LY_TYPE_LEAFREF)) {
(gdb) 
280             if (!iter) {
(gdb) 
282                 type = lyd_leaf_type((struct lyd_node_leaf_list *)leaf);
(gdb) 
283                 if (!type) {
(gdb) 
288                 datatype = type->base;
(gdb) 
207         switch (datatype) {
(gdb) 
276             iter = (struct lyd_node_leaf_list *)leaf->value.leafref;
(gdb) 
277             while (iter && (iter->value_type == LY_TYPE_LEAFREF)) {
(gdb) 
280             if (!iter) {
(gdb) 
282                 type = lyd_leaf_type((struct lyd_node_leaf_list *)leaf);
(gdb) 
283                 if (!type) {
(gdb) 
288                 datatype = type->base;
(gdb) 
207         switch (datatype) {
(gdb) 
276             iter = (struct lyd_node_leaf_list *)leaf->value.leafref;
(gdb) 
277             while (iter && (iter->value_type == LY_TYPE_LEAFREF)) {
(gdb) 
280             if (!iter) {
(gdb) 
282                 type = lyd_leaf_type((struct lyd_node_leaf_list *)leaf);
(gdb) 
283                 if (!type) {
(gdb) 
288                 datatype = type->base;
(gdb) 
207         switch (datatype) {
(gdb) 
276             iter = (struct lyd_node_leaf_list *)leaf->value.leafref;
...
michalvasko commented 7 years ago

Hi, should be fixed now.

Michal

milanlenco commented 7 years ago

Now we get a segfault in a test that previously passed:

#0  malloc_consolidate (av=av@entry=0x7ffff6eaab20 <main_arena>) at malloc.c:4168
#1  0x00007ffff6b671e8 in _int_free (av=0x7ffff6eaab20 <main_arena>, p=<optimised out>, have_lock=0) at malloc.c:4076
#2  0x00007ffff6b6aabc in __GI___libc_free (mem=<optimised out>) at malloc.c:2969
#3  0x00007ffff6f32e0d in lyxp_exp_free (exp=0x84eb50) at /home/mlenco/dev/libyang/src/xpath.c:67
#4  0x00007ffff6f462c9 in lyxp_eval (expr=0x802140 "ifType != 'atm' or (ifType = 'atm' and ifMTU <= 17966 and ifMTU >= 64)", cur_node=0x84ede0, cur_node_type=LYXP_NODE_ELEM, local_mod=0x80b2e0, set=0x7fffffffd320
, options=1) at /home/mlenco/dev/libyang/src/xpath.c:7347
#5  0x00007ffff6eca58d in resolve_must (node=0x84ede0, inout_parent=0, ignore_fail=0) at /home/mlenco/dev/libyang/src/resolve.c:5398
#6  0x00007ffff6ece818 in resolve_unres_data_item (node=0x84ede0, type=UNRES_MUST, ignore_fail=0) at /home/mlenco/dev/libyang/src/resolve.c:7126
#7  0x00007ffff6ecf18d in resolve_unres_data (unres=0x84f8b0, root=0x854c30, options=16897) at /home/mlenco/dev/libyang/src/resolve.c:7353
#8  0x00007ffff6f317cc in lyd_defaults_add_unres (root=0x854c30, options=16897, ctx=0x7f79f0, data_tree=0x86ba00, act_notif=0x0, unres=0x84f8b0) at /home/mlenco/dev/libyang/src/tree_data.c:6212
#9  0x00007ffff6f2cfd3 in lyd_validate (node=0x854c30, options=16897, var_arg=0x7f79f0) at /home/mlenco/dev/libyang/src/tree_data.c:4229
#10 0x00000000004abcd9 in dm_validate_session_data_trees (dm_ctx=0x7a4f30, session=0x84c5a0, errors=0x7fffffffd658, err_cnt=0x7fffffffd660) at /home/mlenco/dev/sysrepo/src/data_manager.c:2339
#11 0x000000000049161b in rp_dt_commit (rp_ctx=0x79bdb0, session=0x854fe0, c_ctx=0x0, errors=0x7fffffffd658, err_cnt=0x7fffffffd660) at /home/mlenco/dev/sysrepo/src/rp_dt_edit.c:785
#12 0x000000000041e46e in default_nodes_test (state=0x7924b0) at /home/mlenco/dev/sysrepo/tests/rp_datatree_test.c:1784
#13 0x00007ffff7bd2192 in cmocka_run_one_test_or_fixture () from /usr/local/lib/libcmocka.so.0
#14 0x00007ffff7bd2442 in cmocka_run_one_tests () from /usr/local/lib/libcmocka.so.0
#15 0x00007ffff7bd2850 in _cmocka_run_group_tests () from /usr/local/lib/libcmocka.so.0
#16 0x000000000041eaba in main () at /home/mlenco/dev/sysrepo/tests/rp_datatree_test.c:1902
milanlenco commented 7 years ago

The crash above is actually caused by the latest Radek's commits.

michalvasko commented 7 years ago

That is fixed as well now.

milanlenco commented 7 years ago

Thanks. Without Radek's commits all tests pass now. I will let you know what is the status when we include those commits and your fix...

milanlenco commented 7 years ago

All our unit tests are passing now! Thanks for all the help.