R-nvim / R.nvim

Neovim plugin to edit R files
GNU General Public License v3.0

Segmentation Fault in nvimcom When Opening and Closing R Files Without Starting R #209

Closed. caeu closed this issue 1 month ago

caeu commented 1 month ago

I’m encountering a segmentation fault in nvimcom on a remote CentOS 7 server when using Neovim 10.1 with the R.nvim plugin, installed through LazyVim. The crash occurs only when I open an R file in Neovim and then close it without starting an R session, and it leaves a core dump file behind.

I’ve tried reinstalling the plugin with both older and more recent versions of GCC, but the issue persists with every build, although intermittently: on some runs the crash does not occur. Below is the gdb output for the core dump:

GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-120.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver...(no debugging symbols found)...done.
BFD: Warning: /domus/h1/caesar/R_test/core.38807 is truncated: expected core file size >= 2314240, found: 10240.
[New LWP 38807]
Failed to read a valid object file image from memory.
Core was generated by `/domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver'.
Program terminated with signal 11, Segmentation fault.
#0  0x00002b5fc2629b81 in ?? ()

P.S. I also use R.nvim on a Mac with Apple silicon with the exact same setup. I haven't noticed anything unusual there; perhaps something needs to be set to get a similar dump file, or I don't know where to look.

BTW, R.nvim is incredibly feature-rich, works so nicely, and has excellent documentation 🏆.

jalvesaq commented 1 month ago

What is crashing is rnvimserver. Although it is created by R during nvimcom compilation, it's never used by nvimcom itself.

I can't replicate the bug. Could you, please, follow the instructions from https://github.com/R-nvim/R.nvim/wiki/Debugging-C-code-with-Valgrind to add the debug symbols to rnvimserver and run it through Valgrind?

caeu commented 1 month ago

This is the output of /tmp/rnvimserver_valgrind_log and the gdb inspection. I'm not sure why gdb still shows (no debugging symbols found).

==9030== Memcheck, a memory error detector
==9030== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==9030== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==9030== Command: /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver
==9030== Parent PID: 24318
==9030==
==9030== Invalid read of size 4
==9030==    at 0x4E43B81: pthread_cancel (in /usr/lib64/libpthread-2.17.so)
==9030==    by 0x4053BA: stop_server (in /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver)
==9030==    by 0x403EFC: stdin_loop (in /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver)
==9030==    by 0x401458: main (in /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver)
==9030==  Address 0x2d0 is not stack'd, malloc'd or (recently) free'd
==9030==
==9030==
==9030== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==9030==  Access not within mapped region at address 0x2D0
==9030==    at 0x4E43B81: pthread_cancel (in /usr/lib64/libpthread-2.17.so)
==9030==    by 0x4053BA: stop_server (in /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver)
==9030==    by 0x403EFC: stdin_loop (in /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver)
==9030==    by 0x401458: main (in /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver)
==9030==  If you believe this happened as a result of a stack
==9030==  overflow in your program's main thread (unlikely but
==9030==  possible), you can try to increase the size of the
==9030==  main thread stack using the --main-stacksize= flag.
==9030==  The main thread stack size used in this run was 16777216.
==9030==
==9030== HEAP SUMMARY:
==9030==     in use at exit: 1,857,433 bytes in 823 blocks
==9030==   total heap usage: 852 allocs, 29 frees, 1,992,246 bytes allocated
==9030==
==9030== LEAK SUMMARY:
==9030==    definitely lost: 0 bytes in 0 blocks
==9030==    indirectly lost: 0 bytes in 0 blocks
==9030==      possibly lost: 0 bytes in 0 blocks
==9030==    still reachable: 1,857,433 bytes in 823 blocks
==9030==         suppressed: 0 bytes in 0 blocks
==9030== Reachable blocks (those to which a pointer was found) are not shown.
==9030== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==9030==
==9030== For lists of detected and suppressed errors, rerun with: -s
==9030== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

And the output of gdb rnvimserver core_dump:

GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-120.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver...(no debugging symbols found)...done.
[New LWP 9030]
Cannot access memory at address 0x4223128
Cannot access memory at address 0x4223120
Core was generated by `'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000004e43b81 in ?? ()

jalvesaq commented 1 month ago

Still, there are no debugging symbols. Did you manage to run gcc manually and compile with the -g flag? Ideally, the output will tell the exact line of the SIGSEGV.

jalvesaq commented 1 month ago

I'm checking the value of a specific variable in the stop_server function on the c_bugs branch. Could you try it, please?

caeu commented 1 month ago

My bad, I followed the instructions too mechanically and did not run the recompile script. I tried again, and it still didn't work. Now I am getting a different issue and still no debugging symbols. I will try the c_bugs branch next. Here is the output:

==21677== Memcheck, a memory error detector
==21677== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==21677== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==21677== Command: /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver
==21677== Parent PID: 11913
==21677==
--21677-- WARNING: Serious error when reading debug info
--21677-- When reading debug info from /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver:
--21677-- Ignoring non-Dwarf2/3/4 block in .debug_info
--21677-- WARNING: Serious error when reading debug info
--21677-- When reading debug info from /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver:
--21677-- Ignoring non-Dwarf2/3/4 block in .debug_info
--21677-- WARNING: Serious error when reading debug info
--21677-- When reading debug info from /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver:
--21677-- Ignoring non-Dwarf2/3/4 block in .debug_info
--21677-- WARNING: Serious error when reading debug info
--21677-- When reading debug info from /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver:
--21677-- Ignoring non-Dwarf2/3/4 block in .debug_info
--21677-- WARNING: Serious error when reading debug info
--21677-- When reading debug info from /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver:
--21677-- Ignoring non-Dwarf2/3/4 block in .debug_info
--21677-- WARNING: Serious error when reading debug info
--21677-- When reading debug info from /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver:
--21677-- Ignoring non-Dwarf2/3/4 block in .debug_info
--21677-- WARNING: Serious error when reading debug info
--21677-- When reading debug info from /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver:
--21677-- Ignoring non-Dwarf2/3/4 block in .debug_info
--21677-- WARNING: Serious error when reading debug info
--21677-- When reading debug info from /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver:
--21677-- Ignoring non-Dwarf2/3/4 block in .debug_info
--21677-- WARNING: Serious error when reading debug info
--21677-- When reading debug info from /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver:
--21677-- parse_CU_Header: is neither DWARF2 nor DWARF3 nor DWARF4
==21677== Invalid read of size 4
==21677==    at 0x4E43B81: pthread_cancel (in /usr/lib64/libpthread-2.17.so)
==21677==    by 0x40680B: stop_server (in /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver)
==21677==    by 0x405DCB: stdin_loop (in /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver)
==21677==    by 0x405E72: main (in /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver)
==21677==  Address 0x2d0 is not stack'd, malloc'd or (recently) free'd
==21677==
==21677==
==21677== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==21677==  Access not within mapped region at address 0x2D0
==21677==    at 0x4E43B81: pthread_cancel (in /usr/lib64/libpthread-2.17.so)
==21677==    by 0x40680B: stop_server (in /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver)
==21677==    by 0x405DCB: stdin_loop (in /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver)
==21677==    by 0x405E72: main (in /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver)
==21677==  If you believe this happened as a result of a stack
==21677==  overflow in your program's main thread (unlikely but
==21677==  possible), you can try to increase the size of the
==21677==  main thread stack using the --main-stacksize= flag.
==21677==  The main thread stack size used in this run was 16777216.
==21677==
==21677== HEAP SUMMARY:
==21677==     in use at exit: 1,857,433 bytes in 823 blocks
==21677==   total heap usage: 849 allocs, 26 frees, 1,990,726 bytes allocated
==21677==
==21677== LEAK SUMMARY:
==21677==    definitely lost: 0 bytes in 0 blocks
==21677==    indirectly lost: 0 bytes in 0 blocks
==21677==      possibly lost: 0 bytes in 0 blocks
==21677==    still reachable: 1,857,433 bytes in 823 blocks
==21677==         suppressed: 0 bytes in 0 blocks
==21677== Reachable blocks (those to which a pointer was found) are not shown.
==21677== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==21677==
==21677== For lists of detected and suppressed errors, rerun with: -s
==21677== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-120.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver...Dwarf Error: wrong version in compilation unit header (is 5, should be 2, 3, or 4) [in module /domus/h1/caesar/.local/lib/R_lib/R_pkgs/nvimcom/bin/rnvimserver]
(no debugging symbols found)...done.
[New LWP 21677]
Cannot access memory at address 0x4223128
Cannot access memory at address 0x4223120
Core was generated by `'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000004e43b81 in ?? ()

caeu commented 1 month ago

Just tried the c_bugs branch, and the result is the same. I don't know if this helps, but the system is an old CentOS 7 with glibc 2.17, and I'm not sure whether that has an effect. Compiling software on it usually works if all the dependencies are in place, but precompiled binaries usually don't.

jalvesaq commented 1 month ago

@caeu, I believe the bug is fixed in pull request #212.
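
A note on the backtraces in this thread: both Valgrind runs report the invalid read inside pthread_cancel, called from stop_server, at address 0x2d0. An address that small is consistent with pthread_cancel being handed a zero (never initialized) pthread_t and reading a field at a fixed offset from it, which would fit the report that the crash only happens when R was never started. The following is a minimal sketch of a guard against that class of crash, assuming the server thread is created lazily. It is not the code from pull request #212, and start_server_sketch, stop_server_sketch, server_loop, server_tid, and server_running are hypothetical names invented for illustration.

```c
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical state, not taken from rnvimserver's source. */
static pthread_t server_tid;
static bool server_running = false; /* true only after pthread_create succeeded */

/* Stand-in for a long-running server thread. */
static void *server_loop(void *arg)
{
    (void)arg;
    for (;;)
        sleep(1); /* sleep() is a cancellation point, so pthread_cancel can stop us */
    return NULL;
}

/* Start the thread once. Returns 0 on success, -1 if pthread_create failed. */
int start_server_sketch(void)
{
    if (server_running)
        return 0;
    if (pthread_create(&server_tid, NULL, server_loop, NULL) != 0)
        return -1;
    server_running = true;
    return 0;
}

/* Stop the thread only if it was actually created.
 * Returns 1 if a thread was cancelled and joined, 0 if there was nothing to stop. */
int stop_server_sketch(void)
{
    if (!server_running)
        return 0; /* never started: do not pass an uninitialized pthread_t to pthread_cancel */
    pthread_cancel(server_tid);
    pthread_join(server_tid, NULL);
    server_running = false;
    return 1;
}
```

With this guard, calling stop_server_sketch() before start_server_sketch() returns 0 without ever reaching pthread_cancel. Separately, if the compiler in use defaults to DWARF 5, building with something like gcc -g -gdwarf-4 -pthread should keep the debug info readable by older tools such as the gdb 7.6 shown above, which reports that it accepts only DWARF versions 2, 3, or 4.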

caeu commented 1 month ago

Awesome! I confirm. Thanks a million!