duncs / clusterssh

Cluster SSH - Cluster Admin Via SSH
https://github.com/duncs/clusterssh/wiki

Segmentation Fault error #48

Closed lserena1 closed 8 years ago

lserena1 commented 8 years ago

Hi all,

I'm trying to install clusterssh 4.04 on a RHEL 6 server, with

perl-Tk-804.028-12.el6.x86_64 perl-X11-Protocol-0.56-4.el6.noarch

I connect via PuTTY. xterm on its own starts ok.

cssh -e devjenkins01 seems to complete ok.

cssh -u gets me a Segmentation Fault

$ cssh --debug 4 -u
Loading in config file: /home/lserena/.clusterssh/config
Loading in config file: /home/lserena/.clusterssh/config
Looking for xterm
Looking for xterm in /usr/local/bin
Looking for xterm in /bin
Looking for xterm in /usr/bin
Found at /usr/bin/xterm
VERSION: 4.04
Fetching font size
Done with font size
Loading keymaps and keycodes
Unknown keycode 16785456
Unknown keycode 16785482
Unknown keycode 5053
Unknown keycode 5052
Unknown keycode 16785456
Unknown keycode 16777618
Unknown keycode 269024801
Unknown keycode 269024769
Unknown keycode 269024770
Unknown keycode 269024771
Unknown keycode 269024772
Unknown keycode 269024773
Unknown keycode 269024774
Unknown keycode 269024775
Unknown keycode 269024776
Unknown keycode 269024777
Unknown keycode 269024778
Unknown keycode 269024803
Unknown keycode 269024802
Unknown keycode 269024779
Unknown keycode 269024780
Unknown keycode 269024800
Loading in clusters from: /etc/clusters
Reading clusters from file /etc/clusters
No file found to read
Loading in clusters from: /home/lserena/.clusterssh/clusters
Reading clusters from file /home/lserena/.clusterssh/clusters
Loading in config file: /home/lserena/.clusterssh/clusters
two=dc1xjmp02 devjenkins01
Registering tag two: dc1xjmp02 devjenkins01
Loading in tags from: /etc/tags
Reading tags from file /etc/tags
No file found to read
Loading in tags from: /home/lserena/.clusterssh/tags
Reading tags from file /home/lserena/.clusterssh/tags
No file found to read
Tag default is not registered
create_windows: started
Segmentation fault
$

cssh devjenkins01 also gets me a Segmentation Fault

$ cssh --debug 4 devjenkins01
Loading in config file: /home/lserena/.clusterssh/config
Loading in config file: /home/lserena/.clusterssh/config
Looking for xterm
Looking for xterm in /usr/local/bin
Looking for xterm in /bin
Looking for xterm in /usr/bin
Found at /usr/bin/xterm
VERSION: 4.04
Fetching font size
Done with font size
Loading keymaps and keycodes
Unknown keycode 16785456
Unknown keycode 16785482
Unknown keycode 5053
Unknown keycode 5052
Unknown keycode 16785456
Unknown keycode 16777618
Unknown keycode 269024801
Unknown keycode 269024769
Unknown keycode 269024770
Unknown keycode 269024771
Unknown keycode 269024772
Unknown keycode 269024773
Unknown keycode 269024774
Unknown keycode 269024775
Unknown keycode 269024776
Unknown keycode 269024777
Unknown keycode 269024778
Unknown keycode 269024803
Unknown keycode 269024802
Unknown keycode 269024779
Unknown keycode 269024780
Unknown keycode 269024800
Loading in clusters from: /etc/clusters
Reading clusters from file /etc/clusters
No file found to read
Loading in clusters from: /home/lserena/.clusterssh/clusters
Reading clusters from file /home/lserena/.clusterssh/clusters
Loading in config file: /home/lserena/.clusterssh/clusters
two=dc1xjmp02 devjenkins01
Registering tag two: dc1xjmp02 devjenkins01
Loading in tags from: /etc/tags
Reading tags from file /etc/tags
No file found to read
Loading in tags from: /home/lserena/.clusterssh/tags
Reading tags from file /home/lserena/.clusterssh/tags
No file found to read
Resolving cluster names: started
Checking tag devjenkins01
Tag devjenkins01 is not registered
leaving with devjenkins01
Resolving cluster names: completed
create_windows: started
Segmentation fault
$

Can you please advise how to troubleshoot this further?

Thanks a mill

Loris

lserena1 commented 8 years ago

Running

strace -f /usr/local/bin/cssh --debug 4 devjenkins01

ends with:

stat("/usr/share/fonts", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/var/cache/fontconfig/3830d5c3ddfd5cd38a049b759396e72e-le64.cache-3", O_RDONLY) = 7
fstat(7, {st_mode=S_IFREG|0644, st_size=96, ...}) = 0
read(7, "\4\374\2\374\3\0\0\0`\0\0\0\0\0\0\0008\0\0\0\0\0\0\0P\0\0\0\0\0\0\0"..., 96) = 96
close(7) = 0
stat("/usr/share/X11/fonts/Type1", 0x7fff2929b300) = -1 ENOENT (No such file or directory)
stat("/usr/share/X11/fonts/Type1", 0x7fff2929b410) = -1 ENOENT (No such file or directory)
stat("/usr/share/X11/fonts/TTF", 0x7fff2929b300) = -1 ENOENT (No such file or directory)
stat("/usr/share/X11/fonts/TTF", 0x7fff2929b410) = -1 ENOENT (No such file or directory)
stat("/usr/local/share/fonts", 0x7fff2929b300) = -1 ENOENT (No such file or directory)
stat("/usr/local/share/fonts", 0x7fff2929b410) = -1 ENOENT (No such file or directory)
stat("/home/lserena/.fonts", 0x7fff2929b300) = -1 ENOENT (No such file or directory)
stat("/home/lserena/.fonts", 0x7fff2929b410) = -1 ENOENT (No such file or directory)
poll([{fd=6, events=POLLIN|POLLOUT}], 1, -1) = 1 ([{fd=6, revents=POLLOUT}])
writev(6, [{"b\0\4\0\6\0\240\0", 8}, {"RENDER", 6}, {"\0\0", 2}], 3) = 16
poll([{fd=6, events=POLLIN}], 1, -1) = 1 ([{fd=6, revents=POLLIN}])
recvfrom(6, "\1D&\0\0\0\0\0\1\221\0\244\243\243\0\200XY)\2\0\0\0\0\1\0\0\0\304\2\0\0", 4096, 0, NULL, NULL) = 32
recvfrom(6, 0x1950354, 4096, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(6, 0x1950354, 4096, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=6, events=POLLIN|POLLOUT}], 1, -1) = 1 ([{fd=6, revents=POLLOUT}])
writev(6, [{"\221\0\3\0\0\0\0\0\v\0\0\0\221\1\1\0", 16}, {NULL, 0}, {"", 0}], 3) = 16
poll([{fd=6, events=POLLIN}], 1, -1) = 1 ([{fd=6, revents=POLLIN}])
recvfrom(6, "\1\2'\0\0\0\0\0\0\0\0\0\n\0\0\0\0\0\0\0\1\0\0\0\304\2\0\0\30k)\2"..., 4096, 0, NULL, NULL) = 1004
recvfrom(6, 0x1950354, 4096, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(6, 0x1950354, 4096, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(6, 0x1950354, 4096, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(6, 0x1950354, 4096, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
open("/home/lserena/.Xdefaults", O_RDONLY) = 7
fstat(7, {st_mode=S_IFREG|0600, st_size=693, ...}) = 0
read(7, "_background: #101010\n_foreground"..., 693) = 693
close(7) = 0
uname({sys="Linux", node="dc1xjmp01.bloombergpolarlake.com", ...}) = 0
open("/home/lserena/.Xdefaults-dc1xjmp01.bloombergpolarlake.com", O_RDONLY) = 7
fstat(7, {st_mode=S_IFREG|0600, st_size=693, ...}) = 0
read(7, "_background: #101010\n_foreground"..., 693) = 693
close(7) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Segmentation fault
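
For anyone reading a long trace like this, it can help to pull out only the files that were opened successfully shortly before the crash. The following is a rough sketch, not part of ClusterSSH: it assumes the trace was saved to a file (for example with strace's -o option), and the function name last_opens and the file name cssh.trace are made up for illustration.

```shell
# last_opens FILE [N]: print the last N (default 5) paths that were opened
# successfully in an strace log, looking only at lines up to the first SIGSEGV.
# (Hypothetical helper; the name and defaults are illustrative.)
last_opens() {
    trace=$1
    count=${2:-5}
    # 1. Stop reading at the first line that mentions SIGSEGV.
    # 2. Keep open()/openat() calls whose return value is a file descriptor
    #    (failed calls return -1 and are skipped).
    # 3. Print only the first quoted argument, which is the path.
    sed '/SIGSEGV/q' "$trace" |
        grep -E '(open|openat)\(.*\) = [0-9]+' |
        sed -E 's/^[^"]*"([^"]*)".*/\1/' |
        tail -n "$count"
}

# Example (assumed file name):
#   strace -f -o cssh.trace /usr/local/bin/cssh --debug 4 devjenkins01
#   last_opens cssh.trace 3
```

Applied to the trace above, this would list the fontconfig cache file and the two .Xdefaults files, which are the last things read before the fault.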

lserena1 commented 8 years ago

From the core file:

gdb perl ./cssh.core.10914

Core was generated by `perl /usr/local/bin/cssh --debug 4 devjenkins01'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f21cfbc0545 in Tk_AllocFontFromObj () from /usr/lib64/perl5/vendor_perl/auto/Tk/Tk.so

lserena1 commented 8 years ago

After changing the version of Tk to perl-Tk-804.028-2.el6.rf.x86_64 and adding more debuginfo rpms:

gdb perl ./cssh.core.10914

Core was generated by `perl /usr/local/bin/cssh --debug 4 devjenkins01'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f21cfbc0545 in Tk_FontObjCmd (clientData=, interp=0x192be88, objc=, objv=) at tkFont.c:802
802 tkFont.c: No such file or directory.
in tkFont.c

duncs commented 8 years ago

Did you install Tk manually or was it via a package? This looks to be a compilation/library error within the Tk module, so can you download Tk, compile and run all the tests to see what happens?

lserena1 commented 8 years ago

Thanks for your reply, Duncan.

It was an rpm package. So, I've removed it and downloaded the latest perl Tk (Tk-804.033).

perl Makefile.PL completes ok.

make ends with:

gcc -fPIC -c -Wall -O2 -I../zlib pngwio.c
gcc -fPIC -c -Wall -O2 -I../zlib pngwrite.c
gcc -fPIC -c -Wall -O2 -I../zlib pngwtran.c
gcc -fPIC -c -Wall -O2 -I../zlib pngwutil.c
ar rcs libpng.a png.o pngerror.o pngget.o pngmem.o pngpread.o pngread.o pngrio.o pngrtran.o pngrutil.o pngset.o pngtrans.o pngwio.o pngwrite.o pngwtran.o pngwutil.o
: libpng.a
make[2]: Leaving directory `/home/lserena/Tk-804.033/PNG/libpng'
make[2]: Entering directory `/home/lserena/Tk-804.033/PNG/libpng'
gcc -fPIC -c -Wall -O2 -I../zlib pngtest.c
gcc -fPIC -s -L../zlib -o pngtest pngtest.o libpng.a -lz -lm
libpng.a(png.o): In function `png_init_mmx_flags':
png.c:(.text+0x8f): undefined reference to `png_mmx_support'
libpng.a(pngread.o): In function `png_read_row':
pngread.c:(.text+0xc61): undefined reference to `png_combine_row'
pngread.c:(.text+0xc87): undefined reference to `png_combine_row'
pngread.c:(.text+0xd16): undefined reference to `png_combine_row'
pngread.c:(.text+0xd2f): undefined reference to `png_combine_row'
pngread.c:(.text+0xd63): undefined reference to `png_read_filter_row'
pngread.c:(.text+0xd8b): undefined reference to `png_do_read_interlace'
pngread.c:(.text+0xdde): undefined reference to `png_combine_row'
pngread.c:(.text+0xe2a): undefined reference to `png_combine_row'
pngread.c:(.text+0xe5d): undefined reference to `png_combine_row'
pngread.c:(.text+0xe9f): undefined reference to `png_combine_row'
pngread.c:(.text+0xed9): undefined reference to `png_combine_row'
libpng.a(pngread.o):pngread.c:(.text+0xf08): more undefined references to `png_combine_row' follow
collect2: ld returned 1 exit status
make[2]: *** [pngtest] Error 1
make[2]: Leaving directory `/home/lserena/Tk-804.033/PNG/libpng'
make[1]: *** [subdirs] Error 2
make[1]: Leaving directory `/home/lserena/Tk-804.033/PNG'
make: *** [subdirs] Error 2

make test ends with:

make[1]: Leaving directory `/home/lserena/Tk-804.033/Scale'
make[1]: Entering directory `/home/lserena/Tk-804.033/PNG'
cd zlib && make libz.a "CC=gcc -fPIC" RANLIB=":"
make[2]: Entering directory `/home/lserena/Tk-804.033/PNG/zlib'
make[2]: `libz.a' is up to date.
make[2]: Leaving directory `/home/lserena/Tk-804.033/PNG/zlib'
make[2]: Entering directory `/home/lserena/Tk-804.033/PNG/libpng'
gcc -fPIC -s -L../zlib -o pngtest pngtest.o libpng.a -lz -lm
libpng.a(png.o): In function `png_init_mmx_flags':
png.c:(.text+0x8f): undefined reference to `png_mmx_support'
libpng.a(pngread.o): In function `png_read_row':
pngread.c:(.text+0xc61): undefined reference to `png_combine_row'
pngread.c:(.text+0xc87): undefined reference to `png_combine_row'
pngread.c:(.text+0xd16): undefined reference to `png_combine_row'
pngread.c:(.text+0xd2f): undefined reference to `png_combine_row'
pngread.c:(.text+0xd63): undefined reference to `png_read_filter_row'
pngread.c:(.text+0xd8b): undefined reference to `png_do_read_interlace'
pngread.c:(.text+0xdde): undefined reference to `png_combine_row'
pngread.c:(.text+0xe2a): undefined reference to `png_combine_row'
pngread.c:(.text+0xe5d): undefined reference to `png_combine_row'
pngread.c:(.text+0xe9f): undefined reference to `png_combine_row'
pngread.c:(.text+0xed9): undefined reference to `png_combine_row'
libpng.a(pngread.o):pngread.c:(.text+0xf08): more undefined references to `png_combine_row' follow
collect2: ld returned 1 exit status
make[2]: *** [pngtest] Error 1
make[2]: Leaving directory `/home/lserena/Tk-804.033/PNG/libpng'
make[1]: *** [subdirs] Error 2
make[1]: Leaving directory `/home/lserena/Tk-804.033/PNG'
make: *** [subdirs] Error 2

So... what am I missing? Thanks again.

duncs commented 8 years ago

Looking at those errors, what version of the PNG libraries do you have installed? Are they from the standard repositories, too? It almost looks as if you have two different versions of them installed, or the dev package does not match the binary one, because the linker in Tk is not finding code that it believes should be there.

More than that I cannot say

Duncs

lserena1 commented 8 years ago

This is what I would call a pretty standard RHEL 6.6 install. It's a RHEV virtual machine, but this shouldn't matter, right?

As far as I can see, all I have is this:

yum list installed | grep -i png
libpng.x86_64 2:1.2.49-1.el6_2 @rhel-x86_64-server-6

In /usr/lib64, I see:

ls -ltr | grep -i png
-rwxr-xr-x. 1 root root 155456 Apr 18 2012 libpng12.so.0.49.0
-rwxr-xr-x. 1 root root 168616 Apr 18 2012 libpng.so.3.49.0
lrwxrwxrwx. 1 root root 16 Apr 8 2014 libpng.so.3 -> libpng.so.3.49.0
lrwxrwxrwx. 1 root root 18 Apr 8 2014 libpng12.so.0 -> libpng12.so.0.49.0

and they all belong to that "2:libpng-1.2.49-1.el6_2.x86_64" package.

mperry2 commented 8 years ago

Run yum install libpng-devel.x86_64 and then try compiling again.
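
A quick way to spot this situation (runtime library installed, development package missing) is to check the installed package names for a matching -devel entry. This is only a sketch under assumptions: the helper name has_devel is made up, and it assumes RPM-style naming where the development package is called NAME-devel.

```shell
# has_devel NAME: read installed package names (one per line) on stdin and
# succeed if a package called NAME-devel is among them.
# (Hypothetical helper; assumes the RPM "-devel" naming convention.)
has_devel() {
    # -x matches the whole line, -F treats the name as a fixed string,
    # -q suppresses output so only the exit status is used.
    grep -qxF "$1-devel"
}

# Example on an RPM-based system:
#   rpm -qa --qf '%{NAME}\n' | has_devel libpng || echo "libpng-devel is missing"
```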

lserena1 commented 8 years ago

Bingo! Top class, thanks a mill for that Matt.

So now (as root) both Tk and ClusterSSH install flawlessly. I can run cssh ok as root, but when I try as an unprivileged user, I get:

$ cssh devjenkins01 &
[1] 25397
$ Can't locate loadable object for module Tk::Event in @INC (@INC contains: /usr/local/bin/../lib/perl5 /usr/local/bin/../lib /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at /usr/local/lib64/perl5/Tk.pm line 13.
Compilation failed in require at /usr/local/lib64/perl5/Tk.pm line 13.
BEGIN failed--compilation aborted at /usr/local/lib64/perl5/Tk.pm line 13.
Compilation failed in require at /usr/local/share/perl5/App/ClusterSSH.pm line 23.
BEGIN failed--compilation aborted at /usr/local/share/perl5/App/ClusterSSH.pm line 23.
Compilation failed in require at /usr/local/bin/cssh line 8.
BEGIN failed--compilation aborted at /usr/local/bin/cssh line 8.
Undefined subroutine &Tk::Event::CleanupGlue called at /usr/local/lib64/perl5/Tk/Event.pm line 3.
END failed--call queue aborted at /usr/local/bin/cssh line 8.

[1]+ Exit 22 cssh devjenkins01 $

What am I missing this time?

duncs commented 8 years ago

That suggests to me that you built and installed the Tk perl module as root and either the files went into root's home directory, so cssh cannot find them when run as non-root, or the permissions on the files are root-only.

As root, run

perldoc -lm Tk

and then check if your user can read those files and that perl is looking in that directory (look in the perl output above for @INC to see if the directory is listed)
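
The readability check can be sketched with find. This is an illustration rather than anything shipped with ClusterSSH: the function name unreadable_by_others is made up, and the example path assumes the modules landed under /usr/local/lib64/perl5 as the error message above suggests.

```shell
# unreadable_by_others DIR: list entries under DIR that other users cannot use:
# directories lacking o+rx and any other entries lacking o+r.
# (Hypothetical helper name; it inspects mode bits only, not ACLs.)
unreadable_by_others() {
    find "$1" \( -type d ! -perm -o=rx -o ! -type d ! -perm -o=r \) -print
}

# Example (assumed install location, run as root):
#   unreadable_by_others /usr/local/lib64/perl5
# To list the directories perl searches:
#   perl -e 'print "$_\n" for @INC'
```

Empty output means every file and directory under the given path is at least readable by other users.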

lserena1 commented 8 years ago

I did indeed install the modules as root. My problem was that root's umask is set to 0077: the files went to the right place but with overly strict permissions. This is now resolved. Thanks again for your help, Duncan, much appreciated.
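
For readers hitting the same thing, the effect of a umask can be worked out by hand: the bits set in the umask are cleared from the mode a program requests. A minimal sketch of that arithmetic follows; the function name effective_mode is made up for illustration, and the exact modes an installer requests are not shown in this thread.

```shell
# effective_mode REQUESTED UMASK: print the permission bits that result when
# something is created with mode REQUESTED under UMASK (both given in octal).
# (Hypothetical helper; it only does the arithmetic and creates nothing.)
effective_mode() {
    # The leading 0 makes the shell read the arguments as octal numbers.
    printf '%03o\n' $(( 0$1 & ~0$2 & 0777 ))
}

effective_mode 644 077   # prints 600: readable by the owner only
effective_mode 777 077   # prints 700: a directory only the owner can enter
effective_mode 755 022   # prints 755: a more common umask leaves o+rx intact
```

Setting a less restrictive umask (for example umask 022) before installing, or fixing the modes afterwards with chmod, are the usual ways around this.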