Closed dagoodma closed 4 years ago
Hi, can you run strace -e open lmtmetric -m osc and post the last 20 lines? Thanks
I ran strace with lmtmetric -m osc, and it seems like I'm missing some libraries (like cerebro, though yum shows I have cerebro-1.18-1.x86_64 installed). Note that I do have a /proc/fs/lustre directory, and osc/ within it contains links that look correct.
18:52:37 # strace -e open lmtmetric -m osc
open("/usr/lib64/mpich/lib/tls/x86_64/libcerebro.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib64/mpich/lib/tls/libcerebro.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib64/mpich/lib/x86_64/libcerebro.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib64/mpich/lib/libcerebro.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
open("/usr/lib64/libcerebro.so.1", O_RDONLY) = 3
open("/usr/lib64/mpich/lib/libcerebro_error.so.0", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib64/libcerebro_error.so.0", O_RDONLY) = 3
open("/usr/lib64/mpich/lib/liblua-5.1.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib64/liblua-5.1.so", O_RDONLY) = 3
open("/usr/lib64/mpich/lib/libm.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib64/libm.so.6", O_RDONLY) = 3
open("/usr/lib64/mpich/lib/libdl.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib64/libdl.so.2", O_RDONLY) = 3
open("/usr/lib64/mpich/lib/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib64/libc.so.6", O_RDONLY) = 3
open("/etc/lmt/lmt.conf", O_RDONLY) = 3
open("/etc/lmt/rwpasswd", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/proc/fs/lustre/osc", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
lmtmetric: osc metric: No such file or directory
+++ exited with 0 +++
Looking for cerebro libraries... they do exist in /usr/lib64, but not in /usr/lib64/mpich/lib/tls. Same with lua-devel libraries.
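For anyone reading a trace like this one: each failed open() under /usr/lib64/mpich/lib is followed by a successful open() of the same file name under /usr/lib64 or /lib64, which is what a dynamic loader probing its search path in order looks like. A library is only really unresolved if no attempt succeeds. Below is a minimal C sketch of that check over strace lines. The helper names open_record_for and library_resolved are made up for illustration and are not part of lmt or strace, and the parsing assumes the open("path", flags) = result layout shown in the trace.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if `line` is an strace open() record whose
 * path ends in the file name `name`; *ok is set to 1 when the call succeeded. */
static int open_record_for(const char *line, const char *name, int *ok)
{
    const char *start, *end, *base, *ret, *p;
    size_t len;

    if (strncmp(line, "open(\"", 6) != 0)
        return 0;
    start = line + 6;
    end = strchr(start, '"');            /* closing quote of the path */
    if (end == NULL)
        return 0;

    /* Locate the last path component inside the quoted path. */
    base = start;
    for (p = start; p < end; p++)
        if (*p == '/')
            base = p + 1;

    len = (size_t)(end - base);
    if (len != strlen(name) || strncmp(base, name, len) != 0)
        return 0;

    ret = strstr(end, ") = ");           /* start of the return value */
    if (ret == NULL)
        return 0;
    *ok = (ret[4] != '-');               /* failures print "-1 ENOENT ..." */
    return 1;
}

/* Hypothetical helper: a file counts as resolved if any open() of that
 * file name in the trace succeeded, regardless of earlier ENOENT probes. */
static int library_resolved(const char *const *lines, size_t n,
                            const char *name)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int ok = 0;
        if (open_record_for(lines[i], name, &ok) && ok)
            return 1;
    }
    return 0;
}
```

Applied to the trace above, libcerebro.so.1, libcerebro_error.so.0 and liblua-5.1.so would all count as resolved; the only file that is never opened successfully is /etc/lmt/rwpasswd.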
@dagoodma, hello, I have the same problem. Did you solve it?
@ShijunDeng No. I haven't dug into it much, but I will update this issue thread if I make any progress.
I'm having the same issue, but I'm not getting any ENOENT errors. In fact the output of the strace command is fairly minimal:
[root@mds1 ~]# strace -e open lmtmetric -m osc
open("/etc/ld.so.cache", O_RDONLY) = 3
open("/usr/lib64/libcerebro.so.1", O_RDONLY) = 3
open("/usr/lib64/libcerebro_error.so.0", O_RDONLY) = 3
open("/usr/lib64/liblua-5.1.so", O_RDONLY) = 3
open("/lib64/libm.so.6", O_RDONLY) = 3
open("/lib64/libdl.so.2", O_RDONLY) = 3
open("/lib64/libc.so.6", O_RDONLY) = 3
open("/proc/fs/lustre/osc", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
lmtmetric: osc metric: No such file or directory
+++ exited with 0 +++
Along with this, the node that is running the lmt-server package is getting a bunch of errors when attempting to connect to the database:
[root@tillit ~]# tail -f /var/log/messages
Dec 15 13:30:51 tillit /usr/sbin/cerebrod[34864]: strstr: boottime can't be found
Dec 15 13:30:52 tillit /usr/sbin/cerebrod[34864]: lmt_mysql: connected to database
Dec 15 13:30:52 tillit /usr/sbin/cerebrod[34864]: lmt_mysql: blizzard-OST0004: no database
Dec 15 13:30:52 tillit /usr/sbin/cerebrod[34864]: lmt_mysql: blizzard-OST0001: no database
Dec 15 13:30:53 tillit /usr/sbin/cerebrod[34864]: lmt_mysql: blizzard-OST0006: no database
Dec 15 13:30:54 tillit /usr/sbin/cerebrod[34864]: lmt_mysql: blizzard-OST0002: no database
Dec 15 13:30:55 tillit /usr/sbin/cerebrod[34864]: lmt_mysql: blizzard-OST0005: no database
Dec 15 13:30:55 tillit /usr/sbin/cerebrod[34864]: lmt_mysql: blizzard-OST0003: no database
Dec 15 13:30:56 tillit /usr/sbin/cerebrod[34864]: lmt_mysql: blizzard-OST0000: no database
Dec 15 13:30:56 tillit /usr/sbin/cerebrod[34864]: lmt_mysql: blizzard-MDT0000: no database
Dec 15 13:30:57 tillit /usr/sbin/cerebrod[34864]: lmt_mysql: blizzard-OST0004: no database
Does anyone have any idea what might be going on? By the way, we're running CentOS 6.8 and lmt 3.1.8 on Lustre 2.8.
It appears that the links in /proc/fs/lustre/osc/ are discarded in proc.c line 226. Changing
if ((flag & PROC_READDIR_NOFILE) && d->d_type != DT_DIR)
to
if ((flag & PROC_READDIR_NOFILE) && d->d_type != DT_DIR && d->d_type != DT_LNK)
makes lmtmetric return values, but I don't know enough about lmt to say whether this is a safe fix.
MarkW
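To make the effect of that one-line change concrete, here is a small standalone C sketch of the two filter conditions. It is not lmt's actual proc.c. SKETCH_READDIR_NOFILE is a hypothetical stand-in for PROC_READDIR_NOFILE, whose real value is not shown in this thread, and skip_original, skip_patched and count_kept are illustrative names only. It also assumes a filesystem that fills in d_type in readdir() results.

```c
#define _DEFAULT_SOURCE   /* expose the DT_* constants on glibc */
#include <dirent.h>
#include <string.h>

/* Hypothetical stand-in for lmt's PROC_READDIR_NOFILE flag. */
#define SKETCH_READDIR_NOFILE 0x1

/* Condition as quoted from proc.c: skip anything that is not a directory,
 * which also skips symbolic links. */
static int skip_original(int flag, unsigned char d_type)
{
    return (flag & SKETCH_READDIR_NOFILE) && d_type != DT_DIR;
}

/* Proposed condition: directories and symbolic links both pass. */
static int skip_patched(int flag, unsigned char d_type)
{
    return (flag & SKETCH_READDIR_NOFILE) && d_type != DT_DIR
                                          && d_type != DT_LNK;
}

/* Count the entries under `path` that survive the given filter.
 * Returns -1 if the directory cannot be opened. */
static int count_kept(const char *path, int flag,
                      int (*skip)(int, unsigned char))
{
    DIR *dir = opendir(path);
    struct dirent *d;
    int kept = 0;

    if (dir == NULL)
        return -1;
    while ((d = readdir(dir)) != NULL) {
        if (strcmp(d->d_name, ".") == 0 || strcmp(d->d_name, "..") == 0)
            continue;
        if (skip(flag, d->d_type))
            continue;
        kept++;
    }
    closedir(dir);
    return kept;
}
```

If the entries under /proc/fs/lustre/osc are symbolic links, count_kept with skip_original would return 0 for that directory while skip_patched would count them, which would be consistent with lmtmetric finding nothing before the change. The sketch does not resolve the links or check what they point to.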
@6speedlt1 thanks for catching that.
Happy to help.
I'm having trouble getting lmtmetric to work with lustre 2.8.
I see my OST names listed under /proc/fs/lustre/obdfilter, so I'm not sure what's wrong.
Note that lmtmetric -m ost and lmtmetric -m mdt seem to work. I haven't had a chance to try any of this on a Lustre 2.7 system yet.
PS. I built lmt 3.2.2 from source on CentOS 6.8 (2.6.32-642.1.1.el6.x86_64).