Closed ldd91 closed 5 years ago
Hi @ldd91 , I have not tested lmt 3.2.6 against Lustre 2.12.2, so something important may have changed. Is it possible for you to post the output on stderr of
strace -e open,stat,fstat lmtmetric -m ost
in the ticket?
This is the output
[root@atlantic-221 ~]# strace -e open,stat,fstat lmtmetric -m ost open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=54517, ...}) = 0 open("/lib64/tls/x86_64/libcerebro.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) stat("/lib64/tls/x86_64", 0x7ffed96f2860) = -1 ENOENT (No such file or directory) open("/lib64/tls/libcerebro.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) stat("/lib64/tls", {st_mode=S_IFDIR|0555, st_size=6, ...}) = 0 open("/lib64/x86_64/libcerebro.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) stat("/lib64/x86_64", 0x7ffed96f2860) = -1 ENOENT (No such file or directory) open("/lib64/libcerebro.so.1", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0755, st_size=91936, ...}) = 0 open("/lib64/tls/libcerebro_error.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/lib64/libcerebro_error.so.0", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0755, st_size=11368, ...}) = 0 open("/lib64/liblua-5.1.so", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0755, st_size=193864, ...}) = 0 open("/lib64/libm.so.6", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0755, st_size=1137016, ...}) = 0 open("/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0755, st_size=19288, ...}) = 0 open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0755, st_size=2151672, ...}) = 0 stat("/sys/fs/lustre/obdfilter", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0 stat("/etc/sysconfig/64bit_strstr_via_64bit_strstr_sse2_unaligned", 0x7ffed96e35b0) = -1 ENOENT (No such file or directory) stat("/sys/stat", 0x7ffed96e3e20) = -1 ENOENT (No such file or directory) open("/proc/stat", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 stat("/sys/meminfo", 0x7ffed96e3f10) = -1 ENOENT (No such file or directory) open("/proc/meminfo", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 
stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/uuid", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/uuid", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/stats", 0x7ffed96e2ec0) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0000/stats", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/brw_stats", 0x7ffed96e2e80) = -1 ENOENT (No such file or directory) lmtmetric: error reading lustre ib-lfs-OST0000 brw_stats: No such file or directory stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/filesfree", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/filesfree", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/filestotal", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/filestotal", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/kbytesfree", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 
open("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/kbytesfree", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/kbytestotal", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/kbytestotal", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/num_exports", 0x7ffed96e2e90) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0000/num_exports", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0000_UUID/lock_count", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0000_UUID/lock_count", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0000_UUID/pool/grant_rate", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0000_UUID/pool/grant_rate", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0000_UUID/pool/cancel_rate", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0000_UUID/pool/cancel_rate", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0000/recovery_status", 0x7ffed96e2ea0) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0000/recovery_status", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/uuid", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/uuid", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 
open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/stats", 0x7ffed96e2ec0) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0002/stats", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/brw_stats", 0x7ffed96e2e80) = -1 ENOENT (No such file or directory) lmtmetric: error reading lustre ib-lfs-OST0002 brw_stats: No such file or directory stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/filesfree", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/filesfree", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/filestotal", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/filestotal", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/kbytesfree", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/kbytesfree", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/kbytestotal", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/kbytestotal", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, 
...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/num_exports", 0x7ffed96e2e90) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0002/num_exports", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0002_UUID/lock_count", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0002_UUID/lock_count", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0002_UUID/pool/grant_rate", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0002_UUID/pool/grant_rate", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0002_UUID/pool/cancel_rate", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0002_UUID/pool/cancel_rate", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0002/recovery_status", 0x7ffed96e2ea0) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0002/recovery_status", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/uuid", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/uuid", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 
open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/stats", 0x7ffed96e2ec0) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0004/stats", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/brw_stats", 0x7ffed96e2e80) = -1 ENOENT (No such file or directory) lmtmetric: error reading lustre ib-lfs-OST0004 brw_stats: No such file or directory stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/filesfree", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/filesfree", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/filestotal", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/filestotal", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/kbytesfree", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/kbytesfree", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/kbytestotal", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/kbytestotal", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/num_exports", 0x7ffed96e2e90) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0004/num_exports", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0004_UUID/lock_count", {st_mode=S_IFREG|0444, 
st_size=4096, ...}) = 0 open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0004_UUID/lock_count", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0004_UUID/pool/grant_rate", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0004_UUID/pool/grant_rate", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0004_UUID/pool/cancel_rate", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0004_UUID/pool/cancel_rate", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0004/recovery_status", 0x7ffed96e2ea0) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0004/recovery_status", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/uuid", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/uuid", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/stats", 0x7ffed96e2ec0) = 
-1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0006/stats", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/brw_stats", 0x7ffed96e2e80) = -1 ENOENT (No such file or directory) lmtmetric: error reading lustre ib-lfs-OST0006 brw_stats: No such file or directory stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/filesfree", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/filesfree", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/filestotal", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/filestotal", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/kbytesfree", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/kbytesfree", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/kbytestotal", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/kbytestotal", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/num_exports", 0x7ffed96e2e90) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0006/num_exports", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0006_UUID/lock_count", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0006_UUID/lock_count", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0006_UUID/pool/grant_rate", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 
open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0006_UUID/pool/grant_rate", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0006_UUID/pool/cancel_rate", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0006_UUID/pool/cancel_rate", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0006/recovery_status", 0x7ffed96e2ea0) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0006/recovery_status", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/uuid", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/uuid", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/version", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/version", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/stats", 0x7ffed96e2ec0) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0008/stats", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/brw_stats", 0x7ffed96e2e80) = -1 ENOENT (No such file or directory) lmtmetric: error reading 
lustre ib-lfs-OST0008 brw_stats: No such file or directory stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/filesfree", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/filesfree", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/filestotal", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/filestotal", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/kbytesfree", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/kbytesfree", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/kbytestotal", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/kbytestotal", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/num_exports", 0x7ffed96e2e90) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0008/num_exports", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0008_UUID/lock_count", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0008_UUID/lock_count", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0008_UUID/pool/grant_rate", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0008_UUID/pool/grant_rate", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0008_UUID/pool/cancel_rate", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 
open("/sys/fs/lustre/ldlm/namespaces/filter-ib-lfs-OST0008_UUID/pool/cancel_rate", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 stat("/sys/fs/lustre/obdfilter/ib-lfs-OST0008/recovery_status", 0x7ffed96e2ea0) = -1 ENOENT (No such file or directory) open("/proc/fs/lustre/obdfilter/ib-lfs-OST0008/recovery_status", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0 ost: 2;atlantic-221.unisound.ai;0.568956;98.805783;ib-lfs-OST0000;89466705;89603584;90917613464;90983835132;649351938048;624168555032;0;4;72;0;0;0;0;COMPLETE 1/1 0s remaining;ib-lfs-OST0002;89466688;89603584;90936891912;90983835132;382725885952;252159491847;0;4;49;0;0;0;0;INACTIVE 0s remaining;ib-lfs-OST0004;89466701;89603584;90869858836;90983835132;368264495104;322765724959;0;4;61;0;0;0;0;INACTIVE 0s remaining;ib-lfs-OST0006;89466711;89603584;90916900060;90983835132;296974934016;1327194620002;0;4;58;0;0;0;0;INACTIVE 0s remaining;ib-lfs-OST0008;89466719;89603584;90863260060;90983835132;876284801024;675902515283;0;4;33;0;0;0;0;COMPLETE 0/1 0s remaining; +++ exited with 0 +++
Are you sure you didn't make a typo in your original testing?
I ask because it looks like this succeeded:
[root@atlantic-221 ~]# strace -e open,stat,fstat lmtmetric -m ost
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
<redacted>
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
ost: 2;atlantic-221.unisound.ai;0.568956;98.805783;ib-lfs-OST0000;89466705;89603584;90917613464;90983835132;649351938048;624168555032;0;4;72;0;0;0;0;COMPLETE 1/1 0s remaining;ib-lfs-OST0002;89466688;89603584;90936891912;90983835132;382725885952;252159491847;0;4;49;0;0;0;0;INACTIVE 0s remaining;ib-lfs-OST0004;89466701;89603584;90869858836;90983835132;368264495104;322765724959;0;4;61;0;0;0;0;INACTIVE 0s remaining;ib-lfs-OST0006;89466711;89603584;90916900060;90983835132;296974934016;1327194620002;0;4;58;0;0;0;0;INACTIVE 0s remaining;ib-lfs-OST0008;89466719;89603584;90863260060;90983835132;876284801024;675902515283;0;4;33;0;0;0;0;COMPLETE 0/1 0s remaining;
+++ exited with 0 +++
Note that the command exited with 0 (success) and shows output for 5 OSTs: ib-lfs-OST000{0,2,4,6,8}.
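A quick way to confirm which OSTs appear in the metric record is to pull the target names out of the lmtmetric output. The sketch below is only an illustration: the helper name ost_names is hypothetical, and it assumes target names follow the <fsname>-OST<4 hex digits> pattern seen in the output above.

```shell
# Hypothetical helper: list the distinct OST names found on stdin.
# Assumes names look like <fsname>-OST<4 hex digits>, e.g. ib-lfs-OST0000.
# Feed it lmtmetric's stdout only; strace lines contain similar-looking
# namespace names (filter-...-OSTxxxx_UUID) that would also match.
ost_names() {
    grep -oE '[A-Za-z0-9_-]+-OST[0-9a-fA-F]{4}' | sort -u
}
```

For example, `lmtmetric -m ost 2>/dev/null | ost_names` should print one line per OST that made it into the record.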
Yes, I am sure I did not make a typo in my original testing. I ran 'lmtmetric -m ost' on the OSS server again and the output is the same:
lmtmetric: error reading ib-lfs-OST0000 brw_stats: No such file or directory
lmtmetric: error reading ib-lfs-OST0002 brw_stats: No such file or directory
lmtmetric: error reading ib-lfs-OST0004 brw_stats: No such file or directory
lmtmetric: error reading ib-lfs-OST0006 brw_stats: No such file or directory
lmtmetric: error reading ib-lfs-OST0008 brw_stats: No such file or directory
I do see the
lmtmetric: error reading ib-lfs-OST0000 brw_stats: No such file or directory
errors in your strace output, so I'll look at how to fix that.
I believe it's a separate problem from whatever resulted in
No live file system data found
when you ran ltop, but we can fix this issue and then see what happens with ltop.
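To pick the failing lookups out of a long strace log, something like the following may help. It is a rough sketch: enoent_paths is a hypothetical helper name, and it assumes the log uses the same `open(...)` / `stat(...)` line format shown in the trace above. It also works when the line breaks have been lost, because it matches call by call rather than line by line.

```shell
# Hypothetical helper: print the distinct paths for which open() or stat()
# returned ENOENT, reading strace output from stdin.
# fstat() calls take a file descriptor, not a quoted path, so they are skipped.
enoent_paths() {
    grep -oE '(open|stat)\("[^"]+"[^=]*= -1 ENOENT' |
        sed -E 's/^[a-z]+\("([^"]+)".*$/\1/' |
        sort -u
}
```

Usage, assuming the trace was saved to a file named strace.txt: `enoent_paths < strace.txt | grep lustre`.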
Thank you for your timely reply. I am looking forward to your conclusion.
I found the relevant code and the reason for the error message. There used to be a brw_stats procfile provided by a module named "obdfilter". This stats file was used to populate the "IOPS" column that ltop shows. It is not provided under Lustre 2.12.2 (I'm not sure about Lustre 2.12.0).
This is not a fatal error. Early versions of lustre did not provide those stats either, so lmt issues the error message but continues to gather the other stats and send them.
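The "warn but keep going" behavior can be illustrated with a small shell sketch. This is not lmt's actual code: find_stat_file is a hypothetical helper, and the /sys-then-/proc search order is only what the strace output in this ticket suggests for files such as stats and num_exports.

```shell
# Hypothetical helper: find_stat_file NAME ROOT...
# Prints the first ROOT/NAME that exists. If none exists, prints a warning
# on stderr and returns 1 so the caller can decide to continue anyway.
find_stat_file() {
    name=$1
    shift
    for root in "$@"; do
        if [ -e "$root/$name" ]; then
            printf '%s\n' "$root/$name"
            return 0
        fi
    done
    printf 'warning: %s not found under: %s\n' "$name" "$*" >&2
    return 1
}
```

A caller that treats the file as optional could run `find_stat_file obdfilter/ib-lfs-OST0000/brw_stats /sys/fs/lustre /proc/fs/lustre || true` and carry on collecting the remaining stats.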
In addition, I set up a test system running Lustre 2.12.2 and lmt 3.2.6, and ltop works properly even though I also see the message
lmtmetric: error reading lforge-OST0000 brw_stats: No such file or directory
Is it possible for you to run
lmtmetric -m ost
on all your OSS nodes, and
lmtmetric -m mdt
on all your MDS nodes, put all the output (including stdout and stderr) in files, and attach them to the ticket? Thanks.
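One way to capture both stdout and stderr of a command into a single file per node is sketched below. The helper name capture_output and the header line it writes are made up for illustration; only the redirection itself is standard shell.

```shell
# Hypothetical helper: capture_output OUTFILE COMMAND [ARGS...]
# Runs COMMAND and writes its stdout and stderr, in order, to OUTFILE,
# preceded by a header naming the host and the command.
capture_output() {
    outfile=$1
    shift
    {
        printf '### host=%s cmd=%s\n' "$(uname -n)" "$*"
        "$@"
    } >"$outfile" 2>&1
}
```

On an OSS node this could be used as `capture_output "lmt-ost-$(uname -n).txt" lmtmetric -m ost`, and on an MDS node with `-m mdt` instead.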
Hi ofaaland, I installed the LMT management server on a machine with an Ethernet card, while all the nodes in my Lustre cluster use InfiniBand, although they are on the same subnet. I don't know whether that could cause this problem.
On any of your nodes, you should be able to see cerebro sending the metric data, using tcpdump, like this:
tcpdump -i XXX | grep cerebro
where XXX is the name of the interface (e.g. eth0 or ib0) which you configured cerebro to use in /etc/cerebro.conf.
You should see messages like this:
08:32:50.783274 IP YYY.cerebro-send > 239.2.11.72.cerebro-recv: UDP, length 302
08:32:50.825049 IP YYY.cerebro-send > 239.2.11.72.cerebro-recv: UDP, length 72
08:32:51.029782 IP YYY.cerebro-send > 239.2.11.72.cerebro-recv: UDP, length 72
08:32:51.031473 IP YYY.cerebro-send > 239.2.11.72.cerebro-recv: UDP, length 420
where I've replaced my hostnames with YYY
Run this on one MDS node, one OSS node, and on the node with lmt-server installed. They should all see the same set of messages. If they don't, then your cerebro config or network config may be the problem.
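To compare what each node sees, it can help to reduce the tcpdump text to a per-sender packet count. The following is a sketch under assumptions: cerebro_senders is a hypothetical helper, and it relies on tcpdump printing the source as HOST.cerebro-send in the third field, as in the sample lines above.

```shell
# Hypothetical helper: count cerebro packets per sender in tcpdump text
# read from stdin. Prints "COUNT SOURCE" lines sorted by source.
# Lines whose third field does not end in .cerebro-send are ignored.
cerebro_senders() {
    awk '$3 ~ /\.cerebro-send$/ { count[$3]++ }
         END { for (src in count) print count[src], src }' | sort -k2
}
```

Saving a short capture on each node (for example with `tcpdump -i XXX -c 200 > capture.txt`) and running `cerebro_senders < capture.txt` on each should give matching lists of senders if multicast is reaching every node.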
Hi ofaaland, I ran tcpdump -i ens192 | grep cerebro and it shows nothing. I think this is caused by the network config.
Hi @ldd91, none of them show anything? Or do you see output only on the MDS and OSS nodes?
All of them show nothing
@ldd91 ,
All of them show nothing
That probably means that the interface you're monitoring with tcpdump is not the one cerebro is using. To see the address you have cerebro configured for, do this:
# grep cerebrod_speak_message_config /etc/cerebro.conf
cerebrod_speak_message_config 0.0.0.0 0 0 192.168.64.0/24
You can then grep for that address in your configured network interfaces
# ip addr | grep 192.168
inet 192.168.64.1/24 brd 192.168.64.255 scope global eno1
And make sure that the address and netmask specified in cerebro.conf match the address and netmask of a configured interface.
And check to confirm the cerebrod service is running.
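The address and netmask comparison above can be scripted. The sketch below is illustrative only: all function names are hypothetical, it handles IPv4 only, and it assumes the network is the last field of the cerebrod_speak_message_config line, as in the example shown.

```shell
# Hypothetical helpers for checking cerebro.conf against `ip addr` output.

# Print the network (e.g. 192.168.64.0/24) from a cerebro.conf-style file.
cerebro_network() {
    awk '$1 == "cerebrod_speak_message_config" { print $NF }' "$1"
}

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    ( IFS=.; set -- $1; echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 )) )
}

# in_subnet ADDR NET/PREFIX: succeed if ADDR lies inside NET/PREFIX.
in_subnet() {
    sn_net=${2%/*}
    sn_prefix=${2#*/}
    sn_mask=$(( (0xFFFFFFFF << (32 - sn_prefix)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & sn_mask )) -eq $(( $(ip_to_int "$sn_net") & sn_mask )) ]
}

# matching_addrs NET/PREFIX: read `ip addr` output on stdin and print the
# IPv4 addresses that fall inside the given network.
matching_addrs() {
    want_net=$1
    awk '$1 == "inet" { sub(/\/.*/, "", $2); print $2 }' |
    while read -r candidate; do
        if in_subnet "$candidate" "$want_net"; then
            printf '%s\n' "$candidate"
        fi
    done
}
```

For example, `ip addr | matching_addrs "$(cerebro_network /etc/cerebro.conf)"` printing nothing would suggest that no configured interface falls inside the network cerebro was told to use.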
In any case, though, this is a cerebro configuration problem, not an LMT problem. So I'm going to close this issue and re-title it to reflect what we found. If you'd like more help, please create an issue at
Lustre 2.12 no longer creates the per-target brw_stats file under /proc/fs/lustre/obdfilter. This causes the "No such file or directory" error message. However, the error is not fatal; it only leaves the IOPS field unpopulated.
I installed lmt 3.2.6 in one of my Lustre environments, which runs version 2.12.0; everything goes well there and ltop works. However, I also installed lmt in another Lustre environment, which runs version 2.12.2 and uses an InfiniBand network. There, running /usr/sbin/lmtmetric -m mdt works, but running /usr/sbin/lmtmetric -m ost on the OSS server shows lmtmetric: error reading lustre iblfs-OST0000: No such file or directory. When I run ltop on the manager node, it shows ltop: No live file system data found