TritonDataCenter / manta-thoth

Thoth is a Manta-based system for core and crash dump management

Thoth finds "nothing to do" on CNs containing dumps that have no thoth binary #172

Open chudley opened 6 years ago

chudley commented 6 years ago

Thoth will report "nothing to do" on a compute node that doesn't have a thoth binary, even though there are dumps to discover. The logs don't make the reason apparent: they only report "nothing to do".

I manually expanded the discovery command (only for /zones/*/cores/core.*) and ran it on the compute node I was having problems with. Here's the current directory listing:

[root@XXXX (xx-xxxx-xx) ~]# ls -1 /zones/*/cores/core.*
/zones/xxx/cores/core.node.14175
/zones/xxx/cores/core.node.54445
/zones/xxx/cores/core.node.98220
/zones/xxx/cores/core.node.98225
/zones/xxx/cores/core.node.98226

And here's the result of running the discovery command:

[root@XXXX (xx-xxxx-xx) ~]# for d in `ls -1 /zones/*/cores/core.* 2> /dev/null`; do size=`ls -l $d | awk '{ print $5 }'`; hash=`thoth object $d`; if [[ "$?" -eq 0 ]]; then echo $d $hash $size; fi; done
-bash: thoth: command not found
-bash: thoth: command not found
-bash: thoth: command not found
-bash: thoth: command not found
-bash: thoth: command not found

I think what's happening here is that the error message goes to stderr, while stdout (where we expect to find the dumps) is empty (see here). We seem to be using ur, and I've not yet dug into how it handles that error condition, but it's quite possible that the result also has a "stderr" property that we can check and report on.
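
One possible shape for a fix on the discovery side, as a rough sketch rather than what Thoth actually does: check that the binary exists before looping, and fail with a distinct message and exit status so that an empty stdout is never mistaken for "no dumps". The function name discover_dumps and the exit status 127 are hypothetical choices for illustration; only the `thoth object` invocation and the glob come from the command above.

```shell
#!/bin/bash
# Hypothetical sketch: fail loudly when the hashing tool is missing,
# instead of silently producing empty output.
#
# usage: discover_dumps TOOL FILE...
discover_dumps() {
    local tool=$1
    shift

    # Without this check, every iteration fails with "command not found"
    # on stderr and stdout stays empty, which looks like "nothing to do".
    if ! command -v "$tool" > /dev/null 2>&1; then
        echo "discover: '$tool' not found in PATH; cannot hash dumps" >&2
        return 127
    fi

    local d size hash
    for d in "$@"; do
        # An unmatched glob is passed through literally; skip it.
        [[ -f "$d" ]] || continue
        size=$(ls -l "$d" | awk '{ print $5 }')
        if hash=$("$tool" object "$d"); then
            echo "$d $hash $size"
        fi
    done
}
```

On a compute node this would be called as `discover_dumps thoth /zones/*/cores/core.*`. The caller can then tell "no dumps" (exit 0, empty stdout) apart from "no thoth binary" (non-zero exit, message on stderr), provided the exit status or stderr is actually surfaced on the calling side.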