Open erikgrinaker opened 1 year ago
Or better, add some of this to metrics so we have history and can work with the data more easily. Lots of these numbers are in procfs:
ubuntu@grinaker-231-0001:~$ awk '$1 ~ "Tcp:" { print $13 }' /proc/net/snmp
RetransSegs
906466
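The command above hard-codes column 13, which depends on the kernel's field order in /proc/net/snmp. A rough sketch that looks the column up by its header name instead; the helper name `snmp_field` and its optional file argument are made up for illustration:

```shell
# snmp_field PROTO FIELD [FILE]: print the value of FIELD from the PROTO rows
# of /proc/net/snmp (or FILE). snmp_field is a hypothetical helper name.
# Usage: snmp_field Tcp RetransSegs
snmp_field() {
  awk -v proto="$1:" -v field="$2" '
    # The first row for a protocol is the header: find the column by name.
    $1 == proto && !seen {
      for (i = 2; i <= NF; i++) if ($i == field) col = i
      seen = 1
      next
    }
    # The second row holds the values; print nothing if the field is unknown.
    $1 == proto && col { print $col; exit }
  ' "${3:-/proc/net/snmp}"
}
```

This assumes the usual two-row layout of /proc/net/snmp (one header row and one value row per protocol). An unknown field name prints nothing rather than a wrong column.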
Wouldn't be surprised if gosigar and these other kinds of libraries already picked most of these up.
Yes, even better, but there are a ton of OS metrics and I don't know if we want them clogging up the time series database. Maybe we can pick out a few particularly important ones.
Right, didn't mean to scrape everything under the sun, just that we mostly just have to find a library that has what we want and hook up the metrics we need. Not much reinventing the wheel should be needed here.
Some internal discussion here. I'm going to re-title this issue to include more networking metrics+diagnostics info.
It would be useful to include various OS/kernel info from every node in debug.zip and/or metrics, to inspect e.g. kernel params, TCP settings, and other relevant metrics when debugging kernel issues. For example:
sysctl -a
netstat -s
netstat -an
ps aux
mount
/proc/meminfo
ss --tcp -n -e
And probably lots of other stuff. We'll need to consider what we can and can't include wrt. redaction.
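As a rough sketch of what gathering the commands listed above could look like on a single node: the helper names `collect_one` and `collect_all` and the one-file-per-command layout are assumptions for illustration, not how debug.zip works. The output is not redacted.

```shell
# collect_one OUTDIR NAME CMD [ARGS...]: run CMD and save stdout+stderr to
# OUTDIR/NAME.txt. collect_one is a hypothetical helper, not part of debug.zip.
collect_one() {
  outdir=$1 name=$2
  shift 2
  mkdir -p "$outdir" || return 1
  # Record missing tools instead of failing the whole collection.
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "skipped: $1 not found" > "$outdir/$name.txt"
    return 0
  fi
  "$@" > "$outdir/$name.txt" 2>&1 || echo "exit status: $?" >> "$outdir/$name.txt"
}

# collect_all OUTDIR: gather the commands from the list. Output is NOT redacted.
collect_all() {
  collect_one "$1" sysctl sysctl -a
  collect_one "$1" netstat-s netstat -s
  collect_one "$1" netstat-an netstat -an
  collect_one "$1" ps ps aux
  collect_one "$1" mount mount
  collect_one "$1" meminfo cat /proc/meminfo
  collect_one "$1" ss ss --tcp -n -e
}
```

Something like `collect_all /tmp/osdiag` would then leave one text file per command. Several of these (`ps aux`, `netstat -an`, `ss`) can include hostnames, addresses, and command-line arguments, so a redaction pass would be needed before anything like this ships in a bundle.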
Jira issue: CRDB-27295 Epic: CRDB-32134