draios / sysdig

Linux system exploration and troubleshooting tool with first class support for containers
http://www.sysdig.com/

spy_users chisel error: attempt to concatenate a nil value #940

Open jonjensen opened 7 years ago

jonjensen commented 7 years ago

I got this error:

spy_users chisel error: [string "--[[..."]:215: attempt to concatenate a nil value

on this OS:

# cat /etc/redhat-release 
CentOS Linux release 7.4.1708 (Core)

Using the sysdig-0.17.0-1.x86_64 RPM built by Sysdig Inc.

Running this command:

sysdig -c spy_users

The last command shown was:

   37112 11:30:13 root) /sbin/ip6tables -nL

luca3m commented 7 years ago

Can you share with us a capture that replicates the problem? You can create a capture with: sysdig -w trace.scap

lmn0 commented 6 years ago

@jonjensen I tried to reproduce this issue on CentOS 7.4.1708 (Core) running on EC2 and did not get the chisel error. Could you provide the exact steps to reproduce this bug?

abucodonosor commented 6 years ago

I'm getting that from time to time too. I'll try to find a reproducer.

abucodonosor commented 6 years ago

And I just found one:

   85283 04:40:24 crazy) grep -E /strace-.*.tar.xz
   85283 04:40:26 crazy) sh -c source /usr/lib/frugalware/fwmakepkg;source ./FrugalBuild; echo -n $pkgname
spy_users chisel error: [string "--[[..."]:215: attempt to concatenate a nil value

However, running these commands again does not reproduce the error.

The strange thing is that after starting sysdig again, it aborted on a failed assertion:

85283 05:06:28 crazy) sed s/%2B/+/g;s/$//;s///;s/-/_/g
85283 05:06:35 crazy) find . -name strace
sysdig: /home/crazy/Work/Frugalware/current-testing/source/apps/sysdig/src/sysdig/userspace/libsinsp/parsers.cpp:1799: static void sinsp_parser::parse_openat_dir(sinsp_evt*, char*, int64_t, std::__cxx11::string*): Assertion `false' failed.
Aborted (core dumped)

jonjensen commented 6 years ago

@luca3m @tjskrish I am still having this happen, but I can't tell you the steps to reproduce it. I just start sysdig -c spy_users on our server and within hours or a day or so, I get the error. As before it was:

spy_users chisel error: [string "--[[..."]:215: attempt to concatenate a nil value

Our system (with newer versions than when I initially reported):

# rpm -q sysdig
sysdig-0.21.0-1.x86_64
# cat /etc/redhat-release 
CentOS Linux release 7.4.1708 (Core) 

I saved a trace.scap file like @luca3m requested, but it is 81 GB! I'm guessing that will not be useful to you at that size, but if you would like to look at it somehow, or have me filter or extract just part of it, I'm game. Please let me know. Thanks.

abucodonosor commented 6 years ago

@luca3m @tjskrish

I can trigger that all the time with parts of our test suite.

It looks like evt.field(fargs) sometimes returns nil. I'm not a Lua guy, but I've added checks for nil, 0 and '' and then set fargs to NA so it cannot be nil. However, the bug still occurs.

Also interesting: after changing print() to use string.format(..), the bug is gone here. At least I've run the test suite 10 times without triggering it.

Does this make sense to anyone?
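
The error message does not say which operand of the concatenation was nil, so guarding only fargs may miss the real culprit. Below is a minimal sketch of guarding every operand instead. The helper or_na, the function format_line and its argument names are hypothetical and not taken from the spy_users chisel; they only illustrate the idea.

```lua
-- Hypothetical helper, not part of the spy_users chisel:
-- returns a placeholder for nil, otherwise the value as a string.
local function or_na(value)
    if value == nil then
        return "NA"
    end
    return tostring(value)
end

-- Illustrative stand-in for a chisel output line. Every operand goes
-- through or_na, so no single nil field can abort the concatenation.
local function format_line(pid, time, user, command, args)
    return or_na(pid) .. " " .. or_na(time) .. " " .. or_na(user) .. ") "
        .. or_na(command) .. " " .. or_na(args)
end

-- The last argument is nil on purpose.
print(format_line(85283, "04:40:24", "crazy", "grep", nil))
```

If the output then shows NA in an unexpected column, that column is the field that was nil.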

webloft commented 4 years ago

Is this issue still not fixed? I get the same error about every 10 minutes, but at line 218.

abucodonosor commented 4 years ago

@webloft

This bug still exists.

webloft commented 4 years ago

Well, I don't want to debug this in depth. I just found out that the reported error line is only the last line of the statement, not the place where the nil appears.

The actual error (at least for me) is located at process_tree[pid][2], which is sometimes nil, I guess because something went wrong in the ancestor tree loop.

Catching the nil like this:

if process_tree[pid][2] == nil then
   process_tree[pid][2] = -1
end

right after

if not process_tree[pid] then
   process_tree[pid] = {1 + process_tree[ppid][1], process_tree[ppid][2]}
end

works for me. Logging -1 is still better than crashing every few minutes.
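
For anyone who wants to try this outside the chisel, here is a self-contained sketch of the same guard. The local process_tree table, the register function and the sample pids are made up for illustration. Treating an unknown parent as {0, nil} is my own assumption and is not in the chisel; it is only there so the sketch cannot fail on a missing parent entry.

```lua
-- Illustrative stand-in for the chisel's table: pid -> two-element entry.
local process_tree = {}

-- Hypothetical wrapper around the two snippets above.
local function register(pid, ppid)
    if not process_tree[pid] then
        -- Assumption: fall back to {0, nil} when the parent is unknown.
        local parent = process_tree[ppid] or {0, nil}
        process_tree[pid] = {1 + parent[1], parent[2]}
    end
    -- The guard: never leave the second field nil, so that a later
    -- concatenation cannot fail on it.
    if process_tree[pid][2] == nil then
        process_tree[pid][2] = -1
    end
    return process_tree[pid]
end

-- A parent entry whose second field was never filled in.
process_tree[1] = {0, nil}

local entry = register(2, 1)
print("first field " .. entry[1] .. ", second field " .. entry[2])
```

Without the guard, the print line would raise the same "attempt to concatenate a nil value" error, because entry[2] would be inherited as nil from the parent.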

svaningelgem commented 3 years ago

v0.27.1: the bug still exists. I reproduced this easily with:

screen 1: sysdig -c spy_users

screen 2: tail -f /var/log/syslog

About 3-4 processes start here every minute, so maybe that is too much information to combine?

Linux mail.salvania.be 4.19.0-16-amd64 #1 SMP Debian 4.19.181-1 (2021-03-19) x86_64 GNU/Linux
Description:    Debian GNU/Linux 10 (buster)

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

webloft commented 1 year ago

Keep this open...