Closed PrymalInstynct closed 2 years ago
Thanks for finding this. We will reproduce this and post updates here.
I tried to reproduce this on 0.8.1 and latest master with the following steps:
Scope tail -f
docker run --rm -it cribl/scope:<VERSION_ID>
scope run -- tail -f /var/log/dpkg.log
# In another terminal:
docker exec -it <above container> /bin/bash
apt update && apt install -y python3 # to generate dpkg logs
# Wait 20 seconds to observe failures
Scope bash
docker run --rm -it cribl/scope:<VERSION_ID>
scope bash
tail -f /var/log/dpkg.log
# In another terminal:
docker exec -it <above container> /bin/bash
apt update && apt install -y python3 # to generate dpkg logs
# Wait 20 seconds to observe failures
Scope bash with cribl tls destination (per blog post):
docker run --rm -it cribl/scope:<VERSION_ID>
scope run -c tls://logstream.strange-cannon.cribl.cloud:10090 -- bash
tail -f /var/log/dpkg.log
# In another terminal:
docker exec -it <above container> /bin/bash
apt update && apt install -y python3 # to generate dpkg logs
# Wait 20 seconds to observe failures
I could not reproduce the issue.
Regarding this
I also found the below error in the AppScope application log output
Scope: tail(pid:1950525): [2022-01-27 05:20:15.671] ERROR: Could not get function address of uv_fileno.
This is known. We changed the level of this log recently but it will still appear with a more verbose log level set.
Closing for now. Please provide further information if you want us to re-open and investigate - we will be happy to.
This is still an issue, that I am experiencing on a fresh install of RHEL 8.5 (Not a container). Below are the configs I am using.
SCOPE_HOME=/opt/scope
LD_PRELOAD=/opt/scope/libscope.so
PATH=$PATH:/opt/scope
SCOPE_CRIBL_NO_BREAKER=true
export SCOPE_HOME LD_PRELOAD PATH SCOPE_CRIBL_NO_BREAKER
metric:
enable: false
event:
enable: true
format:
type: ndjson
maxeventpersec: 10000
enhancefs: true
watch:
- type: console
name: (stdout)|(stderr) # matches the output stream
value: .* # matches data written
transport:
type: file
path: '/var/log/appscope/events.log'
buffer: line
libscope:
configevent: true
summaryperiod : 10
commanddir : '/tmp'
log:
level: warning
transport:
type: file
path: '/var/log/appscope/appscope.log'
buffer: line
cribl:
enable: false
tags:
user: $USER
cwd: $PWD
protocol:
custom:
what is the command line you are using with the above config?
Since appscope process is spawned as a part of the bash session starting I am just running tail -f /var/log/
[root@smoke log]# tail -f /var/log/messages
Feb 3 14:32:29 smoke systemd[2347017]: Starting D-Bus User Message Bus Socket.
Feb 3 14:32:29 smoke systemd[2347017]: Reached target Timers.
Feb 3 14:32:29 smoke systemd[2347017]: Reached target Paths.
Feb 3 14:32:29 smoke systemd[2347017]: Listening on D-Bus User Message Bus Socket.
Feb 3 14:32:29 smoke systemd[2347017]: Reached target Sockets.
Feb 3 14:32:29 smoke systemd[2347017]: Reached target Basic System.
Feb 3 14:32:29 smoke systemd[2347017]: Reached target Default.
Feb 3 14:32:29 smoke systemd[2347017]: Startup finished in 188ms.
Feb 3 14:32:29 smoke systemd[1]: Started User Manager for UID 1000.
Feb 3 14:32:29 smoke systemd[1]: Started Session 53 of user prymalinstynct.
tail: error waiting for inotify and output events: Interrupted system call
[prymalinstynct@smoke log]$ tail -f dnf.log
2022-02-03T13:03:08-0700 DEBUG DNF version: 4.7.0
2022-02-03T13:03:08-0700 DDEBUG Command: dnf makecache --timer
2022-02-03T13:03:08-0700 DDEBUG Installroot: /
2022-02-03T13:03:08-0700 DDEBUG Releasever: 8
2022-02-03T13:03:08-0700 DEBUG cachedir: /var/cache/dnf
2022-02-03T13:03:08-0700 DDEBUG Base command: makecache
2022-02-03T13:03:08-0700 DDEBUG Extra commands: ['makecache', '--timer']
2022-02-03T13:03:08-0700 DEBUG Making cache files for all metadata files.
2022-02-03T13:03:08-0700 INFO Metadata cache refreshed recently.
2022-02-03T13:03:08-0700 DDEBUG Cleaning up.
tail: error waiting for inotify and output events: Interrupted system call
hey. we sort of recreated what you are seeing. we could not repro directly. we created a RHEL 8.4 EC2 VM. taking the same steps you defined we did not see the issue. however, theorizing about the cause and the staring at the error message, it seems as though the timer we use should be stopped and is not getting stopped in your case. so, if we force the timer not to stop in test code we see the same error message from tail that you are seeing.
we know what is happening. we have a theory as to why it's happening. given that we are not able to repro directly, would you be willing to run a test build that we provide? it should allow us to verify that we have this fixed.
Thanks for looking into it further. I am happy to help however I can.
Will you please try this test version? It's built f4rom the draft PR 757.
$ curl -O https://cdn.cribl.io/dl/scope/bug/728-tail-f/linux/scope
$ chmod +x scope
$ ./scope version
Version: d1cf379
Build Date: 2022-02-07T16:21:47Z
Sorry to report this still appears to be an issue
prymalinstynct@wks01-windows:~/Projects/ (main)$ ssh 10.10.1.10
Last login: Tue Feb 8 06:09:23 2022 from 10.10.100.10
[prymalinstynct@smoke scope]$ cd /opt/scope/
[prymalinstynct@smoke scope]$ sudo curl -O https://cdn.cribl.io/dl/scope/bug/728-tail-f/linux/scope
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 27.6M 100 27.6M 0 0 3936k 0 0:00:07 0:00:07 --:--:-- 4017k
[prymalinstynct@smoke scope]$ sudo chmod +x scope
[prymalinstynct@smoke scope]$ ./scope version
Version: d1cf379
Build Date: 2022-02-07T16:21:47Z
[prymalinstynct@smoke scope]$ exit
logout
Connection to 10.10.1.10 closed.
prymalinstynct@wks01-windows:~/Projects/ (main)$ ssh 10.10.1.10
Last login: Tue Feb 8 06:11:49 2022 from 10.10.100.10
[prymalinstynct@smoke ~]$ tail -f /var/log/dnf.log
2022-02-08T05:29:54-0700 DEBUG reviving: 'rhel-8-for-x86_64-appstream-rpms' can be revived - repomd matches.
2022-02-08T05:29:55-0700 DEBUG rhel-8-for-x86_64-appstream-rpms: using metadata from Mon 07 Feb 2022 07:14:16 AM MST.
2022-02-08T05:29:55-0700 DEBUG reviving: 'rpmfusion-free-updates' can be revived - repomd matches.
2022-02-08T05:29:55-0700 DEBUG rpmfusion-free-updates: using metadata from Sat 05 Feb 2022 10:51:10 AM MST.
2022-02-08T05:29:55-0700 DEBUG reviving: 'rpmfusion-nonfree-updates' can be revived - repomd matches.
2022-02-08T05:29:55-0700 DEBUG rpmfusion-nonfree-updates: using metadata from Sat 05 Feb 2022 10:57:56 AM MST.
2022-02-08T05:29:55-0700 DEBUG User-Agent: constructed: 'libdnf (Red Hat Enterprise Linux 8.5; generic; Linux.x86_64)'
2022-02-08T05:29:58-0700 DDEBUG timer: sack setup: 6152 ms
2022-02-08T05:29:58-0700 INFO Metadata cache created.
2022-02-08T05:29:58-0700 DDEBUG Cleaning up.
tail: error waiting for inotify and output events: Interrupted system call
Ugh! I think this may be my bad. In the instructions I provided, I failed to suggest that you extract the content of the scope you downloaded. Assuming that you did exactly what you posted, you were using the old libscope and no change would be seen. Which, of course, is what appears to have happened.
The steps become:
$ curl -O https://cdn.cribl.io/dl/scope/bug/728-tail-f/linux/scope
$ chmod +x scope
$ ./scope version
Version: d1cf379
Build Date: 2022-02-07T16:21:47Z
$ ./scope extract /opt/scope
Using /opt/scope in the extract command because in /etc/profile.d/appscope.sh
you are using LD_PRELOAD=/opt/scope/libscope.so
. The extract command will place the updated libscope in the dir you are referencing.
Note that the extract command will place a scope.yml
in /opt/scope
. Given that SCOPE_HOME=/opt/scope
, libscope will use the scope.yml
that is extracted into /opt/scope
. Assuming that the intent is to use the scope.yml in /etc/scope, you will want to remove or rename /opt/scope/scope.yml
.
Thanks for your patience. Will you give this a try...again?
ok, I ran ./scope extract /opt/scope
and am still getting the same error.
here is the output of the appscope.log file I am writing to which errored out the same when i used tail -f
[root@smoke appscope]# tail -f /var/log/appscope/appscope.log
Scope: id(pid:2589593): [2022-02-08 06:13:21.832] ERROR: Could not get function address of uv_fileno.
Scope: cat(pid:2589595): [2022-02-08 06:13:21.842] ERROR: Could not get function address of uv_fileno.
Scope: tail(pid:2589605): [2022-02-08 06:13:33.214] ERROR: Could not get function address of uv_fileno.
Scope: ls(pid:2636249): [2022-02-09 04:19:57.765] ERROR: Could not get function address of uv_fileno.
Scope: mv(pid:2636283): [2022-02-09 04:20:44.052] ERROR: Could not get function address of uv_fileno.
tail: error waiting for inotify and output events: Interrupted system call
[root@smoke appscope]#
@PrymalInstynct
Thanks for your message.
What confuses me is that you are still able to see in the /var/log/appscope/appscope.log
the following message (assuming it comes after you perform a scope extract
operation):
Scope: mv(pid:2636283): [2022-02-09 04:20:44.052] ERROR: Could not get function address of uv_fileno.
This type of message should not appear in the case of libscope.so
library version (mentioned in https://github.com/criblio/appscope/issues/728#issuecomment-1032871183), which comes from scope extract
operation.
In the case of using libscope.so
library coming from scope extract
after executing the 'strings' command against libscope.so
:
strings libscope.so | grep "Scope Version"
You should see
Constructor (Scope Version: d1cf379bcf1a)
Can you verify the command above with /opt/scope/libscope.so
and possibly check if that is the only libscope.so
you have it?
Here is the output of the 2 commands I ran to find all possible instances of libscope.so and then pull the version out of the strings
[root@smoke ~]# find / -name 'libscope.so'
/opt/scope/libscope.so
[root@smoke ~]# strings /opt/scope/libscope.so | grep "Scope Version"
Constructor (Scope Version: d1cf379bcf1a)
Scope Version: %s Dump From: %s
[root@smoke ~]#
For the sake of sharing in case it matters in some way, here is how I am installing and configuring appscope. AppScope Ansible Role
As a side note I would love to see the version included in the package file name so I could dynamically pull the appscope package for github by version number a little easier.
@PrymalInstynct Thanks for your message and valuable input.
The libscope.so
version looks OK.
Thanks for the link AppScope Ansible Role - just FYI here is branch where the tgz file for the Appscope version (Scope Version: d1cf379bcf1a) is placed in /files/
path.
We will investigate the issue further.
Just to ensure:
tail -f /var/log/
as a root user and observe the following error:
tail: error waiting for inotify and output events: Interrupted system call
tail -f /var/log/
as a user different than root and observe the following error:
tail: error waiting for inotify and output events: Interrupted system call
FYI: Regarding the version in the package, I created this as a separate issue #773 to track it.
Yes, I am using the ansible role linked to install appscope in my environment, and yes I get the same behavior as both root and a standard user.
we have found issues with the Ansible play book. however, nothing that would explain the receipt of a signal by tail. can we discuss this synchronously? would you be willing to join our Slack channel? it would be easier to discuss and screen share as needed.
https://cribl-community.slack.com/ channel #appscope
I'll join up on Monday.
That's great. We are looking forward to it.
On Fri, Feb 11, 2022 at 7:06 AM PrymalInstynct @.***> wrote:
I'll join up on Monday.
— Reply to this email directly, view it on GitHub https://github.com/criblio/appscope/issues/728#issuecomment-1036195786, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHIMXB5ZEXNTAHXTRGDFELU2UCV7ANCNFSM5M5YKKJA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
You are receiving this because you were assigned.Message ID: @.***>
@iapaddler the link is not letting me sign-in it is telling me I need a cribl.io address to join the workspace.
the community channel should not require a specific email address. send me an email address and i'll send a join email. that should resolve this.
@PrymalInstynct better approach, the AppScope web site has a link to the community channel: https://appscope.dev/
If you can sign up there, we can explore this in detail. Let us know if this works for you.
Yeah, that works. I'll should be able to jump on tomorrow morning.
The root cause of this is that we're missing a function that we need to interpose. Specifically __poll_chk. My understanding is that this function is used by tail due to it being compiled with a gcc toolchain with the _FORTIFY_SOURCE=2 option. We had problems reproducing it because the versions of tail that we tried don't use the __poll_chk symbol. On my machine, for example, tail doesn't use this symbol. Heck, it doesn't even use poll:
ubuntu@ip-10-8-107-159:~/jrc/appscope3$ which tail
/usr/bin/tail
ubuntu@ip-10-8-107-159:~/jrc/appscope3$ tail --version
tail (GNU coreutils) 8.28
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Paul Rubin, David MacKenzie, Ian Lance Taylor,
and Jim Meyering.
ubuntu@ip-10-8-107-159:~/jrc/appscope3$ nm -D /usr/bin/tail | egrep "(poll)|(select)"
U select
I've done some analysis of what we functions need to add to ensure that this doesn't happen in other scenarios. This command finds glibc functions that are provided that end in "_chk".
nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep _chk\$ | grep " T " | gawk '{print $3}' | cut -d_ -f3 > glibc_func_list.txt
This command finds functions that libscope.so provides
nm -D ./lib/linux/x86_64/libscope.so | grep " T " | gawk '{print $3}' > appscope.txt
This command compares the two files we just created above:
for func in $(cat glibc_func_list.txt); do grep "^$func\$" appscope.txt; done | sort | uniq > candidates.txt
The contents of candidates.txt suggests that we need ensure that we have a "_chk version" of the following functions to libscope.so:
fgets
fgetws
fread
poll
ppoll
pread
pread64
read
recv
recvfrom
This then tells us which candidates of concern we already have in libscope.so:
for func in $(cat candidates.txt); do grep "__${func}_chk" appscope.txt; done
Which returned that we already have _chk functions for these:
__fgets_chk
__fgetws_chk
__fread_chk
__pread_chk
Which means by omission that these are the _chk functions that we need to add to libscope.so: poll ppoll pread64 read recv recvfrom
I verified that we are interposing these new functions. In gdb you can see that we are hitting breakpoints in our wrap.c.
Thread 2.1 "a.out" hit Breakpoint 1, __ppoll_chk (fds=0x0, nfds=1, tmo_p=0x1, sigmask=0x0, fdslen=1) at src/wrap.c:1907
Thread 2.1 "a.out" hit Breakpoint 1, __poll_chk (fds=0x0, nfds=1, timeout=1, fdslen=1) at src/wrap.c:1854
Thread 2.1 "a.out" hit Breakpoint 1, __recvfrom_chk (sockfd=3, buf=0x555555558040 <buf>, len=512, buflen=512, flags=-1, src_addr=0x55555556f2b0, addrlen=0x555555555270 <__libc_csu_init>) at src/wrap.c:4796
Thread 2.1 "a.out" hit Breakpoint 1, __recv_chk (sockfd=3, buf=0x555555558040 <buf>, len=512, buflen=512, flags=-1) at src/wrap.c:4741
Thread 2.1 "a.out" hit Breakpoint 2, __pread64_chk (fd=3, buf=0x555555558040 <buf>, count=512, offset=512, bufsize=10) at src/wrap.c:2133
Thread 2.1 "a.out" hit Breakpoint 1, __read_chk (fd=3, buf=0x555555558040 <buf>, nbytes=512, buflen=512) at src/wrap.c:2200
I compiled a test app that called these _chk functions directly, with gcc -g -D_FORTIFY_SOURCE=2 -O2 my_app.c
myapp.c:
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
int main() {
__read_chk(fd, NULL, 1, 1, 1);
// __pread64_chk(fd, buf, 1, 1);
// __poll_chk(NULL, 1, 1, 1);
// __ppoll_chk(NULL, 1, NULL, NULL, 1);
// __recv_chk(1, NULL, 1, 1, -1);
// __recvfrom_chk(1, NULL, 1, 1, -1, NULL, NULL);
return 0;
}
Steps To Reproduce
Followed article https://cribl.io/blog/bash-history-logging-the-easy-way-with-appscope/ implementing AppScope to enable logging of user commands.
I found that when using tail -f to watch the output of a log file I would get an error to stdout within about 5 seconds and tail would exit
I also found the below error in the AppScope application log output
Environment
Requested priority
Medium
Relevant log output