Closed takikawa closed 6 years ago
Nice work. It looks good to me so far.
I also think snabb top
issues are at inotify level. I would need to look into it to actually understand what's the issue.
It seems like inotify doesn't report events on symbolic links. Here's an example using user-space tool inotifywait
:
$ mkdir temp
$ inotifywait -m temp
Setting up watches.
Watches established.
temp $ touch foo
temp/ OPEN foo
temp/ ATTRIB foo
temp/ CLOSE_WRITE,CLOSE foo
But if the observed file is a symbolic link to a directory
$ ln -s temp tmp
$ inotifywait -m tmp
Setting up watches.
Watches established.
temp $ touch bar
Nothing happens.
Good thing is that when monitoring a directory, creation of symbolic links is reported.
$ inotifywait -m temp
Setting up watches.
Watches established.
temp $ ln -s bar qax
temp/ CREATE qax
I look into the issue. I don't know yet what's the solution but here are some ideas.
I think at a high level view what's necessary to do is when a create
event is received, we check whether the filename is a symbolic link or not. If it's a symbolic link, actually we don't monitor this file (because we won't get any event) but its real path.
To check whether a file is a symbolic link:
local function is_symlink (name)
local stat = S.lstat(name)
return stat and stat.islnk
end
To convert a symbolic link to a real path there's a C runtime function called realpath
. It seems this function is not part of ljsyscall, so it would be necessary to define it with ffi.cdef
.
local ffi = require('ffi')
local C = ffi.C
ffi.cdef[[
char *realpath(const char *path, char *resolved_path);
]]
local function resolve_symlink (name)
realpath = ffi.new("char[?]", 1024)
C.realpath(name, realpath)
return realpath
end
In addition, when converting a symbolic link like /var/run/...
we're going to obtain a path link /run/..
because /var/run
is actually a symbolic link to /run
.
This is just simply an idea, maybe there's another way.
Update: I think I've solved the inotify issue. It was just a matter of changing how the snabb library defines what a directory is for this case. The queue stats now show up again.
I'll see if I can make it filter based on the actually enabled counters.
I pushed another commit that filters out irrelevant queue counters in snabb top
. This change was more complicated than I thought and to be honest I'm not that happy with it, but it works and passes the intel_mp
tests. I think it could use more testing though.
The complication comes from how the rxcounters
/ txcounters
counters work. They're counters that store a bitmap indicating which queue counters are used by a process. Since this is pid-specific, it shouldn't go in the shared shm folder. This means we can't symlink the whole folder, and have to symlink each individual shared counter instead (otherwise the writes would go to the same counter).
I guess an alternative approach would be to add another folder inside the shared shm folder that's indexed by pid, but then the folder structure looks a little silly 8251/pci/02:00.0/8251/rxcounters.counter
with the repeated pid. Or the counter could be stored in a path entirely separate from the stats counters, like 8251/intel_mp_config/02:00.0/rxcounters.counter
. I'm starting to think the latter approach with a new folder might be better now, so I will try it too.
Related with this PR https://github.com/takikawa/snabb/pull/4
This PR revises queue stats handling by providing all the queue stats in each process'
pci
shm directory. It does this by symlinking all of them to a "global" shm sub-directory in the/intel-mp
directory.This doesn't quite solve the
snabb top
issues yet because I think symlinks are not yet followed by some layer thatsnabb top
uses. Oncesnabb top
follows the symlinks it should show all the counters properly. I'm looking into fixing it, but if anyone has any ideas on where to find the cause (maybe at the inotify level?) it would be helpful.Also, this exposes the
rxcounter
andtxcounter
values in an extra counter, which could be used to filter the queue stats shown (since there may be non-zero queue counters of no interest for a given process). This part is WIP but could be useful ifsnabb top
actually looks at it.