xrmx / bootchart

merge of bootchart-collector and pybootchartgui
GNU General Public License v2.0

collector's dump doesn't work with ARCH=um kernel #47

Open fingon opened 11 years ago

fingon commented 11 years ago

For some reason, UML kernel images are MMU-enabled but never have a [stack] entry in /proc/<pid>/maps. The collector's --dump therefore fails on them, because it cannot find [stack].
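To make the failure mode concrete, here is a minimal sketch of looking up the [stack] line in a maps listing (the per-process maps file under /proc). It is not the collector's actual code; `find_stack_range` is a hypothetical helper, and it takes an already-open `FILE *` so it can be tried on any text in that format.

```c
#include <stdio.h>
#include <string.h>

/* Scan a maps listing for the "[stack]" entry.
 * Returns 0 and fills start/end on success, -1 if no such entry exists
 * (the situation reported here for ARCH=um kernels). */
int find_stack_range(FILE *maps, unsigned long *start, unsigned long *end)
{
    char line[512];

    while (fgets(line, sizeof line, maps)) {
        unsigned long lo, hi;

        if (!strstr(line, "[stack]"))
            continue;
        /* Each line starts with "<start>-<end>" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &lo, &hi) == 2) {
            *start = lo;
            *end = hi;
            return 0;
        }
    }
    return -1;
}
```

A caller that gets -1 back has no address range to search, which is where a dump that depends on [stack] gives up.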

Any chance of adding some other way of dumping state? At a guess, just shared memory, mmap, or something would work with less of a hard dependency on /proc + luck in stuff not breaking.. ;)

(find_chunks in dump.c fails if [stack] occurs on 1kb block's border too, so not very fond of it.)
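The block-border problem is a general one when text is searched in fixed-size reads. The sketch below is not find_chunks from dump.c; it is a hypothetical `find_marker` that shows one common way around it, assuming 1 KiB reads: keep the last few bytes of the previous block in front of the next one, so a marker split across the border is still seen whole.

```c
#include <stdio.h>
#include <string.h>

#define BLOCK 1024

/* Return the byte offset of the first occurrence of marker in f, or -1.
 * Reads BLOCK bytes at a time but carries the last (mlen - 1) bytes of
 * the previous read forward, so a marker straddling a block border is
 * matched in one piece. Requires 0 < strlen(marker) <= BLOCK. */
long find_marker(FILE *f, const char *marker)
{
    size_t mlen = strlen(marker);
    char buf[2 * BLOCK];
    size_t keep = 0;   /* bytes carried over from the previous block */
    long base = 0;     /* stream offset that buf[0] corresponds to */
    size_t got;

    if (mlen == 0 || mlen > BLOCK)
        return -1;

    while ((got = fread(buf + keep, 1, BLOCK, f)) > 0) {
        size_t len = keep + got;

        for (size_t i = 0; i + mlen <= len; i++)
            if (memcmp(buf + i, marker, mlen) == 0)
                return base + (long)i;

        /* Carry the tail forward; it may hold the start of a marker. */
        keep = len < mlen - 1 ? len : mlen - 1;
        memmove(buf, buf + len - keep, keep);
        base += (long)(len - keep);
    }
    return -1;
}
```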

mmeeks commented 11 years ago

On Tue, 2013-06-04 at 08:36 -0700, Markus Stenberg wrote:

> For some reason, UML kernel images are MMU-enabled but never have a [stack] entry in /proc/<pid>/maps. The collector's --dump therefore fails on them, because it cannot find [stack].

Ah - ok :-)

> Any chance of adding some other way of dumping state? At a guess, just shared memory, mmap, or something would work with less of a hard dependency on /proc + luck in stuff not breaking.. ;)

Well - the problem is finding the state ;-) we try to use ptrace to open that to avoid needing to move files around on the file-system etc. particularly across the pivot into the main system (IIRC). I think we could switch back to using files but ...

> (find_chunks in dump.c fails if [stack] occurs on 1kb block's border too, so not very fond of it.)

Yep - it's not the world's most beautiful thing for certain-sure ;-) we could try something yet more evil if that fails; we could try reading around where we think a stack should be - and copying the cookie we look for into some 4k block aligned space:

volatile char magic_buffer[4096*2];

And strcpy the data into that at a block boundary - to make it easier to (potentially) search any sort of address that the OS might use for stack-ness around there.
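A rough sketch of that idea, under stated assumptions: the buffer is twice the page size so that one page boundary always falls inside it, the cookie is copied to that boundary, and the searching side only has to test page-aligned offsets. `place_cookie`, `find_cookie_page` and the 4096-byte `PAGE` constant are illustrative names and values, not code from the collector.

```c
#include <stdint.h>
#include <string.h>

#define PAGE 4096

/* Two pages, so at least one page boundary lies inside the buffer. */
volatile char magic_buffer[PAGE * 2];

/* Copy cookie (with its terminating NUL) to the first page boundary
 * inside magic_buffer. Returns that address, or NULL if it won't fit. */
volatile char *place_cookie(const char *cookie)
{
    uintptr_t addr = (uintptr_t)magic_buffer;
    uintptr_t aligned = (addr + PAGE - 1) & ~(uintptr_t)(PAGE - 1);
    volatile char *dst = (volatile char *)aligned;
    size_t n = strlen(cookie) + 1;

    if (n > PAGE)
        return NULL;
    /* Byte loop instead of strcpy(): strcpy would drop the volatile. */
    for (size_t i = 0; i < n; i++)
        dst[i] = cookie[i];
    return dst;
}

/* Searching side: mem is assumed to start on a page boundary, so the
 * cookie can only sit at offsets that are multiples of PAGE.
 * Returns the matching offset, or -1. */
long find_cookie_page(const char *mem, size_t len, const char *cookie)
{
    size_t n = strlen(cookie);

    for (size_t off = 0; off + n <= len; off += PAGE)
        if (memcmp(mem + off, cookie, n) == 0)
            return (long)off;
    return -1;
}
```

Testing one offset per page keeps the scan cheap even when it has to guess over a wide address range.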

Thoughts appreciated, the ptrace approach is already pretty grim so ...

ATB,

    Michael.

michael.meeks@suse.com <><, Pseudo Engineer, itinerant idiot

fingon commented 11 years ago

My preferred choice would be just to provide some way to send a signal to the collector to dump state (+ possibly exit) - that way there is no need for IPC, and it handles the general case 'well enough'.

E.g. give dump location on command line when starting collector, then kill - collector => dump occurs (+- exit happens).

fingon commented 11 years ago

Urgh, this eats html-ish less and greater characters, or at least doesn't show them in editor. At any rate, my point was to use kill -SOME_SIGNAL collector to cause the dump (+ possibly exit) to happen.
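For reference, a minimal sketch of the signal-triggered variant being proposed. It only shows the trigger: the handler sets a flag and the collector's main loop would notice it and write the dump. The function names are hypothetical, the choice of signal is left to the caller, and where the dump would be written is not addressed here.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t dump_flag;

static void on_dump_signal(int signo)
{
    (void)signo;
    dump_flag = 1;   /* only set a flag; real work happens outside the handler */
}

/* Arrange for signo to request a dump. Returns 0 on success, -1 on error. */
int install_dump_signal(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_dump_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* don't make interrupted reads fail */
    return sigaction(signo, &sa, NULL);
}

/* Called from the main loop: returns 1 once per pending request. */
int take_dump_request(void)
{
    if (!dump_flag)
        return 0;
    dump_flag = 0;
    return 1;
}
```

The collector would call `install_dump_signal(SIGUSR1)` (or whichever signal is chosen) at startup and poll `take_dump_request()` between samples.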

mmeeks commented 11 years ago

On Tue, 2013-06-04 at 10:20 -0700, Markus Stenberg wrote:

> Urgh, this eats html-ish less and greater characters, or at least doesn't show them in editor. At any rate, my point was to use kill -SOME_SIGNAL collector to cause the dump (+ possibly exit) to happen.

Oh - well - that's not a great idea :-)

The problem is that the collector's view of the file-system can be (IIRC) significantly different from the rest of the system's view of things - ie. it is marooned in some awful place where it gets dumped during the bootstrap / pivot process: along with the initrd etc.

If it was as easy as: fopen ("/tmp/foo"); then we'd all do it - right ? ;-)

ATB,

    Michael.


fingon commented 11 years ago

Ah, true, I don't use initrd so didn't think of that. As the collector is the parent of the real init, the real init will have pivoted to the non-initrd root at some point, leaving the collector stranded in the initrd case.

So I guess POSIX or System V shared memory (shm_open / shmget) would probably be the most portable way to handle it.
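A minimal sketch of the System V variant, assuming the collector publishes its state into a segment and the dumping process copies it out again. `shm_publish` and `shm_fetch` are hypothetical helpers; nothing like them exists in bootchart as far as this thread shows.

```c
#define _XOPEN_SOURCE 700
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Create (or open) the segment for key, copy len bytes into it and
 * detach again. Returns the segment id, or -1 on error. */
int shm_publish(key_t key, const void *data, size_t len)
{
    int id = shmget(key, len, IPC_CREAT | 0600);
    void *p;

    if (id < 0)
        return -1;
    p = shmat(id, NULL, 0);
    if (p == (void *)-1)
        return -1;
    memcpy(p, data, len);
    shmdt(p);
    return id;
}

/* Attach read-only and copy len bytes out. Returns 0, or -1 on error. */
int shm_fetch(int id, void *out, size_t len)
{
    void *p = shmat(id, NULL, SHM_RDONLY);

    if (p == (void *)-1)
        return -1;
    memcpy(out, p, len);
    shmdt(p);
    return 0;
}
```

In a real setup both sides would agree on a fixed key, and the dumping process would look the segment up with `shmget(key, 0, 0)` before calling `shm_fetch`. The segment outlives both processes until it is removed with `shmctl(id, IPC_RMID, NULL)`.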

Hmm. Starts to sound painful..

fabled commented 7 years ago

Using ptrace is also problematic with grsec kernels, as well as in container setups.

shm_open likely won't work, because it backs shared memory with files under /dev/shm/. shmget might work, as it's implemented at the Linux kernel level.

How about using UNIX sockets with abstract names? Those are not connected to file systems, and should be pretty universal.
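As a sketch of what that could look like: on Linux, an abstract socket name is a `sun_path` whose first byte is NUL, and the address length passed to bind/connect covers exactly the bytes of the name. The helper names below are hypothetical and any socket name used with them would be made up. One caveat to keep in mind is that abstract names are scoped to a network namespace, which may matter for the container case.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Build an abstract-namespace address (Linux-only): sun_path[0] is NUL
 * and the name follows. Returns the address length, 0 if name is too long. */
static socklen_t abstract_addr(struct sockaddr_un *addr, const char *name)
{
    size_t n = strlen(name);

    if (n + 1 > sizeof addr->sun_path)
        return 0;
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path + 1, name, n);
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}

/* Collector side: listen on the abstract name. Returns fd or -1. */
int abstract_listen(const char *name)
{
    struct sockaddr_un addr;
    socklen_t len = abstract_addr(&addr, name);
    int fd;

    if (len == 0 || (fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, len) < 0 || listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Dumper side: connect to the abstract name. Returns fd or -1. */
int abstract_connect(const char *name)
{
    struct sockaddr_un addr;
    socklen_t len = abstract_addr(&addr, name);
    int fd;

    if (len == 0 || (fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

No file is created anywhere, so neither side depends on which root file system it happens to see.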

Or binding to a TCP/IP loopback address? Though that would mean monitoring IP addresses and binding only after the loopback IP is configured.