eellak / build-recorder

GNU Lesser General Public License v2.1

build-recorder's process hierarchy #225

Open fvalasiad opened 1 month ago

fvalasiad commented 1 month ago

The issue

So far build-recorder doesn't distinguish between processes and threads, treating them as one and the same.

This decision was made because it fit the way Linux and ptrace(2) work: each tracee is a distinct thread with its own system-wide unique ID.

However, this doesn't hold for other Unix-like operating systems such as FreeBSD, which draws a clear distinction between processes and threads. This makes porting build-recorder in its current form harder than it may need to be.
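The per-thread IDs can be observed without ptrace(2). The following is a minimal sketch in Python (not build-recorder code; the helper names `thread_ids` and `compare_main_and_worker` are made up for illustration). It compares the process ID with the kernel-level thread ID as seen from the main thread and from a worker thread, using `threading.get_native_id()` (Python 3.8+).

```python
import os
import threading


def thread_ids():
    """Return (process ID, kernel thread ID) as seen by the calling thread."""
    return os.getpid(), threading.get_native_id()


def compare_main_and_worker():
    """Collect the IDs from the calling (main) thread and from one worker."""
    result = {}

    def worker():
        result["worker"] = thread_ids()

    result["main"] = thread_ids()
    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result


ids = compare_main_and_worker()
print(ids)
```

Both threads report the same process ID but different kernel thread IDs. On Linux the main thread's ID is expected to equal the process ID, which is why treating every tracee as a "process" works there; on other systems the thread ID comes from a separate namespace.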

Do we really need to have an identifier in our output RDF for each and every thread that runs? Do we really need to include in the output that thread A of a process P clone(2)-ed another thread B? Is it correct to define this parent-child relationship between threads?

When a thread calls an execve(2) variant, all the other threads of the process are terminated, yet this isn't reported anywhere in the output RDF.

My proposal

Either distinguish between the two in the output RDF, or stop taking threads into account completely and record processes only.

After all, do we care that the process identified by make -j8 used 8 threads to do its work? Do we care about the specific workload each of those handled?
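To make the "processes only" option concrete, here is a rough sketch in Python of how thread-level trace events could be folded into per-process records. Everything in it is hypothetical: the event tuples, `ProcessRecord`, and `collapse_to_processes` are illustrative names, not build-recorder's actual data structures or output format. The idea is to keep a map from each thread ID to the process that owns it and to attribute every event to that process.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProcessRecord:
    pid: int
    parent_pid: Optional[int]
    files: list = field(default_factory=list)


def collapse_to_processes(events, root_tid):
    """Fold thread-level events into one record per process.

    Each event is a (kind, tid, arg) tuple (a hypothetical format):
      ("thread", tid, new_tid) - tid created a thread in the same process
      ("fork",   tid, new_tid) - tid created a new process
      ("open",   tid, path)    - tid opened a file
    """
    owner = {root_tid: root_tid}  # thread ID -> ID of the owning process
    processes = {root_tid: ProcessRecord(root_tid, None)}
    for kind, tid, arg in events:
        pid = owner[tid]
        if kind == "thread":
            owner[arg] = pid  # the new thread belongs to the same process
        elif kind == "fork":
            owner[arg] = arg  # the new process owns its initial thread
            processes[arg] = ProcessRecord(arg, pid)
        elif kind == "open":
            processes[pid].files.append(arg)
    return processes


events = [
    ("thread", 100, 101),
    ("open", 101, "a.c"),
    ("fork", 101, 200),
    ("open", 200, "b.c"),
    ("open", 100, "a.h"),
]
procs = collapse_to_processes(events, root_tid=100)
```

In this sketch the file opened by thread 101 and the child forked by thread 101 are both attributed to process 100, so no thread identifier would need to appear in the output.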