rpm-software-management / rpm

The RPM package manager
http://rpm.org
Other
505 stars 364 forks source link

RPM must kill all childs on exit (from section?) #134

Closed ignatenkobrain closed 6 years ago

ignatenkobrain commented 7 years ago

Imagine, we have some stupid program which forks and doesn't exit until we send SIGTERM/SIGKILL, so build finishes, but process is still running. easy way to reproduce is just run ./hang & from %check.

I think @voxik has some real example in mongodb build while I have only dummy one:

hang.c:

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

volatile sig_atomic_t done = 0;

static void
term (int signum)
{
  ;
  //done = 1;
}

int
main (int   argc,
      char *argv[])
{
  struct sigaction action;
  memset (&action, 0, sizeof (struct sigaction));
  action.sa_handler = term;
  sigaction (SIGTERM, &action, NULL);

  int loop = 0;             
  while (!done)                 
    {                                       
      int t = sleep (3);                                    
      /* sleep returns the number of seconds left if interrupted */
      while (t > 0)                                                 
        {                                                                           
          printf ("Loop run was interrupted with %d "                                                   
                  "sec to go, finishing...\n", t);                                                                                         
          t = sleep(t);
        }                                                                                                                   
      printf("Finished loop run %d.\n", loop++);                            
    }                                                                           

  printf("done.\n");                
  return 0;                             
}

hang.spec:

Name:           hang
Version:        1
Release:        1%{?dist}
Summary:        Hanging build

License:        Public Domain
URL:            https://fedoraproject.org
Source0:        hang.c

BuildRequires:  gcc

%description
%{summary}.

%prep
%autosetup -c -T

%build
gcc %{__global_cflags} %{__global_ldflags} %{S:0} -o hang

%check
./hang &

%files

%changelog

I think easiest way is to catch all childs and send SIGKILL, but if we want to send SIGTERM, we will need to try sending SIGTERM, wait some time, check for still-alive processes and send SIGKILL.

voxik commented 7 years ago

This is the use case:

http://pkgs.fedoraproject.org/cgit/rpms/rubygem-mongo.git/tree/rubygem-mongo.spec#n79

Please note it works just fine with rpm build or when using mock with (--old-)chroot on the backend

voxik commented 7 years ago

But honestly, I am no sure why/how this should be handled on RPM side ...

ignatenkobrain commented 7 years ago

@voxik try to compile package as I showed above. You run something in background from RPM, you get RPMs, but process is still running which is IMO not acceptable.

pmatilai commented 7 years ago

I tend to agree with @ignatenkobrain, no processes should be left around once build completes.

pmatilai commented 6 years ago

A simple way to achieve this would be running the build-scripts inside a transient systemd unit. This is enough to make the reproducer to not hang around:

+++ b/macros.in
@@ -792,9 +792,10 @@ package or when debugging this package.\
 #      Global defaults used for building scriptlet templates.
 #

+%___build_session       systemd-run --pipe --user --wait --quiet
 %___build_shell                %{?_buildshell:%{_buildshell}}%{!?_buildshell:/bin/sh}
 %___build_args         -e
-%___build_cmd          %{?_sudo:%{_sudo} }%{?_remsh:%{_remsh} %{_remhost} }%{?_remsudo:%{_remsudo} }%{?_remchroot:%{_remchroot} %{_remroot} }%{___build_shell} %{___build_args}
+%___build_cmd          %{?_sudo:%{_sudo} }%{?_remsh:%{_remsh} %{_remhost} }%{?_remsudo:%{_remsudo} }%{?_remchroot:%{_remchroot} %{_remroot} }%{?___build_session} %{___build_shell} %{___build_args}
 %___build_pre  \
   RPM_SOURCE_DIR=\"%{u2p:%{_sourcedir}}\"\
   RPM_BUILD_DIR=\"%{u2p:%{_builddir}}\"\

OTOH one could just run rpmbuild itself under systemd-run for the same effect and with a zero-length diff...

voxik commented 6 years ago

If I remember correctly, I met this issue running rpmbuild inside mock, which is actually using systemd-nspawn, which complained during the container shutdown. Moreover, I think that systemd-run is not available inside mock, so I am not convinced this is the right solution.

pmatilai commented 6 years ago

Well I'm not seriously suggesting dragging in systemd for a rare corner case issue. OTOH we're not going to catch all children launched from a shell we invoked without something more advanced and non-portable - all of those are likely to look like serious overkill anyway.

We could of course do something like this:

-%___build_post         exit $?
+%___build_post \
+  RPM_EC=$?\
+  for pid in $(jobs -p); do kill ${pid}; done\
+  exit ${RPM_EC}\
+%{nil}

...which won't catch a process that forked into background by itself, but it'd catch anything invoked with &. Which would be better than nothing I guess.

voxik commented 6 years ago

Just two ideas:

  1. What if there was something like begin, ensure, end block? I.e. I could ensure that the executed process is really terminated, although the test suite failed.
  2. What if I could somehow register/execute some process in background using some RPM framework? I.e. RPM would be aware there is some process running which should be terminated.

These are not universal

cgwalters commented 6 years ago

Keep in mind that if rpmbuild itself used container features it'd heavily overlap with tools like mock - and while recursive containerization does work, it really requires both ends to be ready for it.

See also https://github.com/projectatomic/bubblewrap/issues/284

pmatilai commented 6 years ago

It's just a plain old shell executing whatever was in the build script, it doesn't know anything about rpm or blocks. Also any co-operative scheme is a fail, what we'd like to catch is accidents.

As of commit 06953879d3e0b1e9a434979056d1225ab4646142 rpm kills any background jobs left from build-scripts, that's the best we can do without venturing into Linux-specific container etc business. For further protection, run builds under something like systemd-run which can actually track all the processes (via cgroups or whatever).