containers / crun

A fast and lightweight fully featured OCI runtime and C library for running containers
GNU General Public License v2.0

postStart hook may run before startContainer hook #1562

Open ningmingxiao opened 5 days ago

ningmingxiao commented 5 days ago

While reviewing the crun code, I found that the postStart hook may run before the startContainer hook. @giuseppe

giuseppe commented 5 days ago

Can you provide more details?

It depends on the definition of "started". The postStart hook can run before the execv, but at that point the container is already considered started.

ningmingxiao commented 4 days ago

I added a debug log, libcrun_debug ("nmx001 run poststart"), just before the postStart hooks run:

  /* The container is considered running only after we got the notification from the
     notify_socket, if any.  */
  if (def->hooks && def->hooks->poststart_len)
    {
      cleanup_close int hooks_out_fd = -1;
      cleanup_close int hooks_err_fd = -1;

      ret = open_hooks_output (container, &hooks_out_fd, &hooks_err_fd, err);
      if (UNLIKELY (ret < 0))
        return ret;
      libcrun_debug ("nmx001 run poststart"); 
      ret = do_hooks (def, status.pid, context->id, true, status.bundle, "running", (hook **) def->hooks->poststart,
                      def->hooks->poststart_len, hooks_out_fd, hooks_err_fd, err);
      if (UNLIKELY (ret < 0))
        crun_error_release (err);
    }

I also added a sleep before the startContainer hook (so that the hook runs later):

  sleep(30);
  if (def->hooks && def->hooks->start_container_len)
    {
      libcrun_container_t *container = entrypoint_args->container;

      ret = do_hooks (def, 0, container->context->id, false, NULL, "starting", (hook **) def->hooks->start_container,
                      def->hooks->start_container_len, entrypoint_args->hooks_out_fd, entrypoint_args->hooks_err_fd,
                      err);
      if (UNLIKELY (ret != 0))
        return ret;

      /* Seek stdout/stderr to the end.  If the hooks were using the same files,
         the container process overwrites what was previously written.  */
      (void) lseek (1, 0, SEEK_END);
      (void) lseek (2, 0, SEEK_END);
    }

[root@LIN-FB738BFD367 mycontainer]# crun --debug create nmx002
2024-09-13T08:39:39.633047Z: Using debug verbosity
2024-09-13T08:39:39.633169Z: Loading container from config file: config.json
2024-09-13T08:39:39.633399Z: Using bundle: /mycontainer
2024-09-13T08:39:39.633411Z: Creating container: nmx002
2024-09-13T08:39:39.633432Z: Checking run directory: /run/crun
2024-09-13T08:39:39.633473Z: Creating exec fifo: /run/crun/nmx002/exec.fifo
2024-09-13T08:39:39.633495Z: Running with prefork enabled
2024-09-13T08:39:39.633507Z: Reading config file: config.json
2024-09-13T08:39:39.633551Z: Writing config file to: /run/crun/nmx002/config.json
2024-09-13T08:39:39.633614Z: Opening hooks output
2024-09-13T08:39:39.633629Z: Creating new keyring
2024-09-13T08:39:39.633654Z: Using cgroupfs cgroup manager
2024-09-13T08:39:39.633665Z: Using container host UID 0 and GID 0
2024-09-13T08:39:39.633692Z: Running linux container
2024-09-13T08:39:39.633704Z: Unsharing namespace: pid
2024-09-13T08:39:39.633711Z: Unsharing namespace: network
2024-09-13T08:39:39.633716Z: Unsharing namespace: ipc
2024-09-13T08:39:39.633721Z: Unsharing namespace: uts
2024-09-13T08:39:39.633727Z: Unsharing namespace: mount
2024-09-13T08:39:39.633754Z: Set rlimit: soft = 1024, hard = 1024
2024-09-13T08:39:39.636070Z: Running container on PID: 2461070
2024-09-13T08:39:39.659979Z: Writing container status
[root@LIN-FB738BFD367 mycontainer]# 
[root@LIN-FB738BFD367 mycontainer]# 
[root@LIN-FB738BFD367 mycontainer]# date && crun --debug start nmx002
Fri Sep 13 16:39:59 CST 2024
2024-09-13T08:39:59.262659Z: Using debug verbosity
2024-09-13T08:39:59.263438Z: Loading container from config file: /run/crun/nmx002/config.json
2024-09-13T08:39:59.264006Z: Opening hooks output
2024-09-13T08:39:59.264012Z: nmx001 run poststart
[root@LIN-FB738BFD367 mycontainer]# sh: can't access tty; job control turned off
/ # 

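The postStart hook does not wait for the 30-second sleep: "nmx001 run poststart" is logged at 08:39:59.26, immediately after "crun start" is invoked, so it runs before the startContainer hook.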
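
The ordering seen in the log above can be reproduced outside crun with a small model. This is only a sketch under stated assumptions, not crun code: a plain pipe stands in for exec.fifo, a forked child stands in for the container process, and the function name hook_order_without_ack, the helper sleep_ms, and the 'P'/'S' log bytes are all hypothetical names chosen for illustration. The "runtime" side releases the child and records its postStart step right away, while the child delays before recording its startContainer step.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Sleep helper for the model; delay_ms plays the role of the sleep(30).  */
static void
sleep_ms (unsigned int delay_ms)
{
  struct timespec ts = { delay_ms / 1000, (long) (delay_ms % 1000) * 1000000L };
  nanosleep (&ts, NULL);
}

/* Hypothetical model, not crun code.  Fills order[] with the sequence in
   which the stand-in hooks ran: 'P' = postStart, 'S' = startContainer.
   Returns 0 on success, -1 on error.  */
int
hook_order_without_ack (unsigned int delay_ms, char order[3])
{
  int release_pipe[2]; /* assumption: stands in for exec.fifo */
  int log_pipe[2];     /* each side appends one byte when its hook runs */
  char c = 'x';
  size_t n = 0;
  int ok;
  pid_t pid;

  memset (order, 0, 3);
  if (pipe (release_pipe) < 0)
    return -1;
  if (pipe (log_pipe) < 0)
    {
      close (release_pipe[0]);
      close (release_pipe[1]);
      return -1;
    }

  pid = fork ();
  if (pid < 0)
    {
      close (release_pipe[0]);
      close (release_pipe[1]);
      close (log_pipe[0]);
      close (log_pipe[1]);
      return -1;
    }

  if (pid == 0)
    {
      /* "Container" side: wait for the release, then run startContainer.  */
      close (release_pipe[1]);
      close (log_pipe[0]);
      if (read (release_pipe[0], &c, 1) != 1)
        _exit (1);
      sleep_ms (delay_ms);
      if (write (log_pipe[1], "S", 1) != 1)
        _exit (1);
      _exit (0);
    }

  /* "Runtime" side: release the container, then run postStart right away
     without waiting for anything from the container side.  */
  close (release_pipe[0]);
  ok = write (release_pipe[1], &c, 1) == 1 && write (log_pipe[1], "P", 1) == 1;
  close (release_pipe[1]);
  close (log_pipe[1]);

  /* Read the hook log back in the order the bytes were written.  */
  while (ok && n < 2)
    {
      ssize_t r = read (log_pipe[0], order + n, 2 - n);
      if (r <= 0)
        ok = 0;
      else
        n += (size_t) r;
    }
  close (log_pipe[0]);
  waitpid (pid, NULL, 0);
  return ok ? 0 : -1;
}
```

Calling hook_order_without_ack (200, order) with a char order[3] buffer is expected to leave "PS" in order whenever the child's delay outlasts the parent's immediate write, which mirrors the log: nothing in the release step makes the runtime side wait for the container side.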

ningmingxiao commented 4 days ago

I think the process created by "crun create" should notify "crun start" once it has run the start_container hook, and that an exec.sock should be used instead of exec.fifo.
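
A rough sketch of the kind of handshake this would need, assuming a bidirectional channel. It is not crun code and not an existing crun interface: socketpair stands in for the proposed exec.sock, a forked child stands in for the container process, and hook_order_with_ack, sleep_ms, and the 'P'/'S' log bytes are hypothetical names. The "runtime" side releases the child, then blocks until the child acknowledges that its startContainer step is done, and only then records postStart.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Sleep helper for the model; delay_ms plays the role of the sleep(30).  */
static void
sleep_ms (unsigned int delay_ms)
{
  struct timespec ts = { delay_ms / 1000, (long) (delay_ms % 1000) * 1000000L };
  nanosleep (&ts, NULL);
}

/* Hypothetical model, not crun code.  Fills order[] with the sequence in
   which the stand-in hooks ran: 'P' = postStart, 'S' = startContainer.
   Returns 0 on success, -1 on error.  */
int
hook_order_with_ack (unsigned int delay_ms, char order[3])
{
  int sock[2];     /* assumption: stands in for the proposed exec.sock */
  int log_pipe[2]; /* each side appends one byte when its hook runs */
  char c = 'x';
  size_t n = 0;
  int ok;
  pid_t pid;

  memset (order, 0, 3);
  if (socketpair (AF_UNIX, SOCK_STREAM, 0, sock) < 0)
    return -1;
  if (pipe (log_pipe) < 0)
    {
      close (sock[0]);
      close (sock[1]);
      return -1;
    }

  pid = fork ();
  if (pid < 0)
    {
      close (sock[0]);
      close (sock[1]);
      close (log_pipe[0]);
      close (log_pipe[1]);
      return -1;
    }

  if (pid == 0)
    {
      /* "Container" side: wait for the release, run startContainer,
         then acknowledge on the same socket.  */
      close (sock[0]);
      close (log_pipe[0]);
      if (read (sock[1], &c, 1) != 1)
        _exit (1);
      sleep_ms (delay_ms);
      if (write (log_pipe[1], "S", 1) != 1)
        _exit (1);
      if (write (sock[1], &c, 1) != 1)
        _exit (1);
      _exit (0);
    }

  /* "Runtime" side: release, block until the ack arrives, then postStart.  */
  close (sock[1]);
  ok = write (sock[0], &c, 1) == 1
       && read (sock[0], &c, 1) == 1
       && write (log_pipe[1], "P", 1) == 1;
  close (sock[0]);
  close (log_pipe[1]);

  /* Read the hook log back in the order the bytes were written.  */
  while (ok && n < 2)
    {
      ssize_t r = read (log_pipe[0], order + n, 2 - n);
      if (r <= 0)
        ok = 0;
      else
        n += (size_t) r;
    }
  close (log_pipe[0]);
  waitpid (pid, NULL, 0);
  return ok ? 0 : -1;
}
```

With the acknowledgement in place, the recorded order becomes "SP" regardless of the delay, at the cost of making the runtime side wait on the container side before it can run postStart.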

giuseppe commented 4 days ago

There is no expectation (at least from what is specified in the specs) that postStart hooks happen after the startContainer hooks. They run in different environments: postStart is called from the runtime environment, while startContainer runs from within the container.