Roblox / nomad-driver-containerd

Nomad task driver for launching containers using containerd.

Cannot launch task: stdout.fifo and stderr.fifo already closed #127

Open michaelerickson opened 2 years ago

michaelerickson commented 2 years ago

Hello,

I'm trying to get this driver to work with a sample Go program that just listens on an HTTP port and prints a message. The task won't launch, and looking through the logs I see:

Feb 11 12:06:22 nomad-client nomad[74354]:     2022-02-11T12:06:22.287-0600 [WARN]  client.alloc_runner.task_runner.task_hook.logmon.nomad: failed to read from log fifo: alloc_id=2e96234e-24b1-8a05-77f2-6e6620986232 task=c-hello @module=logmon error="read /opt/nomad/alloc/2e96234e-24b1-8a05-77f2-6e6620986232/alloc/logs/.c-hello.stdout.fifo: file already closed" timestamp=2022-02-11T12:06:22.286-0600

Feb 11 12:06:22 nomad-client nomad[74354]:     2022-02-11T12:06:22.287-0600 [WARN]  client.alloc_runner.task_runner.task_hook.logmon.nomad: failed to read from log fifo: alloc_id=2e96234e-24b1-8a05-77f2-6e6620986232 task=c-hello @module=logmon error="read /opt/nomad/alloc/2e96234e-24b1-8a05-77f2-6e6620986232/alloc/logs/.c-hello.stderr.fifo: file already closed" timestamp=2022-02-11T12:06:22.286-0600

Feb 11 12:06:22 nomad-client nomad[74354]:     2022-02-11T12:06:22.294-0600 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon.stdio: received EOF, stopping recv loop: alloc_id=2e96234e-24b1-8a05-77f2-6e6620986232 task=c-hello err="rpc error: code = Unavailable desc = error reading from server: EOF"
Feb 11 12:06:22 nomad-client nomad[74354]:     2022-02-11T12:06:22.296-0600 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: plugin process exited: alloc_id=2e96234e-24b1-8a05-77f2-6e6620986232 task=c-hello path=/usr/local/bin/nomad pid=74844

Feb 11 12:06:22 nomad-client nomad[74354]:     2022-02-11T12:06:22.296-0600 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: plugin exited: alloc_id=2e96234e-24b1-8a05-77f2-6e6620986232 task=c-hello

Feb 11 12:06:22 nomad-client nomad[74354]:     2022-02-11T12:06:22.296-0600 [DEBUG] client.alloc_runner.task_runner: task run loop exiting: alloc_id=2e96234e-24b1-8a05-77f2-6e6620986232 task=c-hello

I have verified that the image runs using nerdctl. It also runs using the Nomad docker and podman task drivers. I was able to launch the redis example using the driver, so I feel like the driver is generally working. Any help or pointers would be greatly appreciated.

Details:

Job File:

job "containerd" {
  datacenters = ["dc1"]

  group "c-service" {
    network {
      port "http" {
        to = 8080
      }
    }
    service {
      name = "c-service"
      tags = ["urlprefix-/"]
      port = "http"

      check {
        type = "http"
        path = "/health"
        interval = "2s"
        timeout  = "2s"
      }
    }

    task "c-hello" {
      driver = "containerd-driver"

      config {
        image = "docker.io/michaelerickson/go-hello-docker:latest"
        host_network = true
        # ports = ["web"]
      }

      resources {
        cpu    = 500
        memory = 256
      }
    }
  }
}

The code for the service I'm trying to launch (Go 1.17):

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "net"
    "net/http"
    "os"

    "github.com/gorilla/mux"
)

// serviceStatus represents the health of our service
type serviceStatus struct {
    Status string
}

// loggingMiddleware logs all requests to our service
func loggingMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        log.Printf("%s %s", r.Method, r.RequestURI)
        next.ServeHTTP(w, r)
    })
}

// notAllowedHandler is called for all requests that are not specifically
// handled. It returns HTTP not allowed
func notAllowedHandler(w http.ResponseWriter, r *http.Request) {
    log.Printf("%s %s method not allowed", r.Method, r.RequestURI)
    http.Error(w, "Not Allowed", http.StatusMethodNotAllowed)
}

// healthCheckHandler responds to /health and verifies that the service is up
func healthCheckHandler(w http.ResponseWriter, _ *http.Request) {
    status := serviceStatus{Status: "OK"}
    response, err := json.Marshal(status)
    if err != nil {
        log.Printf("JSON error: %s", err)
        http.Error(w, "JSON error", http.StatusInternalServerError)
        return
    }
    w.Header().Set("Content-Type", "application/json")
    w.WriteHeader(http.StatusOK)
    w.Write(response)
}

// rootHandler responds to /
func rootHandler(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    srvAddr := ctx.Value(http.LocalAddrContextKey).(net.Addr)
    response := fmt.Sprintf("Hello, Docker! from: %s\n", srvAddr)
    w.Write([]byte(response))
}

func main() {
    httpPort := os.Getenv("HTTP_PORT")
    if httpPort == "" {
        httpPort = "8080"
    }

    log.Printf("Starting echo service on %s", httpPort)

    r := mux.NewRouter()

    r.HandleFunc("/health", healthCheckHandler)
    r.HandleFunc("/", rootHandler)
    r.Use(loggingMiddleware)

    log.Fatal(http.ListenAndServe(":"+httpPort, r))
}

The Dockerfile that builds the image:

# syntax=docker/dockerfile:1

# Multistage build to generate the smallest possible runtime image.

##
## BUILD
##
FROM golang:1.17.6-bullseye AS build

WORKDIR /app

COPY go.mod ./
COPY go.sum ./

RUN go mod download

COPY *.go ./

# Build for linux-arm64
RUN CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -o /docker-gs-ping

##
## Deploy
##
FROM gcr.io/distroless/static

COPY --from=build /docker-gs-ping /docker-gs-ping

EXPOSE 8080

USER nonroot:nonroot

ENTRYPOINT ["/docker-gs-ping"]

shishir-a412ed commented 2 years ago

@michaelerickson Let me check.

shishir-a412ed commented 2 years ago

@michaelerickson I am trying to reproduce this on my x86_64 Mac. I want to launch a Vagrant VM with Debian 5.10.92-1 arm64, but I am not able to find a Vagrant box for it. I am looking here: https://app.vagrantup.com/boxes/search

Do you have a Vagrant box I could use to reproduce this issue?