pachyderm / pachyderm

Data-Centric Pipelines and Data Versioning
https://www.pachyderm.com/
Apache License 2.0
6.19k stars, 566 forks

bad pipeline JSON returns cryptic error #2651

Open sjezewski opened 6 years ago

sjezewski commented 6 years ago

My pipeline definition needed stdin to be defined as an array. Until I realized that, this was the only message I got when trying to create the pipeline:

malformed pipeline spec: json: cannot unmarshal string into Go value of type []json.RawMessage

Ideally we report something more meaningful to the user.
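As a rough illustration of what a more meaningful message could look like, here is a minimal sketch using Go's standard encoding/json package, whose *json.UnmarshalTypeError carries the field name, the expected Go type, and the JSON value kind. The Transform and PipelineSpec types and the parseSpec and describeSpecError helpers are hypothetical stand-ins, not pachyderm's actual types or parsing path, and the exact field path reported depends on the Go version.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Transform and PipelineSpec are hypothetical, trimmed-down stand-ins for the
// real pipeline spec types; only the fields needed for the example are kept.
type Transform struct {
	Cmd   []string `json:"cmd"`
	Stdin []string `json:"stdin"`
}

type PipelineSpec struct {
	Transform *Transform `json:"transform"`
}

// describeSpecError rewrites a JSON type mismatch into a message that names
// the offending field, and wraps any other error with the usual prefix.
func describeSpecError(err error) error {
	var typeErr *json.UnmarshalTypeError
	if errors.As(err, &typeErr) {
		return fmt.Errorf(
			"malformed pipeline spec: field %q must be %s, but got a JSON %s (byte offset %d)",
			typeErr.Field, typeErr.Type, typeErr.Value, typeErr.Offset)
	}
	return fmt.Errorf("malformed pipeline spec: %w", err)
}

// parseSpec decodes a pipeline spec and returns a user-facing error on failure.
func parseSpec(data []byte) (*PipelineSpec, error) {
	var spec PipelineSpec
	if err := json.Unmarshal(data, &spec); err != nil {
		return nil, describeSpecError(err)
	}
	return &spec, nil
}

func main() {
	// stdin is a string here, reproducing the mistake described above.
	_, err := parseSpec([]byte(`{"transform": {"cmd": ["bash"], "stdin": "echo hi"}}`))
	fmt.Println(err)
}
```

With a message along these lines, the user would see which field is wrong and what type it should be, without having to bisect the spec.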

Binary-searching the pipeline spec did turn up the offending line, but right now, because of #2650, that kind of debugging results in a bunch of pachd restarts.

ryanberckmans commented 6 years ago

Similarly, when the JSON is missing spec.pipeline:

pachctl create-pipeline -f missing-pipeline-section.json

The pachctl output is rpc error: code = Unavailable desc = transport is closing, and pachd panics:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1736e40]

goroutine 282 [running]:
github.com/pachyderm/pachyderm/src/server/pps/server.(*apiServer).validatePipeline(0xc420212c60, 0x7fd03ef1d870, 0xc42086ae70, 0xc420876140, 0xbe94344259954a3e, 0x6a2701e55)
        /go/src/github.com/pachyderm/pachyderm/src/server/pps/server/api_server.go:1169 +0x60
github.com/pachyderm/pachyderm/src/server/pps/server.(*apiServer).CreatePipeline(0xc420212c60, 0x7fd03ef1d870, 0xc42086ae70, 0xc42060c840, 0x0, 0x0, 0x0)
        /go/src/github.com/pachyderm/pachyderm/src/server/pps/server/api_server.go:1391 +0x4f5
github.com/pachyderm/pachyderm/src/server/vendor/github.com/pachyderm/pachyderm/src/client/pps._API_CreatePipeline_Handler(0x1bd04e0, 0xc420212c60, 0x7fd03ef1d870, 0xc42086ae70, 0xc4202c1090, 0x0, 0x0, 0x0, 0x0, 0x0)
        /go/src/github.com/pachyderm/pachyderm/src/server/vendor/github.com/pachyderm/pachyderm/src/client/pps/pps.pb.go:3143 +0x276

In 1.6.7, that is https://github.com/pachyderm/pachyderm/blob/master/src/server/pps/server/api_server.go#L1200
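A nil-pointer panic like this is typically avoided by checking for the missing section before dereferencing it and returning an ordinary error instead. The sketch below assumes the nil value is the pipeline section of the request, which the trace does not confirm; the Pipeline and CreatePipelineRequest types and the validateRequest function are hypothetical and only loosely modeled on the names in the stack trace.

```go
package main

import (
	"errors"
	"fmt"
)

// Pipeline and CreatePipelineRequest are hypothetical minimal types; the real
// request carries many more fields.
type Pipeline struct {
	Name string
}

type CreatePipelineRequest struct {
	Pipeline *Pipeline
}

// validateRequest returns a descriptive error for a missing or empty pipeline
// section, so the caller never dereferences a nil pointer.
func validateRequest(req *CreatePipelineRequest) error {
	if req == nil {
		return errors.New("pipeline spec is empty")
	}
	if req.Pipeline == nil {
		return errors.New(`pipeline spec is missing the "pipeline" section`)
	}
	if req.Pipeline.Name == "" {
		return errors.New(`pipeline spec is missing "pipeline.name"`)
	}
	return nil
}

func main() {
	// A request decoded from a spec with no "pipeline" section.
	fmt.Println(validateRequest(&CreatePipelineRequest{}))
	// prints: pipeline spec is missing the "pipeline" section
}
```

Returned this way, the error would reach pachctl as a normal RPC error rather than a closed transport.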

ysimonson commented 4 years ago

Repro'd on 1.10, with a fairly simple and innocuous-looking pipeline spec:

{
  "pipeline": {
    "name": "fuzz_extract_restore_output"
  },
  "input": {
    "pfs": {
      "glob": "/*",
      "repo": "fuzz_extract_restore_input"
    }
  },
  "transform": {
    "cmd": ["bash"],
    "stdin": "cp /pfs/fuzz_extract_restore_input/* /pfs/out/"
  }
}

The problem here is that transform.stdin should be an array of strings.
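For reference, here is a small sketch of the same spec with stdin wrapped in an array, checked by decoding it with Go's encoding/json. The spec struct and the decodeStdin helper are hypothetical and mirror only the transform section; this is not how pachd itself parses specs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// correctedSpec is the spec from this comment with transform.stdin rewritten
// as an array holding the one original command line.
const correctedSpec = `{
  "pipeline": {
    "name": "fuzz_extract_restore_output"
  },
  "input": {
    "pfs": {
      "glob": "/*",
      "repo": "fuzz_extract_restore_input"
    }
  },
  "transform": {
    "cmd": ["bash"],
    "stdin": ["cp /pfs/fuzz_extract_restore_input/* /pfs/out/"]
  }
}`

// spec is a hypothetical type covering only the transform section; fields it
// does not declare are ignored by encoding/json.
type spec struct {
	Transform struct {
		Cmd   []string `json:"cmd"`
		Stdin []string `json:"stdin"`
	} `json:"transform"`
}

// decodeStdin returns the stdin lines of a spec, or the decoding error.
func decodeStdin(data string) ([]string, error) {
	var s spec
	if err := json.Unmarshal([]byte(data), &s); err != nil {
		return nil, err
	}
	return s.Transform.Stdin, nil
}

func main() {
	stdin, err := decodeStdin(correctedSpec)
	if err != nil {
		fmt.Println("spec did not decode:", err)
		return
	}
	fmt.Println("stdin lines:", len(stdin))
}
```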