ossf / package-analysis

Open Source Package Analysis
Apache License 2.0
714 stars 51 forks source link

More robust mechanism for getting strace events from gVisor #277

Open calebbrown opened 2 years ago

calebbrown commented 2 years ago
error: "Failed to parse openat args: AT_FDCWD /tmp/pip-install-ok6jlfy9/adafruit-circuitpython-busdevice_f20711ccac244452962292b5236c4d8f, 0x7f176abd3d40 , O_RDONLY|O_CLOEXEC, 0o0) = 0 (0x0) errno=2 (no such file or directory) (2.417µs"

Log entry:

{
  "jsonPayload": {
    "error": "Failed to parse openat args: AT_FDCWD /tmp/pip-install-ok6jlfy9/adafruit-circuitpython-busdevice_f20711ccac244452962292b5236c4d8f, 0x7f176abd3d40 , O_RDONLY|O_CLOEXEC, 0o0) = 0 (0x0) errno=2 (no such file or directory) (2.417µs",
    "timestamp": "2022-05-03T21:57:55.238696724Z",
    "caller": "strace/strace.go:264",
    "message": "Failed to parse syscall"
  },
  "resource": {
    "type": "k8s_container",
    "labels": {
      "namespace_name": "default",
      "cluster_name": "analysis-cluster",
      "project_id": "ossf-malware-analysis",
      "pod_name": "workers-set-28",
      "location": "us-central1-c",
      "container_name": "worker"
    }
  },
  "timestamp": "2022-05-03T21:57:55.238935409Z",
  "severity": "WARNING",
  "labels": {
  },
  "logName": "projects/ossf-malware-analysis/logs/stderr",
  "sourceLocation": {
    "file": "/src/internal/strace/strace.go",
    "line": "264",
    "function": "github.com/ossf/package-analysis/internal/strace.Parse"
  },
  "receiveTimestamp": "2022-05-03T21:57:56.544994861Z"
}
calebbrown commented 2 years ago

There are a few issue it seems:

  1. empty paths are not handled correctly.
  2. paths with commas are not handled correctly.

Another slightly different example demonstrating the comma issue:

{
  "insertId": "dr8h7nh1k8azcrot",
  "jsonPayload": {
    "timestamp": "2022-05-10T20:33:07.991624188Z",
    "caller": "strace/strace.go:264",
    "message": "Failed to parse syscall",
    "error": "Failed to parse openat args: AT_FDCWD /app, 0x517d230 /app/node_modules/@videojs/http-streaming/node_modules/@videojs/vhs-utils/test/fixtures/formats/webm/av01.0.00M.08.0.110,vorbis.webm, O_WRONLY|O_CLOEXEC|O_CREAT|O_TRUNC, 0o666) = 18 (0x12) (2.911638ms"
  },
  "resource": {
    "type": "k8s_container",
    "labels": {
      "location": "us-central1-c",
      "namespace_name": "default",
      "container_name": "worker",
      "cluster_name": "analysis-cluster",
      "pod_name": "workers-set-32",
      "project_id": "ossf-malware-analysis"
    }
  },
  "timestamp": "2022-05-10T20:33:07.991874992Z",
  "severity": "WARNING",
  "labels": {
    "compute.googleapis.com/resource_name": "gke-analysis-cluster-default-pool-d94f0570-lqmn",
    "k8s-pod/app": "workers",
    "k8s-pod/statefulset_kubernetes_io/pod-name": "workers-set-32",
    "k8s-pod/controller-revision-hash": "workers-set-5464844d7d"
  },
  "logName": "projects/ossf-malware-analysis/logs/stderr",
  "sourceLocation": {
    "file": "/src/internal/strace/strace.go",
    "line": "264",
    "function": "github.com/ossf/package-analysis/internal/strace.Parse"
  },
  "receiveTimestamp": "2022-05-10T20:33:08.885073572Z"
}
calebbrown commented 2 years ago

The strace parsing from GVisor needs much more improvement to avoid these parsing errors.

The plan is to look into GVisor's support for connecting to the sentry via a UNIX socket (see: https://github.com/google/gvisor/blob/master/pkg/sentry/seccheck/checkers/remote/remote.go)

calebbrown commented 1 year ago

I figured out possible a way to do this. The runsc events command supports the ability to -stream debug events.

By passing -strace and -strace-event to the runsc binary at start the strace events will be sent to a unix socket that can be connected to.

I was able to successfully get this working by manually invoking podman and runsc.

Given that we currently start the container before execing the analysis means we can easily start the strace stream before the analysis.

oliverchang commented 1 year ago

That definitely seems promising! Is there documentation on the format here?

calebbrown commented 1 year ago

I haven't found any documentation, but the code isn't terribly hard to follow (see https://github.com/google/gvisor/blob/master/pkg/eventchannel/processor.go and https://github.com/google/gvisor/blob/master/pkg/eventchannel/event.go).

It's a binary stream consisting of a binary.Uvarint() message length followed by an Any message from google.golang.org/protobuf/types/known/anypb.

The Any message encapsulates a DebugEvent from https://github.com/google/gvisor/blob/master/pkg/eventchannel/event.proto, and the Text field holds an instance of the Strace message from https://github.com/google/gvisor/blob/master/pkg/sentry/strace/strace.proto.

calebbrown commented 1 year ago

For reference, here is the output from runsc events -stream from /bin/bash -c "while true; do echo 'hi' ; done":

...

2022-08-04 08:49:43.044082455 +0000 UTC m=+5.865181584
gvisor.Strace {
process:"bash" function:"write" args:"0x1 host:[2]" args:"..." args:"0x3" exit:{return:"0x3" elapsed_ns:10208}}

2022-08-04 08:49:43.044128928 +0000 UTC m=+5.865228069
gvisor.Strace {
process:"bash" function:"write" args:"0x1 host:[2]" args:"" args:"0x3" enter:{}}

2022-08-04 08:49:43.044198949 +0000 UTC m=+5.865298085
gvisor.Strace {
process:"bash" function:"write" args:"0x1 host:[2]" args:"..." args:"0x3" exit:{return:"0x3" elapsed_ns:8108}}

...
maxfisher-g commented 1 year ago

Found some documentation here and an example implementation of a monitoring server here