Open calebbrown opened 2 years ago
There are a few issue it seems:
Another slightly different example demonstrating the comma issue:
{
"insertId": "dr8h7nh1k8azcrot",
"jsonPayload": {
"timestamp": "2022-05-10T20:33:07.991624188Z",
"caller": "strace/strace.go:264",
"message": "Failed to parse syscall",
"error": "Failed to parse openat args: AT_FDCWD /app, 0x517d230 /app/node_modules/@videojs/http-streaming/node_modules/@videojs/vhs-utils/test/fixtures/formats/webm/av01.0.00M.08.0.110,vorbis.webm, O_WRONLY|O_CLOEXEC|O_CREAT|O_TRUNC, 0o666) = 18 (0x12) (2.911638ms"
},
"resource": {
"type": "k8s_container",
"labels": {
"location": "us-central1-c",
"namespace_name": "default",
"container_name": "worker",
"cluster_name": "analysis-cluster",
"pod_name": "workers-set-32",
"project_id": "ossf-malware-analysis"
}
},
"timestamp": "2022-05-10T20:33:07.991874992Z",
"severity": "WARNING",
"labels": {
"compute.googleapis.com/resource_name": "gke-analysis-cluster-default-pool-d94f0570-lqmn",
"k8s-pod/app": "workers",
"k8s-pod/statefulset_kubernetes_io/pod-name": "workers-set-32",
"k8s-pod/controller-revision-hash": "workers-set-5464844d7d"
},
"logName": "projects/ossf-malware-analysis/logs/stderr",
"sourceLocation": {
"file": "/src/internal/strace/strace.go",
"line": "264",
"function": "github.com/ossf/package-analysis/internal/strace.Parse"
},
"receiveTimestamp": "2022-05-10T20:33:08.885073572Z"
}
The strace parsing from GVisor needs much more improvement to avoid these parsing errors.
The plan is to look into GVisor's support for connecting to the sentry via a UNIX socket (see: https://github.com/google/gvisor/blob/master/pkg/sentry/seccheck/checkers/remote/remote.go)
I figured out possible a way to do this. The runsc events
command supports the ability to -stream
debug events.
By passing -strace
and -strace-event
to the runsc
binary at start the strace events will be sent to a unix socket that can be connected to.
I was able to successfully get this working by manually invoking podman
and runsc
.
Given that we currently start
the container before exec
ing the analysis means we can easily start the strace stream before the analysis.
That definitely seems promising! Is there documentation on the format here?
I haven't found any documentation, but the code isn't terribly hard to follow (see https://github.com/google/gvisor/blob/master/pkg/eventchannel/processor.go and https://github.com/google/gvisor/blob/master/pkg/eventchannel/event.go).
It's a binary stream consisting of a binary.Uvarint()
message length followed by an Any
message from google.golang.org/protobuf/types/known/anypb
.
The Any
message encapsulates a DebugEvent
from https://github.com/google/gvisor/blob/master/pkg/eventchannel/event.proto, and the Text
field holds an instance of the Strace
message from https://github.com/google/gvisor/blob/master/pkg/sentry/strace/strace.proto.
For reference, here is the output from runsc events -stream
from /bin/bash -c "while true; do echo 'hi' ; done"
:
...
2022-08-04 08:49:43.044082455 +0000 UTC m=+5.865181584
gvisor.Strace {
process:"bash" function:"write" args:"0x1 host:[2]" args:"..." args:"0x3" exit:{return:"0x3" elapsed_ns:10208}}
2022-08-04 08:49:43.044128928 +0000 UTC m=+5.865228069
gvisor.Strace {
process:"bash" function:"write" args:"0x1 host:[2]" args:"" args:"0x3" enter:{}}
2022-08-04 08:49:43.044198949 +0000 UTC m=+5.865298085
gvisor.Strace {
process:"bash" function:"write" args:"0x1 host:[2]" args:"..." args:"0x3" exit:{return:"0x3" elapsed_ns:8108}}
...
Log entry: