m-lab / traceroute-caller

A sidecar service which runs traceroute after a connection closes
Apache License 2.0
18 stars 5 forks source link

Apply anonymization to traces before writing in triggertrace #154

Closed stephen-soltesz closed 1 year ago

stephen-soltesz commented 1 year ago

This change adds anonymization to the triggertrace handler before of writing trace output. To perform this, this change includes two new methods to the parser interface for Scamper1 and Scamper2 types that Anonymize() the parsed data and MarshalJSONL to return the parsed data back to a []byte for saving.

This is the second of two PRs to add anonymization support to the traceroute-caller.

This change depends on https://github.com/m-lab/go/pull/153 - PR is a draft until that change is merged.


This change is Reviewable

stephen-soltesz commented 1 year ago

@SaiedKazemi this isn't ready for detailed review - but I wanted to share with you to see the shape of things, in case you had early feedback.

SaiedKazemi commented 1 year ago

parser/scamper1_test.go line 42 at r1 (raw file):

      // Read in the test traceroute output file.
      f := filepath.Join("./testdata/scamper1", test.file)
      t.Logf("\nTest %v: file: %v", i, f)

Not sure why this line had to be removed? It makes the logs easier to read in verbose mode.

SaiedKazemi commented 1 year ago

parser/scamper1_test.go line 42 at r1 (raw file):

Previously, SaiedKazemi (Saied Kazemi) wrote…
Not sure why this line had to be removed? It makes the logs easier to read in verbose mode.

Never mind the comment; I see you are using t.Run().

SaiedKazemi commented 1 year ago

@stephen-soltesz I know this PR isn't ready and you sent it to me for early comments (if any). My LGTM means no major comments so far.

SaiedKazemi commented 1 year ago
:lgtm:
stephen-soltesz commented 1 year ago

Thank you!

stephen-soltesz commented 1 year ago

I've verified locally that this works as intended for IPv4. I'll merge this PR and add a new test with a fakeTracer that produces a static result and verifies the result is anonymized correctly. After that I think it would be safe to use in production where needed.