m-lab / traceroute-caller

A sidecar service which runs traceroute after a connection closes
Apache License 2.0
18 stars 5 forks source link

Add hop annotation and archiving to traceroute-caller #117

Closed SaiedKazemi closed 3 years ago

SaiedKazemi commented 3 years ago

This commit is essentially based on gfr@'s PR#105 and introduces a new cache called HopCache. While the implementation isn't fully complete yet (e.g., the hop cache isn't reset at midnight), it can be reviewed before getting too large.

Passes all tests via "go test ./..." and also works with docker-compose. The annotation files are empty most likely because there is no database on my local workstation.

{
  "Geo": {
    "Missing": true
  },
  "Network": {
    "Missing": true
  }
}

This change is Reviewable

coveralls commented 3 years ago

Pull Request Test Coverage Report for Build 340


Changes Missing Coverage Covered Lines Changed/Added Lines %
connectionlistener/connectionlistener.go 51 57 89.47%
parser/json.go 39 52 75.0%
hopannotation/hopannotation.go 120 137 87.59%
<!-- Total: 250 286 87.41% -->
Files with Coverage Reduction New Missed Lines %
tracer/scamper.go 2 92.47%
parser/json.go 3 76.12%
<!-- Total: 5 -->
Totals Coverage Status
Change from base Build 304: -4.8%
Covered Lines: 727
Relevant Lines: 780

💛 - Coveralls
SaiedKazemi commented 3 years ago

caller.go, line 137 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
FYI xxxCreator is a bad name. It is not creating connections. It is creating connection objects from SockIDs, IIRC.

Yeah, it's a bad name but I chose it because we get this object from NewCreator(). We need to change NewCreator() to something better and also have it return a concrete type as opposed to an interface.

SaiedKazemi commented 3 years ago

caller.go, line 137 at r1 (raw file):

Previously, SaiedKazemi (Saied Kazemi) wrote…
Yeah, it's a bad name but I chose it because we get this object from `NewCreator()`. We need to change `NewCreator()` to something better and also have it return a concrete type as opposed to an interface.

By the way, NewCreator() does not create connection objects from SockIDs. Its comment says NewCreator makes an object that can convert src and dst into local and remote IPs.

SaiedKazemi commented 3 years ago

caller.go, line 106 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
Do we want the ipCache exposed this high up in the call stack? I'm not sure of the answer.

This is the original code. I just changed the name from cache to ipCache for more clarity.

Re whether it should be visible at this level or not, let's consider the question when we refactor TRC. There's also a lot of other questions like this that we need to answer, so (as we discussed) let's limit the scope of current work to adding hop annotations.

SaiedKazemi commented 3 years ago

caller.go, line 140 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
hopAnnotator ? The cache is an implementation detail, so it seems odd to name it that. (Same for ipcache, which we can address later).

Done.

SaiedKazemi commented 3 years ago

connectionlistener/connectionlistener.go, line 26 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
Maybe name this connectionEventHandler?

This is the original code. Let's change the names when we do the refactoring.

SaiedKazemi commented 3 years ago

connectionlistener/connectionlistener.go, line 43 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
If we move the action to Open, this FromSockID can be moved into the tracing func that I suggest extracting down below. The handler really needn't know anything about connections - just pass the uuid and SockID and let the later code sort it out.

Sounds good, but let's do that when we refactor the code so the scope of this PR remains adding hop annotation functionality.

SaiedKazemi commented 3 years ago

connectionlistener/connectionlistener.go, line 55 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
I think now is a fine time to move functionality to Open, but if you prefer we can leave that for next PR.

Yeah, let's leave it for the refactoring PR.

SaiedKazemi commented 3 years ago

connectionlistener/connectionlistener.go, line 67 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
Maybe TraceAndAnnotate, or TraceAnnotateArchive?

Done.

SaiedKazemi commented 3 years ago

connectionlistener/connectionlistener.go, line 72 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
Grr - whitespace ignored.

Done.

SaiedKazemi commented 3 years ago

hopannotation/hopannotation.go, line 19 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…
Please rephrase to "a common location containing API defs". Specifying `m-lab/api` assumes an implementation strategy that has not yet been agreed on.

Done.

SaiedKazemi commented 3 years ago

hopannotation/hopannotation.go, line 94 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…
No wrong answer -- should the call abort or continue? Also, it looks like `ExtractHops` already performs a `net.ParseIP` -- is this second check necessary here?

I fixed the check in ExtractHops() and removed the check here.

SaiedKazemi commented 3 years ago

hopannotation/hopannotation.go, line 114 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…
Is this the correct filename format for hop annotations?

Per our conversation, I have changed it to ".json".

SaiedKazemi commented 3 years ago

hopannotation/hopannotation.go, line 149 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
We should move toward either separating the generation and the writing, or, inject a Writer into the generator. Injecting a filename is very limiting. There is a m-lab/go/content.Provider, which has been useful for content sources. We could create something similar, but I think WriteCloser interface should be fine. We could implement a gcsWriteCloser that holds a path and other attributes, does nothing until Write is called, and cleans everything up when Close is called. Or we could write Consumer, that just takes a single WriteAll([]byte) and automatically does the create and close and returns any errors.

Acknowledged. Let's discuss interactively and make the change in a subsequent PR.

SaiedKazemi commented 3 years ago

hopannotation/hopannotation.go, line 148 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
This comment needs to be updated.

Done.

SaiedKazemi commented 3 years ago

hopannotation/hopannotation.go, line 94 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…
I'm confused. Do both functions need to parse the hop IPs? Is this still for debugging?

I removed this but put it back in because package testing failed due to an invalid IP address that was intentionally passed in. More importantly, because this is an exported method and can be called from other callers, it needs to validate its input.

SaiedKazemi commented 3 years ago

hopannotation/hopannotation.go, line 91 at r2 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
This is producing redundant log messages, alongsize connectionlistener.go:104

Error handling needs major refactoring in traceroute-caller. Lower layers should return (not log) errors to higher layers. Higher layers should judiciously decide whether to ignore the error or wrap the error with more context and pass it up the call chain.

As discussed offline, I will refactor error handling

SaiedKazemi commented 3 years ago

hopannotation/hopannotation.go, line 114 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
I guess you discussed including the TxxxxxxZ? I think that even though we don't need it, it would be nice to maintain this part of the pattern used in all the other filenames. We don't need the hostname. That is included in the archive name, which is how we handle it for all the other datatypes, IIRC. Discussed with Saied yesterday, but not sure if you talked to Stephen since then. So, I propose _ in other names, ndt-ps5k9_1620781375 which seems quite useful as well. ndt7 also include ndt7-upload, which is probably not necessary, as the pusher should give us that info. So, maybe hop-annotation____.json I think we could reasonably get by with either _ OR the timestamp, but without one of those, it will be harder to disambiguate annotations before and after a reboot. I guess we'd have to use the file creation time, which I hate to rely on.

Let's finalize this today.

SaiedKazemi commented 3 years ago

hopannotation/hopannotation.go, line 140 at r2 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
Create an exported ErrXXX var and use it here. That makes it possible to test against in other code, and helps document all the errors that might be encountered.

There is no need for a new error message. The code should return "err".

SaiedKazemi commented 3 years ago

Will look into this.

SaiedKazemi commented 3 years ago

connectionlistener/connectionlistener.go, line 66 at r2 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
writes to one or more files.

Done.

SaiedKazemi commented 3 years ago

connectionlistener/connectionlistener_test.go, line 77 at r2 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
This name will come back to bite us. Rename NewFakeLocalIP?

Done.

SaiedKazemi commented 3 years ago

hopannotation/hopannotation.go, line 27 at r2 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
This should be exposed in an export_test.go file,

I don't follow because we don't have export.go. This variable is used in hopannotation/hopannotation_test.go so that AnnotateArchive() would call fakeArchiveHopAnnotation() instead of archiveHopAnnotation().

SaiedKazemi commented 3 years ago

connectionlistener/connectionlistener.go, line 1 at r2 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…
s/handles/handlers/

Done.

SaiedKazemi commented 3 years ago

hopannotation/hopannotation.go, line 27 at r2 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…
Go allows two modes: whitebox testing, blackbox testing. whitebox tests use a package name that matches the primary package under test, in this case, `package hopannotation`. blackbox testing would use a similar package name with a suffix `_test`, so `package hopannotation_test`. This mode only makes the exported methods accessible. Go also allows (by convention or specification, I'm not sure) exporting private methods only for testing using a file like `export_test.go`. So, `export_test.go` would import the whitebox package to access the private methods and define new public names for the blackbox tests. If you search for "export_test.go" you can find examples of this in Go system packages. There are many examples within the m-lab org also. https://www.google.com/search?q=golang+export_test.go https://github.com/search?q=org%3Am-lab+filename%3A%22export_test.go%22&type=code I cannot find upstream documentation for this behavior.

I am familiar with whitebox and blackbox testing in Go (package <pkg> vs package <pkg>_test) but wasn't familiar with the export_test.go convention. IIUC, export_test.go is used for exporting private functions of a package for testing (i.e., making them accessible to test code). In our case here, we want to replace a private function with a fake version of it. That said, I will create export_test.go.

SaiedKazemi commented 3 years ago

connectionlistener/connectionlistener.go, line 4 at r7 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…
Yeah, this is more of an `eventsocket.Handler` implementation - https://github.com/m-lab/tcp-info/blob/master/eventsocket/client.go#L27 Maybe `connectionevent.Handler`?

As discussed offline, the three connection packages need refactoring and renaming. Will do this in a separate PR.

SaiedKazemi commented 3 years ago

hopannotation/hopannotation.go, line 53 at r7 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…
Please see new suggestions in design doc: https://docs.google.com/document/d/1Kh-YbJnZhm1KhcPFo-qN48wdpxYpNsrKzNLvKF0_0UI/edit?pli=1#heading=h.1o8g4cz8a12n

I have made the changes and will share a doc discussing what timestamp we should use for hopannotation1 datatype.

SaiedKazemi commented 3 years ago

parser/json.go, line 19 at r7 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…
I did a double take here and in hopannotation.go b/c I thought these variables would be `error` types when reading their usage. Why not make them errors?

Done.

SaiedKazemi commented 3 years ago

parser/json.go, line 170 at r7 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…
Some of this parser code may have been copied from the etl parser and contain more general logic than required for the output created by traceroute-caller directly. Does the current implementation of traceroute-caller create JSONL output with a fixed number of objects? e.g. 4? Is 3 an error? If this is necessary, is the UUID field guaranteed to be first?

As discussed offline, I'd also like to simplify this. But let me check with Greg just to be sure and send a separate PR for this.

SaiedKazemi commented 3 years ago

parser/json.go, line 180 at r7 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…
Again, if the current tool guarantees and output format, then this indexing could be simplified hopefully

Please see my previous comment.

SaiedKazemi commented 3 years ago

Yes, this is unfortunately a large PR because we didn't fully review and merge the individual commits. As discussed offline, I will categorize the changes and will send smaller PRs.

SaiedKazemi commented 3 years ago

hopannotation/hopannotation.go, line 199 at r7 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…
When combining errors, I believe it's preferable to use `%w` so that callers (within this package) could use `errors.Is` to tell if this is a specific error derived from `errCreatePath`. EDIT: I see that `errCreatePath` is a string... The naming made me expect that it was an error type.

Done.

SaiedKazemi commented 3 years ago

Closing this pull request because it has become too large. Will re-partition the changes and send smaller pull requests.