lightstep / lightstep-tracer-go

The Lightstep distributed tracing library for Go
https://lightstep.com
MIT License

Importing lightstep-tracer-go bloats programs and significantly slows down build time #149

Open achille-roussel opened 6 years ago

achille-roussel commented 6 years ago

Hello, I've recently integrated a program with lightstep using the lightstep-tracer-go package. The size of the compiled binary went from 17MB to 24MB and build time increased by over 2x. The package seems to have a lot of dependencies. To give an idea of the situation:

# in lightstep-tracer-go, listing external dependencies
$ govendor list +e | wc -l
      69

Some of those dependencies are sub-packages of grpc and others but still... it's quite a lot.

What's most problematic is that a lot of this code is unused in the resulting application (a program will use only one of grpc, thrift, or http to send spans to the collectors), but the compiler cannot eliminate it. It's not dead code, because the protocol is selected at runtime through the options passed to the tracer when it's created.
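To illustrate the point about runtime selection, here is a minimal sketch. The `sender` interface, the three implementations, and the `PROTOCOL` environment variable are all hypothetical and are not lightstep-tracer-go's actual API. Because `newSender` references every implementation and the choice depends on a value only known at runtime, all three stay reachable and end up in the binary.

```go
package main

import (
	"fmt"
	"os"
)

// sender is a hypothetical stand-in for a span transport.
type sender interface {
	Name() string
}

// Hypothetical implementations; in a real tracer each one would pull in
// its own dependency tree (grpc, thrift, net/http).
type grpcSender struct{}
type thriftSender struct{}
type httpSender struct{}

func (grpcSender) Name() string   { return "grpc" }
func (thriftSender) Name() string { return "thrift" }
func (httpSender) Name() string   { return "http" }

// newSender picks an implementation from a runtime value. Every branch
// is reachable, so none of the implementations can be dropped as dead code.
func newSender(protocol string) sender {
	switch protocol {
	case "grpc":
		return grpcSender{}
	case "thrift":
		return thriftSender{}
	default:
		return httpSender{}
	}
}

func main() {
	// PROTOCOL is an assumed variable name, used only to make the
	// selection a genuine runtime decision.
	fmt.Println(newSender(os.Getenv("PROTOCOL")).Name())
}
```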

Here's an example showing the cost of integrating lightstep-tracer-go in one of our production services (measurements done on OSX 17.5.0 with Go 1.10).

# without lightstep-tracer-go
$ time go build ./cmd/main/

real    0m2.540s
user    0m3.177s
sys     0m0.549s
$ ls -l main
-rwxr-xr-x 1 achille achille 17828770 Jul  7 00:23 main
# with lightstep-tracer-go
$ time go build ./cmd/main/

real    0m5.854s
user    0m1.911s
sys     0m4.510s
$ ls -l main
-rwxr-xr-x 1 achille achille 24043642 Jul  7 00:23 main

The main problem with slower builds is that they get in the way of developer productivity: more than doubling build time is a significant cost when we're trying to iterate quickly on a project. The example I'm giving is from a project that already has quite a bit of code. To put things in perspective, here is the number of lines there were before adding lightstep-tracer-go (but including all other dependencies):

$ loc -u
--------------------------------------------------------------------------------
 Language             Files        Lines        Blank      Comment         Code
--------------------------------------------------------------------------------
 Go                     677       256832        25115        20994       210723
 Markdown                19         2439          537            0         1902
 Perl                     8         1141          142          140          859
 Assembly                23          877          217           24          636
 Bourne Shell             3          787           65          260          462
 JSON                     1          255            0            0          255
 Python                   2          128           24            6           98
 YAML                     4           83           11            6           66
 Makefile                 2           48           11            0           37
 C                        1           47           11            7           29
 JavaScript               2           16            4            0           12
--------------------------------------------------------------------------------
 Total                  742       262653        26137        21437       215079
--------------------------------------------------------------------------------

Then here is the number of lines of code in lightstep-tracer-go (including this package's dependencies):

$ loc -u
--------------------------------------------------------------------------------
 Language             Files        Lines        Blank      Comment         Code
--------------------------------------------------------------------------------
 Go                     463       257034        19460        28965       208609
 Python                  27         4882          847          587         3448
 Makefile                 8         1385          164          103         1118
 C                        2         1257          234          107          916
 Perl                     8         1141          142          140          859
 Assembly                23          877          217           24          636
 Bourne Shell             6          913           80          287          546
 Markdown                12          699          189            0          510
 Protobuf                 8         1730          188         1112          430
 JSON                     1          299            0            0          299
 C/C++ Header             1          247           40           55          152
 Toml                     1           19            4            0           15
 YAML                     1            7            1            0            6
--------------------------------------------------------------------------------
 Total                  561       270490        21566        31380       217544
--------------------------------------------------------------------------------

So adding tracing with lightstep here is doubling the total line count of the program, which aligns with the 2x increase in build time in this case.

I totally get that adding new features requires pulling in a bunch of code, but looking at your implementation it seems like lots could be done to make things better. For example, you're importing golang.org/x/net/http2 to use HTTP2 over TLS, but my understanding is that this is already supported by the standard library, so the dependency could be avoided (I'm totally missing internal context tho, so feel free to just let me know what the design decisions behind that choice were). Overall it would be ideal if only the features needed by a lightstep client were included in the final compiled program instead of bringing in tons of unnecessary code.
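As a quick check of the standard-library claim, the sketch below starts a local TLS test server and reports which protocol the `net/http` client negotiated, without importing golang.org/x/net/http2. The helper name `negotiatedProtocol` is made up for this example, and the `EnableHTTP2` field on `httptest.Server` needs Go 1.14 or newer. It only covers HTTP/2 over TLS and says nothing about why the tracer imports the x/net package.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// negotiatedProtocol (a hypothetical helper) starts a local TLS test
// server and returns the protocol that the standard-library client
// negotiated with it.
func negotiatedProtocol() (string, error) {
	srv := httptest.NewUnstartedServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprint(w, "ok")
		}))
	srv.EnableHTTP2 = true // advertise h2 during the TLS handshake
	srv.StartTLS()
	defer srv.Close()

	// srv.Client() returns a client that trusts the test certificate.
	resp, err := srv.Client().Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return resp.Proto, nil
}

func main() {
	proto, err := negotiatedProtocol()
	if err != nil {
		panic(err)
	}
	fmt.Println(proto)
}
```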

I'd love to get your take on the issue, and I'm available if you need help exploring how to improve the situation.

joeblubaugh commented 6 years ago

Hi @achille-roussel, thanks for doing the timing measurements while reporting this issue. We do still need to support all three transports, but maybe we can move configuration of the transport from run-time to compile time to avoid the large build size for users who don't need it.

We've been batting around one approach here and I'd be interested to hear your opinion on it. We'd create a Transport interface or similar for the Tracer, and move the code that implements this transport interface into independent packages in this library. Initializing the tracer would require a type that implements the Transport interface, which could be imported from one of these packages. Perhaps we'd depend on the HTTP client by default because it has the smallest footprint.
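A rough sketch of what this could look like follows. Only the name `Transport` comes from the proposal above. The method set, the `Span` type, `NewTracer`, and `memoryTransport` are assumptions made for illustration, not the library's API. In a real layout each transport implementation would live in its own package, so a program would link only the one it imports.

```go
package main

import (
	"errors"
	"fmt"
)

// Span is a hypothetical, simplified span record.
type Span struct {
	Operation string
}

// Transport is the interface the tracer depends on. The single Send
// method is an assumption for this sketch.
type Transport interface {
	Send(spans []Span) error
}

// Tracer buffers spans and flushes them through the injected Transport.
type Tracer struct {
	transport Transport
	buffer    []Span
}

// NewTracer requires the caller to supply a Transport, so the tracer
// package itself never imports grpc, thrift, or an HTTP client.
func NewTracer(t Transport) (*Tracer, error) {
	if t == nil {
		return nil, errors.New("tracer: nil transport")
	}
	return &Tracer{transport: t}, nil
}

// Record adds a span to the in-memory buffer.
func (t *Tracer) Record(s Span) {
	t.buffer = append(t.buffer, s)
}

// Flush sends the buffered spans and clears the buffer on success.
func (t *Tracer) Flush() error {
	if err := t.transport.Send(t.buffer); err != nil {
		return err
	}
	t.buffer = nil
	return nil
}

// memoryTransport stands in for an implementation that would normally
// be imported from its own package.
type memoryTransport struct {
	sent []Span
}

func (m *memoryTransport) Send(spans []Span) error {
	m.sent = append(m.sent, spans...)
	return nil
}

func main() {
	mt := &memoryTransport{}
	tracer, err := NewTracer(mt)
	if err != nil {
		panic(err)
	}
	tracer.Record(Span{Operation: "GET /users"})
	if err := tracer.Flush(); err != nil {
		panic(err)
	}
	fmt.Println("spans sent:", len(mt.sent))
}
```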

You're right that there are probably other cleanups we could make that would reduce overall LoC, for example replacing all instances of x/net/context now that the standard library includes it. We would be very open to getting specific issues on the tracer for those individual changes; I think the scope is a little too large to discuss them all on this one issue.

achille-roussel commented 6 years ago

Sorry I missed that you had responded here.

Decoupling the tracer and transport logic with some kind of Transport interface (similar to what the net/http package does with Client and Transport) seems like a really good move. It means programs could even provide their own, or wrap the transport to do metric collection or logging at this level, which can be extremely useful for debugging purposes.

I've been collecting more performance profiles on the lightstep tracer recently, I'll open issues/PRs for individual changes.

achille-roussel commented 6 years ago

There's still room for improvement, but it's making progress with the two PRs you merged.

Before:

-rwxr-xr-x  1 achille achille 21285265 Oct 23 22:16 netc

After:

-rwxr-xr-x  1 achille achille 21007845 Oct 23 22:17 netc

Without the lightstep client:

-rwxr-xr-x  1 achille achille 17628361 Oct 23 22:20 netc

Building with Go 1.11 has helped as well. When I initially opened this issue the binary size was 24MB (built with Go 1.10); it went down to 21MB after upgrading the compiler.