gotd / td

Telegram client, in Go. (MTProto API)

gen: reduce binary size #87

Closed: ernado closed this issue 3 years ago

ernado commented 3 years ago

We have lots of mostly duplicated string literals that bloat the binary size:

return fmt.Errorf("unable to decode destroy_session#e7512126: field session_id: %w", err)

This can probably be reduced with something like the following:

type decodingErr struct {
   Type string
   TypeID int32
   // Field string?
}
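
For illustration, a minimal sketch of this idea (hypothetical names, and uint32 for the type ID so the constant fits): the generated code would fill in a shared error value instead of formatting a unique string literal at every call site.

package sketch

import "fmt"

// DecodeError carries the constructor name, type ID and field name, so the
// generated code no longer needs a distinct formatted literal per call site.
type DecodeError struct {
    Type   string
    TypeID uint32
    Field  string
    Err    error
}

func (e *DecodeError) Error() string {
    return fmt.Sprintf("unable to decode %s#%x: field %s: %v", e.Type, e.TypeID, e.Field, e.Err)
}

func (e *DecodeError) Unwrap() error { return e.Err }

// A generated call site would then build the error instead of baking in a
// unique string literal:
func wrapSessionIDErr(err error) error {
    return &DecodeError{Type: "destroy_session", TypeID: 0xe7512126, Field: "session_id", Err: err}
}
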
$ go run github.com/jondot/goweight | head -n 10
   45 MB github.com/gotd/td/tg
  6.0 MB runtime
  5.3 MB net/http
  2.4 MB net
  2.4 MB crypto/tls
  2.0 MB reflect
  1.4 MB github.com/gotd/td/internal/mt
  1.3 MB math/big
  1.2 MB go.uber.org/zap/zapcore
  1.2 MB syscall
shadowspore commented 3 years ago

I removed the additional error information ('unable to decode', 'field', 'Box', 'as nil', etc.) from the gen templates, and the binary size decreased by ~100 KB (~400 KB without any error wrapping). I don't think it's worth it. A rough sketch of the two variants being compared is below.
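
This is only an illustration (not the real gen templates): every wrapped call site bakes one unique literal into the binary's string table, while the bare variant returns the underlying error unchanged.

package sketch

import "fmt"

// With wrapping: one unique string literal per generated type/field pair.
func decodeSessionID(read func() (int64, error)) (int64, error) {
    v, err := read()
    if err != nil {
        return 0, fmt.Errorf("unable to decode destroy_session#e7512126: field session_id: %w", err)
    }
    return v, nil
}

// Without wrapping: no extra literal, but the caller loses the context.
func decodeSessionIDBare(read func() (int64, error)) (int64, error) {
    return read()
}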

misupov commented 3 years ago

Since I'm the one who pointed out this issue on habr.com, I'd like to share my thoughts here. @zweihander, @ernado, I'm a newbie to golang, so I might not know all the stuff, but your comment looks odd to me. How come a simple library weighs 45MB? I'm one of the authors of https://github.com/spalt08/mtproto-js which was created as part of the https://github.com/spalt08/telegram-js project. Our goal was to create a tiny web client, and one of the most important criteria was the size of the final bundle. We managed to compact the whole Telegram API interaction (layer 113) into 89KB gzipped. (Yes, the fully functioning JavaScript lib for the TG API is 89KB, including not only the message parser/builder, but also the session level with crypto encoding/decoding, Diffie–Hellman stuff, reconnections, etc.) So when you say 100K is not worth it, I can only say: okay...

ernado commented 3 years ago

Hey @misupov, nice to meet you here :)

The 45MB is the size reported by a binary inspection tool; it looks like it does not account for compression/de-duplication/etc. I'm not sure why it is so large.

For example, the current ./cmd/gotdecho (echo bot implementation) binary on linux/amd64 with go 1.15 has a size of about 11 MB (11,452 KB) and is statically compiled, i.e. the runtime and all dependencies are included and deployment is as simple as copying one file. So it is not entirely fair to compare it with gzipped JS that requires a browser or node.js to run.

So, it is only 11 MB, and if we can strip ~400 KB, that is just a 3-4% reduction and probably not worth it, at least at the current stage of development. I'll keep this issue open and we will probably strip 1-2% by reducing string literal duplication.

misupov commented 3 years ago

@ernado nice to meet you too :) Ok, got it. It's still a mystery to me why a language that compiles source code down to assembly can weigh more bytes than JS, but ok. I mean, even if we remove all that runtime and GC stuff, your implementation will still be heavier than the JavaScript one.

ernado commented 3 years ago

your implementation will still be heavier than the JavaScript one.

The 89KB of gzipped code will be ungzipped, interpreted by the JS engine, and JIT-compiled by the runtime.

The 11mb binary will just launch on any Linux machine.

Actually, I'm missing the whole point of comparing these values; they are from different worlds.

The main reason it takes relatively more space is that we use code generation to provide a fully static, type-safe implementation for the whole layer of the Telegram schema. So there is just no runtime overhead; the code for serialization and de-serialization is fully generated and extremely fast.

For example, we can encode updateShortMessage#2296d2c8 in just 17 ns with zero heap allocations, which is about 2.3 gigabytes per second, or 58 million operations per second on a single core:

package proto_test

import (
    "testing"

    "github.com/gotd/td/bin"
    "github.com/gotd/td/tg"
)

func BenchmarkUpdateShortMessage_Encode(b *testing.B) {
    buf := &bin.Buffer{}
    msg := &tg.UpdateShortMessage{
        ID:       123,
        Message:  "Hello there",
        Date:     10041234,
        Pts:      100,
        UserID:   1,
        PtsCount: 35,
    }
    msg.SetMentioned(true)
    // Calculating resulting message size.
    if err := msg.Encode(buf); err != nil {
        b.Fatal(err)
    }
    b.SetBytes(int64(buf.Len()))

    b.ReportAllocs()
    b.ResetTimer()

    for i := 0; i < b.N; i++ {
        buf.Reset()
        if err := msg.Encode(buf); err != nil {
            b.Fatal(err)
        }
    }
}
goos: linux
goarch: amd64
pkg: github.com/gotd/td/internal/proto
BenchmarkUpdateShortMessage_Encode
BenchmarkUpdateShortMessage_Encode-32       70067853            17.0 ns/op  2354.22 MB/s           0 B/op          0 allocs/op
misupov commented 3 years ago

These numbers are impressive! I ran your test for the js-client and it's much slower: 2.1 microseconds per message on my machine (although there's room for optimization, so I think it can be reduced by 2-5 times, still definitely not 17ns). But a real application that uses your lib will hardly ever face a situation where it needs to parse more than 50-100 objects per second without getting a FLOOD_WAIT error. Don't get me wrong, I'm always for performance, and your results are really awesome, but golang nanos are comparable to js micros when it comes to real-world cases and we consider network speed, flood detection, etc.

I really like this project and wish you luck with it.

ernado commented 3 years ago

Don't get me wrong, I'm always for performance, and your results are really awesome, but golang nanos are comparable to js micros when it comes to real-world cases and we consider network speed, flood detection, etc.

Can't agree more, those 17 nanos have no real meaning and I did that benchmark just for fun.

Your project is awesome too! 89 KB is an impressive result; I love it when frontend developers actually care about payload size, it's pretty rare nowadays (and TypeScript is really good).

tie commented 3 years ago

Turns out a lot of unnecessary Encode/Decode methods and type information gets embedded in the binary when tg.Client and tg.UpdateDispatcher are used as interfaces.
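
Roughly speaking (a generic Go sketch, not gotd code): the linker can usually drop methods of a concrete type that never escapes into an interface, but once a value is boxed into an interface, its method set and runtime type information generally have to be kept.

package main

import "fmt"

type message struct{ id int }

// Only Encode is called below, but once a message value is stored in an
// interface, the linker generally has to keep its method set and type
// descriptor, so Decode can end up in the binary even though nothing calls it.
func (m message) Encode() string        { return fmt.Sprintf("msg#%d", m.id) }
func (m message) Decode(s string) error { return nil }

type encoder interface{ Encode() string }

func main() {
    var e encoder = message{id: 1} // interface conversion defeats dead-code elimination
    fmt.Println(e.Encode())
}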

ernado commented 3 years ago

Some additional R&D:

For amd64 on linux, the binary size for a simple bot is just 9.2M (even with tg.Client or the dispatcher used as interfaces). Our gotd/bot is 37M, but most of that size comes from prometheus and pebble; without prometheus it is just 11.7M.

A hello world on the labstack/echo HTTP framework is 7M, which is pretty close to the simple bot size, only 2.2M less. The minimum for a hello world HTTP server that uses only the standard library is 6M.

A basic HTTP server that calls the mongo driver to fetch data is 16M, which is already significantly bigger than gotd's 9.2M.

Given these numbers, I'm going to conclude that binary size is not an issue and the possible gains are too minor to care about (and would also evaporate once new tdlib-like sugared helpers are implemented), so I'm closing this until we really start eating lots of binary size (like ~20M for a simple bot).