psanford / wormhole-william

End-to-end encrypted file transfer. A magic wormhole CLI and API in Go (golang).
MIT License
1.07k stars 54 forks source link

Audit usages of runtime reflection #83

Closed Jacalz closed 2 years ago

Jacalz commented 2 years ago

The reflect package in Go is slow as it has to do runtime reflection on the types. This also has the added negative effect that binaries generally get bigger due to having to embed more type data than what otherwise would have been necessary. Using reflect is generally something that, like the unsafe package, should be avoided where possible.

I have looked at the code and some of the usages should be possible to remove by creating interfaces with getters and setters instead. For the main user-facing API, the usages are in these files:

For testing:

Removing the dependency on reflect should result in smaller binaries with generally big performance improvements for the parts where reflect was being used.

psanford commented 2 years ago

We use encoding/json which is all based on reflect. I don't have much appetite for replacing that with a non-reflection based package.

I am not likely to accept performance optimizations without profiling data from realistic workloads demonstrating that this change actually makes any difference. I am pretty skeptical that reflect is in any way our bottleneck. I would expect a profile of a normal file transfer to be dominated in wall time by network latency and in cpu time by the cryptography operations.

Jacalz commented 2 years ago

To begin, I just want to say that I am not suggesting that we use a different package than encoding/json, because that use of reflection makes a lot of sense there (to be honest, I actually prefer it to non reflection based approaches in some other languages).

I think there has been a misunderstanding here. This issue is purely focused on the usages within this codebase and how some of them actually might make more sense without reflect (even if the reflection was without the mentioned drawbacks). Reflection generally being slow is a fact but not necessarily the reason for why I think we should avoid it where we can. The reflect package should be used sparingly and saved for where it makes sense.

That being said, there are often better ways to implement algorithms, without using reflect. There is no reason to use reflection if it can be done just as good or better without it (see https://en.wikipedia.org/wiki/Law_of_the_instrument).

As I have demonstrated with #84, all of the usages within client.go could be changed to regular, readable and clear Go code without having to rely on anything more special. As an example, reading the value from the rendezvous_value field and setting that data in the field did not seem like a good practice to me.

I don't mean to cause any offence with this, but to be honest, it felt more like code from a higher level language than Go-code in my opinion.

That PR is not about performance but more about using other language features to implement the functionality with more standardised (there is a better name for it but I can't remove it at the moment) and readable code. There are certainly usages in the tests where reflect makes sense to use but I don't think it makes sense everywhere. We should audit where it is being used and remove it where it does not make sense.

Jacalz commented 2 years ago

I understand why the misunderstanding came to be. The title should perhaps have been worded differently and the summary be made a little bit clear.

The title has now been changed and I marked one of the codes as completed because that usage makes sense to me. The rendezvousservertest usage only really should drop rendezvous_value when it isn't used elsewhere in #84 but it is certainly no necessity.

psanford commented 2 years ago

If the primary reason for the change is code aesthetics, then we're basically just talking about our preferred taste when it comes to coding style. I think we're just going to disagree about this.

Using struct tags is quite common in Go, especially for metadata about serialization. I think most Go developers will be quite comfortable with how that works. But also, most developers won't ever look at this code since there's not much reason to interact directly with the mailbox server unless you are implementing a non-wormhole protocol using the mailbox server. I'm not aware of any such implementations in Go today.

Your PR diffstat is -168 +337 for a net total addition of 169 lines which is basically all boilerplate. Having that code duplicated to every message type does not seem like an improvement to me.

I'm fairly happy with how the code works today so I don't think I'm likely to accept this change.

Jacalz commented 2 years ago

Alright then. I do not agree but I will not take this any further. Thanks for your answers.