925dk opened 7 years ago
It would be useful if the max TLS record size was customisable via Config.
There was some general discussion about adding knobs in #14376 (although #14376 is about a different problem). I'll turn this into a proposal and we'll see what people think about this.
cc @tombergan @bradfitz
Specifically, the TLS Extension Definition is RFC 6066 "Maximum Fragment Length Negotiation", seen here: https://tools.ietf.org/html/rfc6066#page-8
ping @tombergan does this relate to any of your recent experiments with packet sizes?
It relates but looks different. The feature I added was to automatically use a smaller packet size early in the connection. It looks like @925dk is requesting control over a protocol-level config (MBEDTLS_SSL_MAX_CONTENT_LEN). I think the question of whether or not to support that config is better directed to @agl.
For clarification, this feature just negotiates the maximum size of a fragment (record) between a client and a server. The default maximum fragment size is 2^14, which means that memory-constrained devices must hold a 2^14-byte (16 KiB) buffer.
For this to work with golang, the server need only negotiate the max fragment size (2^9, 2^10, 2^11, or 2^12) and then promise never to send a fragment larger than that size.
There's probably more details involved, but that's my basic understanding.
Negotiating the max size would be elegant, but a simple knob for setting it would also do. Either way works for me.
/cc @agl @FiloSottile
Since there's an RFC 6066 extension for this use case already, and it doesn't seem harder to implement than a Config knob, I would be for implementing the extension and not exposing any new knob.
Wondering if we should worry about the DoS potential of the lowest values, 2^9 and 2^10. That would be a lot of fragmentation.
@925dk What ballpark sizes does your application need?
@FiloSottile The lower ones 2^9 or 2^10.
@FiloSottile FWIW, see this conversation. Google frontends have been using small record sizes (about 2^10) for the first 1MB of data sent on every connection for years with no known issues. The connection reverts to small records after 1 second of idle time. Given this experience, I wouldn't be concerned about DoS potential for a value of 2^10, unless the server is very highly constrained. I don't know of servers that regularly use records of size 2^9, so I can't comment on that value.
Yeah, I think we do that at the beginning of a connection already. This would allow an attacker to make it happen for a whole high-bandwidth connection, but I'm not too concerned either. A 2x bandwidth multiplier is not big in terms of intentional DoS. I'll run a benchmark to check the CPU load stays in the same order of magnitude, but then I'd be for implementing the RFC 6066 extension (and might just do it myself).
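To put "a lot of fragmentation" in numbers, here is a back-of-the-envelope sketch of how many records are needed per megabyte at each negotiable size (an illustration only; real CPU cost also depends on the cipher and on syscall batching):

```go
package main

import "fmt"

// recordCount returns how many TLS records are needed to carry
// total bytes of plaintext at a given maximum fragment size.
func recordCount(total, frag int) int {
	return (total + frag - 1) / frag
}

func main() {
	const total = 1 << 20 // 1 MiB of plaintext
	base := recordCount(total, 1<<14) // default 2^14 fragment size
	for _, frag := range []int{1 << 9, 1 << 10, 1 << 11, 1 << 12, 1 << 14} {
		n := recordCount(total, frag)
		fmt.Printf("fragment %5d: %4d records (%2dx the default's per-record work)\n",
			frag, n, n/base)
	}
}
```

So 2^9 means 32x as many records (and per-record seals, MACs, and header bytes) as the 2^14 default, which lines up with the throughput concern above.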
I think this is waiting on @FiloSottile's test but I don't see any objections being raised, so this seems likely to be accepted.
ping @FiloSottile
Sorry, fell off my radar.
The cost of 2^9 is pretty intense, a 10x drop in throughput. Even 2^10 brings a 5x slowdown.
name old speed new speed delta
Throughput/MaxPacket/1MB-4 40.7MB/s ± 2% 253.9MB/s ± 1% +523.33% (p=0.036 n=5+3)
Throughput/MaxPacket/2MB-4 42.1MB/s ± 6% 307.5MB/s ± 9% +631.19% (p=0.036 n=5+3)
Throughput/MaxPacket/4MB-4 41.9MB/s ± 6% 356.6MB/s ± 3% +751.13% (p=0.036 n=5+3)
Throughput/MaxPacket/8MB-4 42.1MB/s ± 8% 378.6MB/s ± 8% +798.86% (p=0.036 n=5+3)
Throughput/MaxPacket/16MB-4 43.8MB/s ± 6% 425.9MB/s ± 1% +873.11% (p=0.036 n=5+3)
Throughput/MaxPacket/32MB-4 43.2MB/s ± 9% 417.4MB/s ± 9% +865.31% (p=0.036 n=5+3)
Throughput/MaxPacket/64MB-4 43.1MB/s ± 8% 442.2MB/s ± 1% +925.67% (p=0.036 n=5+3)
Throughput/DynamicPacket/1MB-4 41.1MB/s ± 3% 246.7MB/s ± 1% +499.91% (p=0.036 n=5+3)
Throughput/DynamicPacket/2MB-4 41.3MB/s ±10% 316.2MB/s ± 2% +666.25% (p=0.036 n=5+3)
Throughput/DynamicPacket/4MB-4 43.0MB/s ± 5% 366.9MB/s ± 1% +752.23% (p=0.036 n=5+3)
Throughput/DynamicPacket/8MB-4 43.4MB/s ± 7% 402.3MB/s ± 1% +826.56% (p=0.036 n=5+3)
Throughput/DynamicPacket/16MB-4 44.6MB/s ± 1% 422.0MB/s ± 1% +846.46% (p=0.036 n=5+3)
Throughput/DynamicPacket/32MB-4 44.3MB/s ± 2% 439.3MB/s ± 0% +890.77% (p=0.036 n=5+3)
Throughput/DynamicPacket/64MB-4 43.0MB/s ±13% 446.2MB/s ± 0% +938.22% (p=0.036 n=5+3)
Making the benchmarks unidirectional, which seems to match a server scenario better, halves the cost. That said, I'm not 100% confident in the benchmarks, as the CPU profile shows 80-90% of the time in the syscall package. Either way, I'm not sure all deployments would be happy with us introducing a client-initiated 5x resource multiplier by default.
It might be more prudent to add a MinimumRecordSize int Config option, which disables the extension when 0 and allows negotiation when set. (Or just skip the extension and introduce RecordSize int, I suppose.) I can implement either, but for the decision I defer to @agl.
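Whichever knob is chosen, it would need validation against the four values RFC 6066 allows. A sketch of what that check might look like (the MinimumRecordSize name is only the proposal from this thread, not real crypto/tls API):

```go
package main

import "fmt"

// validFragmentSizes mirrors the four fragment lengths RFC 6066
// permits a client to request: 2^9, 2^10, 2^11, 2^12.
var validFragmentSizes = []int{1 << 9, 1 << 10, 1 << 11, 1 << 12}

// checkRecordSize sketches the validation a hypothetical
// Config.MinimumRecordSize knob would need: 0 disables the
// extension; any other value must be one RFC 6066 allows.
func checkRecordSize(n int) error {
	if n == 0 {
		return nil // extension disabled, default 2^14 records
	}
	for _, v := range validFragmentSizes {
		if n == v {
			return nil
		}
	}
	return fmt.Errorf("tls: invalid record size %d", n)
}

func main() {
	fmt.Println(checkRecordSize(1024), checkRecordSize(1000))
}
```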
This (RFC 6066) has just been implemented in OpenSSL - https://github.com/openssl/openssl/commit/cf72c7579201086cee303eadcb60bd28eff78dd9. Just wanted to flag, if useful for you guys @agl @FiloSottile.
Thoughts, @agl @FiloSottile @tombergan?
@FiloSottile Any followup on this?
Based on discussion with @agl, we will support the extension with no config option: the server will honor it, but we won't expose a way to request it. We will treat the performance degradation as a performance issue and try to improve the throughput there.
Hey just wanted to poke this thread and say I'd love this feature!
I'm a go dev so I'd be happy to help make it happen with some direction/help :)
+1 to this request. Reasoning:
The "standard" ssh implementation on Linux also has the 16k buffer size. HPN-SSH is a series of patches to OpenSSH, which allow the TCP window and the application's (ssh) buffer to grow during large transfers.
We (an ISP) need to transfer terabytes daily. Testing with HPN-SSH showed up to a 6.5x speed increase compared to the standard ssh (tcpdump showed the TCP window growing to up to 12Mb during such transfers).
Re: fear of DoSing: it's only us who can DoS ourselves in such setup 😄
Without this feature we'd either have to give up on using Go on this project, or only use Layer-2 protected connections (not all of them are).
CC @golang/security
Doesn't look like this happened for 1.19. Moving to backlog. Please recategorize as appropriate.
The team here at Blues Inc uses (and loves) golang for our cloud service - we've been all-in since 2017 - and we use WolfSSL within our device-side firmware (under LwIP+FreeRTOS). FWIW I'd like to reiterate that for multitasked RTOS-based devices with even a moderate amount of memory, it's challenging to ensure that in all cases there's enough free heap to guarantee that 16 KB allocations can succeed. As such, RFC 6066 support would be tremendously valued so that we can reduce these mallocs to something more readily achievable, such as 8 KB. @ianlancetaylor et al, we'd very much appreciate your consideration.
Moving to Go 1.23 as there was no action for Go 1.22.
Nothing is happening on this issue. Moving to backlog. Sorry.
Hi,
The default TLS record/chunk size is 16kB - and this is what golang uses (maxPlaintext in crypto/tls/common.go, I think). It would be useful if the max TLS record size was customisable via Config.
The use-case is that we have memory-restricted embedded devices (running mbed TLS) talking TLS to a golang server. The client's TLS (receive) buffer needs - in our case - to be less than 16kB to fit in memory, and the server and the client have to agree on this max size.
For mbed TLS this is customisable using MBEDTLS_SSL_MAX_CONTENT_LEN.
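For context, on the mbed TLS side the buffer size is a compile-time setting. A config fragment for mbed TLS 2.x might look like this (an illustrative sketch; newer mbed TLS versions split the limit into MBEDTLS_SSL_IN_CONTENT_LEN and MBEDTLS_SSL_OUT_CONTENT_LEN, so check your version's config reference):

```c
/* mbedtls config.h excerpt (mbed TLS 2.x-style option names) */

/* Shrink the TLS I/O buffers from the 16 KiB default to 4 KiB. */
#define MBEDTLS_SSL_MAX_CONTENT_LEN 4096

/* Enable the RFC 6066 max_fragment_length extension so the peer
 * is told about the reduced size. */
#define MBEDTLS_SSL_MAX_FRAGMENT_LENGTH
```

This only helps if the Go server honors the negotiated length, which is exactly what this issue requests.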