Closed oxtoacart closed 10 years ago
Increasing the select_timeout in recvall_from_socket() helped too. With timeouts increased and the conditional settimeout logic, I'm down to around 5% cpu utilization.
Would be good to get independent approval of this, btw, as I'm a little surprised by the size of these gains.
Your analysis is absolutely spot on. In retrospect it's really silly to call settimeout like that. Thanks for spotting that!
See my commit against this issue. Let me know how it compares to what you've implemented. Will keep this open until you confirm.
Ooops. My first commit broke fteproxy on OSX. Try the latest, and let me know if that works for you!
Just built successfully from 8fc0a74 on a Macbook Air running 10.9, and all tests pass with the following results:
> make test
testLTB (fte.tests.bit_ops.TestEncoders) ... ok
testLTB2 (fte.tests.bit_ops.TestEncoders) ... ok
testLTB3 (fte.tests.bit_ops.TestEncoders) ... ok
testRegexEncoderRequest (fte.tests.encoder.TestEncoders) ... ok
testEncryptDecryptOneBlock (fte.tests.encrypter.TestEncoders) ... ok
testEncryptDecrypt_1 (fte.tests.encrypter.TestEncoders) ... ok
testEncryptDecrypt_2 (fte.tests.encrypter.TestEncoders) ... ok
testEncryptNoOp (fte.tests.encrypter.TestEncoders) ... ok
testOneStream (fte.tests.relay.TestRelay) ... ok
testTenSerialStreams (fte.tests.relay.TestRelay) ... ok
testReclayer_basic (fte.tests.record_layer.TestEncoders) ... ok
testReclayer_concat (fte.tests.record_layer.TestEncoders) ... ok
testUnrankRank (fte.tests.dfa.TestDFA) ... ok
testMakeDFA (fte.tests.cDFA.TestcDFA) ... ok
----------------------------------------------------------------------
Ran 14 tests in 7.672s
OK
Testing formats in /Users/_pants/Code/fteproxy/fte/defs/20131224.json
+ SUCCESS, format_name="manual-http"
- client-server regex: "^GET\ \/([a-zA-Z0-9\.\/]*) HTTP/1\.1\r\n\r\n$"
- server-client regex: "^HTTP/1\.1\ 200 OK\r\nContent-Type:\ ([a-zA-Z0-9]+)\r\n\r\n\C*$"
- test duration: 6.29 seconds
- goodput: 81.37Mbps
+ SUCCESS, format_name="dummy"
- client-server regex: "^\C+$"
- server-client regex: "^\C+$"
- test duration: 6.79 seconds
- goodput: 75.35Mbps
+ SUCCESS, format_name="manual-smb"
- client-server regex: "^\x00\x00\x00\x7c\xFF\x53\x4d\x42[\x25\x72]\x00\x00\x00\x00\C{115}$"
- server-client regex: "^\x00\x00\x00\x7c\xFF\x53\x4d\x42[\x25\x72]\x00\x00\x00\x00\C{115}$"
- test duration: 6.87 seconds
- goodput: 74.54Mbps
+ SUCCESS, format_name="manual-ssh"
- client-server regex: "^SSH\-2\.0\C*$"
- server-client regex: "^SSH\-2\.0\C*$"
- test duration: 7.25 seconds
- goodput: 70.66Mbps
****** ALL TESTS COMPLETE, SUCCESS!!
When I revert to 9f1b9e7e, the tests give these results:
testLTB (fte.tests.bit_ops.TestEncoders) ... ok
testLTB2 (fte.tests.bit_ops.TestEncoders) ... ok
testLTB3 (fte.tests.bit_ops.TestEncoders) ... ok
testRegexEncoderRequest (fte.tests.encoder.TestEncoders) ... ok
testEncryptDecryptOneBlock (fte.tests.encrypter.TestEncoders) ... ok
testEncryptDecrypt_1 (fte.tests.encrypter.TestEncoders) ... ok
testEncryptDecrypt_2 (fte.tests.encrypter.TestEncoders) ... ok
testEncryptNoOp (fte.tests.encrypter.TestEncoders) ... ok
testOneStream (fte.tests.relay.TestRelay) ... ok
testTenSerialStreams (fte.tests.relay.TestRelay) ... ok
testReclayer_basic (fte.tests.record_layer.TestEncoders) ... ok
testReclayer_concat (fte.tests.record_layer.TestEncoders) ... ok
testUnrankRank (fte.tests.dfa.TestDFA) ... ok
testMakeDFA (fte.tests.cDFA.TestcDFA) ... ok
----------------------------------------------------------------------
Ran 14 tests in 7.311s
OK
Testing formats in /Users/_pants/Code/fteproxy/fte/defs/20131224.json
+ SUCCESS, format_name="manual-http"
- client-server regex: "^GET\ \/([a-zA-Z0-9\.\/]*) HTTP/1\.1\r\n\r\n$"
- server-client regex: "^HTTP/1\.1\ 200 OK\r\nContent-Type:\ ([a-zA-Z0-9]+)\r\n\r\n\C*$"
- test duration: 1.34 seconds
- goodput: 95.63Mbps
+ SUCCESS, format_name="dummy"
- client-server regex: "^\C+$"
- server-client regex: "^\C+$"
- test duration: 1.37 seconds
- goodput: 93.66Mbps
+ SUCCESS, format_name="manual-smb"
- client-server regex: "^\x00\x00\x00\x7c\xFF\x53\x4d\x42[\x25\x72]\x00\x00\x00\x00\C{115}$"
- server-client regex: "^\x00\x00\x00\x7c\xFF\x53\x4d\x42[\x25\x72]\x00\x00\x00\x00\C{115}$"
- test duration: 1.45 seconds
- goodput: 88.23Mbps
+ SUCCESS, format_name="manual-ssh"
- client-server regex: "^SSH\-2\.0\C*$"
- server-client regex: "^SSH\-2\.0\C*$"
- test duration: 1.33 seconds
- goodput: 96.12Mbps
****** ALL TESTS COMPLETE, SUCCESS!!
So test durations have increased and goodput went down... maybe that's expected?
When I tested the integration with our FTEMain proxy before and after (by loading (and reloading) our customary heavy Wikipedia page), the new version performed much better, so that's the important thing. Looking better to you, @oxtoacart?
Thanks for the quick fixes on this, @kpdyer!
Yes, just pulled and tested with our FTEMain, and cpu utilization is near what I was seeing with my hacks.
@kpdyer - thanks for the quick turnaround!
Actually, I think I have one more improvement in this space.
@skivvies @oxtoacart Can you try 5eb380e with values of 0.01
or 0.001
for runtime.fte.relay.socket_timeout
in fte/conf.py
?
@kpdyer This last change drops CPU utilization even further! From now on I'm leaving fteproxy bugs open, just to see what you'll come up with.
Just wanted to chime in to say bravo on the great collaboration everyone. Does anyone say "bravo" these days? Somehow makes me feel like I should be retiring =). Good stuff.
Going to set runtime.fte.relay.socket_timeout
to 0.001
, as that works best for me.
Going to close this ticket for now.
I've been doing a little profiling of fteproxy and noticed that in my YouTube streaming test, fteproxy was spending nearly 18% of its time on calling 'socket.settimeout()'.
Looking at the code, I noticed that we have several of these calls on the critical path, such as this one and this one.
For fun, I tried increasing the default timeout from 0.01 to 0.1. This caused my cpu utilization to drop from 35% to 9% while playing a YouTube video.
I then implemented some logic to settimeout only if it's actually changed. That was good for another couple of percent, bumping me down to 7% cpu usage.
Using just the conditional setting trick puts me at about 20% cpu usage.
I don't know what other effects these changes have, but doing something about the excessive settimeout calls might be worth it.