kpdyer / fteproxy

programmable proxy for censorship circumvention
https://fteproxy.org/
Apache License 2.0
149 stars 21 forks source link

socket.settimeout() conusming a lot of cpu #116

Closed oxtoacart closed 10 years ago

oxtoacart commented 10 years ago

I've been doing a little profiling of fteproxy and noticed that in my YouTube streaming test, fteproxy was spending nearly 18% of its time on calling 'socket.settimeout()'.

Looking at the code, I noticed that we have several of these calls on the critical path, such as this one and this one.

For fun, I tried increasing the default timeout from 0.01 to 0.1. This caused my cpu utilization to drop from 35% to 9% while playing a YouTube video.

I then implemented some logic to settimeout only if it's actually changed. That was good for another couple of percent, bumping me down to 7% cpu usage.

Using just the conditional setting trick puts me at about 20% cpu usage.

I don't know what other effects these changes have, but doing something about the excessive settimeout calls might be worth it.

oxtoacart commented 10 years ago

Increasing the select_timeout in recvall_from_socket() helped too. With timeouts increased and the conditional settimeout logic, I'm down to around 5% cpu utilization.

oxtoacart commented 10 years ago

Would be good to get independent approval of this, btw, as I'm a little surprised by the size of these gains.

kpdyer commented 10 years ago

Your analysis is absolutely spot on. In retrospect it's really silly to call settimeout like that. Thanks for spotting that!

See my commit against this issue. Let me know how it compares to what you've implemented. Will keep this open until you confirm.

kpdyer commented 10 years ago

Ooops. My first commit broke fteproxy on OSX. Try the latest, and let me know if that works for you!

lanterndev commented 10 years ago

Just built successfully from 8fc0a74 on a Macbook Air running 10.9, and all tests pass with the following results:

> make test
testLTB (fte.tests.bit_ops.TestEncoders) ... ok
testLTB2 (fte.tests.bit_ops.TestEncoders) ... ok
testLTB3 (fte.tests.bit_ops.TestEncoders) ... ok
testRegexEncoderRequest (fte.tests.encoder.TestEncoders) ... ok
testEncryptDecryptOneBlock (fte.tests.encrypter.TestEncoders) ... ok
testEncryptDecrypt_1 (fte.tests.encrypter.TestEncoders) ... ok
testEncryptDecrypt_2 (fte.tests.encrypter.TestEncoders) ... ok
testEncryptNoOp (fte.tests.encrypter.TestEncoders) ... ok
testOneStream (fte.tests.relay.TestRelay) ... ok
testTenSerialStreams (fte.tests.relay.TestRelay) ... ok
testReclayer_basic (fte.tests.record_layer.TestEncoders) ... ok
testReclayer_concat (fte.tests.record_layer.TestEncoders) ... ok
testUnrankRank (fte.tests.dfa.TestDFA) ... ok
testMakeDFA (fte.tests.cDFA.TestcDFA) ... ok

----------------------------------------------------------------------
Ran 14 tests in 7.672s

OK

Testing formats in /Users/_pants/Code/fteproxy/fte/defs/20131224.json
 + SUCCESS, format_name="manual-http"
    - client-server regex: "^GET\ \/([a-zA-Z0-9\.\/]*) HTTP/1\.1\r\n\r\n$"
    - server-client regex: "^HTTP/1\.1\ 200 OK\r\nContent-Type:\ ([a-zA-Z0-9]+)\r\n\r\n\C*$"
    - test duration: 6.29 seconds
    - goodput: 81.37Mbps
 + SUCCESS, format_name="dummy"
    - client-server regex: "^\C+$"
    - server-client regex: "^\C+$"
    - test duration: 6.79 seconds
    - goodput: 75.35Mbps
 + SUCCESS, format_name="manual-smb"
    - client-server regex: "^\x00\x00\x00\x7c\xFF\x53\x4d\x42[\x25\x72]\x00\x00\x00\x00\C{115}$"
    - server-client regex: "^\x00\x00\x00\x7c\xFF\x53\x4d\x42[\x25\x72]\x00\x00\x00\x00\C{115}$"
    - test duration: 6.87 seconds
    - goodput: 74.54Mbps
 + SUCCESS, format_name="manual-ssh"
    - client-server regex: "^SSH\-2\.0\C*$"
    - server-client regex: "^SSH\-2\.0\C*$"
    - test duration: 7.25 seconds
    - goodput: 70.66Mbps
****** ALL TESTS COMPLETE, SUCCESS!!

When I revert to 9f1b9e7e, the tests give these results:

testLTB (fte.tests.bit_ops.TestEncoders) ... ok
testLTB2 (fte.tests.bit_ops.TestEncoders) ... ok
testLTB3 (fte.tests.bit_ops.TestEncoders) ... ok
testRegexEncoderRequest (fte.tests.encoder.TestEncoders) ... ok
testEncryptDecryptOneBlock (fte.tests.encrypter.TestEncoders) ... ok
testEncryptDecrypt_1 (fte.tests.encrypter.TestEncoders) ... ok
testEncryptDecrypt_2 (fte.tests.encrypter.TestEncoders) ... ok
testEncryptNoOp (fte.tests.encrypter.TestEncoders) ... ok
testOneStream (fte.tests.relay.TestRelay) ... ok
testTenSerialStreams (fte.tests.relay.TestRelay) ... ok
testReclayer_basic (fte.tests.record_layer.TestEncoders) ... ok
testReclayer_concat (fte.tests.record_layer.TestEncoders) ... ok
testUnrankRank (fte.tests.dfa.TestDFA) ... ok
testMakeDFA (fte.tests.cDFA.TestcDFA) ... ok

----------------------------------------------------------------------
Ran 14 tests in 7.311s

OK
Testing formats in /Users/_pants/Code/fteproxy/fte/defs/20131224.json
 + SUCCESS, format_name="manual-http"
    - client-server regex: "^GET\ \/([a-zA-Z0-9\.\/]*) HTTP/1\.1\r\n\r\n$"
    - server-client regex: "^HTTP/1\.1\ 200 OK\r\nContent-Type:\ ([a-zA-Z0-9]+)\r\n\r\n\C*$"
    - test duration: 1.34 seconds
    - goodput: 95.63Mbps
 + SUCCESS, format_name="dummy"
    - client-server regex: "^\C+$"
    - server-client regex: "^\C+$"
    - test duration: 1.37 seconds
    - goodput: 93.66Mbps
 + SUCCESS, format_name="manual-smb"
    - client-server regex: "^\x00\x00\x00\x7c\xFF\x53\x4d\x42[\x25\x72]\x00\x00\x00\x00\C{115}$"
    - server-client regex: "^\x00\x00\x00\x7c\xFF\x53\x4d\x42[\x25\x72]\x00\x00\x00\x00\C{115}$"
    - test duration: 1.45 seconds
    - goodput: 88.23Mbps
 + SUCCESS, format_name="manual-ssh"
    - client-server regex: "^SSH\-2\.0\C*$"
    - server-client regex: "^SSH\-2\.0\C*$"
    - test duration: 1.33 seconds
    - goodput: 96.12Mbps
****** ALL TESTS COMPLETE, SUCCESS!!

So test durations have increased and goodput went down... maybe that's expected?

When I tested the integration with our FTEMain proxy before and after (by loading (and reloading) our customary heavy Wikipedia page), the new version performed much better, so that's the important thing. Looking better to you, @oxtoacart?

Thanks for the quick fixes on this, @kpdyer!

oxtoacart commented 10 years ago

Yes, just pulled and tested with our FTEMain, and cpu utilization is near what I was seeing with my hacks.

@kpdyer - thanks for the quick turnaround!

kpdyer commented 10 years ago

Actually, I think I have one more improvement in this space.

@skivvies @oxtoacart Can you try 5eb380e with values of 0.01 or 0.001 for runtime.fte.relay.socket_timeout in fte/conf.py?

oxtoacart commented 10 years ago

@kpdyer This last change drops CPU utilization even further! From now on I'm leaving fteproxy bugs open, just to see what you'll come up with.

myleshorton commented 10 years ago

Just wanted to chime in to say bravo on the great collaboration everyone. Does anyone say "bravo" these days? Somehow makes me feel like I should be retiring =). Good stuff.

kpdyer commented 10 years ago

Going to set runtime.fte.relay.socket_timeout to 0.001, as that works best for me.

Going to close this ticket for now.