Closed — igrigorik closed this issue 14 years ago.
Please try again with 0.6.6, setting the new attribute Curl::Multi.default_timeout = 1
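For context, a minimal sketch of that setting (assumes curb >= 0.6.6, where this attribute was introduced, and that the value is interpreted as milliseconds, per the 100ms default mentioned in the next comment):

```ruby
require 'curb'

# Curl::Multi.default_timeout was added in 0.6.6 per this thread; the
# value appears to be in milliseconds (default ~100ms per the report).
Curl::Multi.default_timeout = 1  # wake the select loop roughly every 1ms
```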
Curl version: * $Id: curl.h,v 1.379 2009-03-02 23:05:31 bagder Exp $
Default timeout appears to be set at 100ms. Applied the patch, set timeout to 1ms, running now - will take a bit of time to get a better picture. Just eyeballing the last minute of stats though, it does seem that it's going faster, but still doesn't match the 0.4.6.x branches.
Will try reverting back a few commits to see where the problem is introduced:
- http://github.com/taf2/curb/commit/1be75e3dee822e5e760d21468d1d7b3b0472fcf4 -> use an internal Hash instead of static variables to store Ruby objects; the thought here was it should speed up Curl::Easy.new, though that's yet to be proven.
- http://github.com/taf2/curb/commit/4bdc81576ae7483f130a4eec17c73373927614fa -> timeout refactored.

Other commits that may have impacted performance:

- http://github.com/taf2/curb/commit/35cb65b03f7b386fe6c7f7068b7ac7622c6e1820
- http://github.com/taf2/curb/commit/6af516cb31909610f4e6b9f4302aeb49bfaeed49
- http://github.com/taf2/curb/commit/d0944bca9a573fa8bbe2f7691d421516738a25bb (this one maybe, if you're not running the loop through a multi handle and not running multiple threads)
Nope, must be something else - throughput stayed the same after applying the timeout patch.
Ok, ran bisect on it, and it's definitely the "cleanup timeout commit":
4bdc81576ae7483f130a4eec17c73373927614fa is first bad commit
commit 4bdc81576ae7483f130a4eec17c73373927614fa
Author: Todd Fisher
Date:   Sun Jul 26 16:31:52 2009 -0400

    cleanup timeout code in multi select loop

:040000 040000 449d377da6cad320dca006253e2318ffb5e792e7 a7b14dc5524c6dfb6577df3214aabbb97ce69665 M ext
bisect run:
git bisect start
# bad: [ac0b465392429caaaec7ca2eadf98fe2c05603cb] ready for release
git bisect bad ac0b465392429caaaec7ca2eadf98fe2c05603cb
# good: [6c133998302ea9ad0c5146880cbd7fa8585edf6e] push out new gem release
git bisect good 6c133998302ea9ad0c5146880cbd7fa8585edf6e
# bad: [1745d46fcf1e5c55276be558b64ed0644ed612a2] compat with non GNU strip binaries
git bisect bad 1745d46fcf1e5c55276be558b64ed0644ed612a2
# bad: [a46607c037576bcca3c782ee32159f9dd46f5e10] update other request types to yield once
git bisect bad a46607c037576bcca3c782ee32159f9dd46f5e10
# bad: [4bdc81576ae7483f130a4eec17c73373927614fa] cleanup timeout code in multi select loop
git bisect bad 4bdc81576ae7483f130a4eec17c73373927614fa
# good: [c351e53e30eff1eae119fbd79d54b37da64930cc] make sure to set CURLOPT_NOBODY as reported by copiousfreetime
git bisect good c351e53e30eff1eae119fbd79d54b37da64930cc
# good: [ab433db38df1e86cfc0261796a96fe1cb3cde209] add another test case for 500 responses
git bisect good ab433db38df1e86cfc0261796a96fe1cb3cde209
# good: [4233fcc6f65f1107693014038da9320cbdee5c5e] set upload object to Qnil on cleanup
git bisect good 4233fcc6f65f1107693014038da9320cbdee5c5e
# good: [50bdc2beed1c89c903145c05cc00344b9a7ac9da] add inspect method
git bisect good 50bdc2beed1c89c903145c05cc00344b9a7ac9da
The latest patches did not fix it (tried setting default_timeout to 1ms; same results).
I know it's been a while, but wondering if you can try with this last commit. http://github.com/taf2/curb/commit/f001b0ead7d7df1f2289087e8527d4d1183a4b6d
taf2, I have made some changes to my fork for Curl::Multi. These changes have made the tests run significantly faster: all the unit tests now take about 2-3 seconds, where they used to take 8-9 seconds.
You can have a look at this patch.
http://github.com/ramsingla/curb/commit/d56853bd435c71f4a68a1557bf57d3a1b96a5698
There are a couple of things regarding the timeout that I changed which could be kept the same. I am sending you a pull request.
Thanks! I'll review tonight
Also, this patch addresses the issue with dynamically adding more requests to the same multi handle from on_complete callbacks. It was often observed that such requests were not performed. This is important in applications where some serialization is required among requests; in those cases, perform had to be called again for the new requests, which IMHO is not desired behavior.
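The behavior described here can be illustrated with a toy model (plain Ruby with invented names, not curb's actual C loop): if perform drains only the batch of handles present when it was called, anything queued from a completion callback is silently left behind, and a do/while-style wrapper is what picks it up.

```ruby
# Toy model of a multi loop; NOT curb's implementation.
class ToyMulti
  def initialize
    @queue = []
  end

  def add(job)
    @queue << job
  end

  # Old behavior: snapshot the current batch and run only that, so jobs
  # added by an on_complete callback stay queued until perform is
  # invoked again by hand.
  def perform_once
    batch, @queue = @queue, []
    batch.each(&:call)
    batch.size
  end

  # Patched behavior: a do/while keeps draining until a full pass
  # leaves the queue empty, so dynamically added jobs run too.
  def perform
    total = 0
    begin
      total += perform_once
    end while !@queue.empty?
    total
  end
end
```

For example, a job whose callback queues a follow-up job runs both jobs under perform, but only the first under a single perform_once pass.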
Okay, after applying the patch my tests went from taking 4.2 seconds to 5.2 seconds... not sure test speed is a good measure in this case... but what I do see is that we're no longer relying on libcurl to tell us how long to wait, and that makes the loop much simpler. I like the additional do/while loop to make the calling Ruby code simpler; I imagine that even remains backwards compatible with the previous behavior. I made one minor formatting change, and will spin a new release with this code. Maybe igrigorik could run his tests again to see if this improves performance?
Yes, you are right. I was experimenting first with the timeout, but the real change was the additional do/while loop and making rb_curl_multi_run not read on every call. Regarding performance, though, it might be better if the libcurl timeout can be brought back. On Mac OS X I saw a 3- to 4-fold improvement, but that might be because curl on my system did not support the libcurl timeout.
Hmm, nope :-(
@igrigorik: is there any improvement? From the charts it seems CPU usage has gone down and throughput is slightly better than before. Are you using a libcurl version older than 7.19.x? @taf2: the libcurl timeout could make the difference. Was it there in 0.4.6.0?
curl 7.19.4 (x86_64-unknown-linux-gnu) libcurl/7.19.4 OpenSSL/0.9.8k zlib/1.2.3 c-ares/1.6.0
Protocols: tftp ftp telnet dict http file https ftps
Features: AsynchDNS Largefile NTLM SSL libz
The new gem is the valley in that chart... the download speed drops to about half.
igrigorik, I created this script to identify some perf issues: http://gist.github.com/446279. On my Mac, I found that OpenSSL causes creation of Curl::Easy handles to take a very long time; the offending call is in curl_easy_init. Running that script will identify the issue... I don't think you have that problem, but maybe check?
Okay, still sort of shooting in the dark, but this commit gave a slight improvement in test run time for me... http://github.com/taf2/curb/commit/620ca1669ad41c3ad9e27434788e1c84740655f7 shaved a second off...
I added some benchmarks based on the persistent connection tests brought to attention by drbrain - should I add em-http to that as well?
An interesting thing about the chart from the new version is that the CPU utilization decreases. I will have a close look at the 0.4.6 code. My only guess is that a Curl::Easy instance in the newer code uses Curl::Multi. Could that make the difference? @igrigorik, are you using Curl::Multi or Curl::Easy?
@taf2: please look at the difference in the multi perform function between the two versions: http://gist.github.com/446415. IMHO, igrigorik, you might be adding more handles to the multi using a block passed to the perform call. The block is called on every iteration in the old version, but in the newer version only if rb_thread_select returns 0. So it might be that more easy handles are being added on the fly to the same multi handle by igrigorik's script, and that I can relate to the CPU usage difference. Could this be the source of the performance slowdown?
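A throwaway loop (plain Ruby with invented names, not curb's code) makes the suspected difference concrete: if the perform block is the only thing feeding new handles in, calling it only when select returns 0 admits far less work than calling it every iteration.

```ruby
# Contrast the two yield policies described above. "Admitting a handle"
# is modeled as a counter bump inside the loop.
def handles_admitted(iterations, select_zero_every:, policy:)
  added = 0
  iterations.times do |i|
    select_returned_zero = (i % select_zero_every).zero?
    case policy
    when :every_iteration       # old 0.4.x loop: block runs each pass
      added += 1
    when :only_on_select_zero   # newer loop: block runs on timeout only
      added += 1 if select_returned_zero
    end
  end
  added
end
```

Over 100 iterations where select times out every 10th pass, the old policy admits 100 handles and the new one only 10, which would match the lower CPU and lower throughput seen in the charts.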
Hey guys, sorry, didn't get a chance to test this yet... but I like your last theory; it makes sense conceptually.
igrigorik, if this makes sense conceptually, then it might be that you need to alter your code somehow to work better with the newer version, which is more efficient (ideally the block should be called only if select returns 0). How many URLs are you adding to Curl::Multi in one go? Can you double the number and see whether performance with the newer version is better or worse? Also, could you add the new URLs from on_complete callbacks?
Let's try again with this change, maybe? I remember removing the yield, believing it made sense to focus the loop on immediate network activity - but favoring rb_thread_select over code in the perform block might have been the wrong decision... if there is code in the perform block, it should probably run in favor of other Ruby threads... Here's the commit: http://github.com/taf2/curb/commit/ac07429c7a043f43b2e058a4b5be9f9559eaac42
Awesome. Ram, your theory is right on the money. With the latest patch the perf is definitely back into the same +/-10% range. I also converted the code to queue up the next easy handle in the on_success handler instead of the perform block, and that seemed to help a bit as well. Everything combined, we're back to the same throughput numbers. Incidentally, the CPU usage has remained the same, no changes.
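The pattern of queueing the next handle from the completion callback rather than from the perform block can be sketched in plain Ruby (a simulation with invented names, not the actual script): keep a fixed window of requests in flight and top it up each time a transfer finishes.

```ruby
# Simulated fixed-window pipeline: `window` stands in for the number of
# concurrent easy handles; refilling on completion mirrors queueing the
# next handle from an on_success callback.
def run_window(urls, window)
  pending   = urls.dup
  in_flight = []
  completed = []
  refill = lambda do
    in_flight << pending.shift while in_flight.size < window && !pending.empty?
  end
  refill.call
  until in_flight.empty?
    completed << in_flight.shift  # stand-in for a finished transfer
    refill.call                   # on_success: queue the next request
  end
  completed
end
```

This keeps the multi handle saturated without relying on the perform block being yielded on every loop iteration.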
I think we can finally close this guy! Although.. I do wonder how the docs can/should be updated to reflect this learning, as I'm sure others may run into the same issue (even worse, without even knowing it).
Alright, I updated the README files. Closing this one out. I'll work on a new release soon.
Cheers, finally we got this thing right. igrigorik, if only the on_success callback is used to add more to the queue, then it should be low on CPU (when one job finishes, add another). Otherwise, with the latest patch you can yield aggressively by providing a block to the perform call (I think this should be high on CPU).
Running 0.4.6.0; upgrading to 0.6.x shows almost a 2x throughput drop, all else being equal. Lower CPU usage, lower network throughput.
ex: http://skitch.com/igrigorik/nuyjw/network-report