Worth trying - especially with authentication included.
After a couple of hours of trying, all I got was a saving of about two minutes - 18 minutes instead of 20. I also had to move to Ruby 2.0.0 for that. I believe the real solution would be to refactor the code to fetch data in parallel, using something like this:
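(A rough sketch with plain Ruby threads - the URLs and structure here are placeholders, not the project's actual code.)

```ruby
require 'net/http'
require 'uri'
require 'json'

# Placeholder URLs; the real code would build the request list from the gem data.
urls = [
  'https://rubygems.org/api/v1/gems/bio.json',
  'https://rubygems.org/api/v1/gems/bio-samtools.json'
]

# One thread per request, so the HTTP round trips overlap
# instead of running one after another.
threads = urls.map do |url|
  Thread.new do
    uri = URI(url)
    Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
      JSON.parse(http.get(uri.request_uri).body)
    end
  end
end

results = threads.map(&:value) # join all threads and collect their results
```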
Maybe. It appears to me that the rubygems and GitHub APIs are just slow, most likely on purpose. If they are throttling, fetching in parallel won't help. I don't think our code is terribly slow, but only a profiler can show that.
I think the way forward is to cache all items. For example, stargazers only need to be updated every other day, issues daily - we don't have to update everything every time. I'll probably implement that next time round.
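A rough sketch of the idea, assuming a cache/ directory and made-up item names and intervals:

```ruby
require 'json'

# Assumed maximum ages per item type before a refresh is needed.
MAX_AGE = {
  'stargazers' => 2 * 24 * 3600, # every other day
  'issues'     => 24 * 3600      # daily
}

# Re-fetch an item only when its cached copy is older than allowed;
# otherwise serve it straight from disk.
def fetch_cached(item, name)
  path = "cache/#{item}-#{name}.json"
  if File.exist?(path) && Time.now - File.mtime(path) < MAX_AGE[item]
    JSON.parse(File.read(path))
  else
    data = yield # the block does the actual (slow) API call
    File.write(path, JSON.generate(data))
    data
  end
end

# Hypothetical usage:
# stargazers = fetch_cached('stargazers', 'bio') { github_stargazers('bio') }
```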
I don't think you need a profiler for that. See this output from the time command while running "./create_website.sh":
```
real    20m5.076s
user    1m45.683s
sys     0m32.478s
```
I guess that means the actual processing takes a bit over 2 minutes, and the rest is spent waiting for answers to HTTP GET requests.
Both Ruby code and HTTP waiting happen in user time. I presume it is HTTP, but there is only one way to be sure :)
From what I can tell, the methods in http.rb open a new connection each time they're used. That means, for example, that each of the 100+ requests to the GitHub API opens a fresh connection.
Using some kind of connection cache (probably just a hash of open connections keyed by scheme (http/https) and host name) should speed up site generation considerably.
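Something along these lines, perhaps - a sketch only, not the actual http.rb code; error handling and dropped connections are ignored:

```ruby
require 'net/http'
require 'uri'

# One open connection per scheme/host pair, as suggested above.
CONNECTIONS = {}

def connection_for(uri)
  CONNECTIONS[[uri.scheme, uri.host]] ||= begin
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = (uri.scheme == 'https')
    http.start # open the TCP (and SSL) connection once and keep it alive
    http
  end
end

def get(url)
  uri = URI(url)
  connection_for(uri).get(uri.request_uri) # reuses the open connection
end
```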