jpatokal / mediawiki-gateway

Ruby framework for MediaWiki API manipulation

Maxlag errors are not retried #22

Open jmstacey opened 13 years ago

jmstacey commented 13 years ago

It appears that, contrary to the Wikipedia API documentation, a 503 status code is not returned when the lag exceeds the given maxlag threshold. Instead, a status code of 200 is returned, so the maxlag error is caught in the get_response() method. This raises an APIError, which terminates execution. Maxlag failures should be retried following the same procedure as for a 503 code (i.e. sleep for retry_delay, up to retry_count times).

Since this error is caught in get_response() instead of make_api_request(), the solution is a bit tricky to handle gracefully.

jpatokal commented 13 years ago

Sounds like a bug in MW if you ask me. Which version of MediaWiki is this happening in, and do you have a sample of an API response with a maxlog warning?

jmstacey commented 13 years ago

I'm seeing this against the English Wikipedia site [http://en.wikipedia.org/w/api.php], so it is whatever version their servers are running. The following query seems to indicate that it is "MediaWiki 1.17wmf1":

http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo

jmstacey commented 13 years ago

The documentation on maxlag is at http://www.mediawiki.org/wiki/Manual:Maxlag_parameter and says that a 503 status code should be returned as of version 1.10, but I'm not seeing that right now. According to that documentation, we can force a refused request by setting maxlag to a negative number.
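
A minimal sketch of building such a request URL, assuming the siteinfo query from earlier in the thread and `format=xml`. The helper name `maxlag_test_url` is hypothetical and not part of the gateway; the block only constructs the URL and makes no network call.

```ruby
require 'uri'

# Hypothetical helper: build an API URL with a negative maxlag, which the
# Manual:Maxlag_parameter page says should force a refused request.
def maxlag_test_url(base = 'http://en.wikipedia.org/w/api.php')
  uri = URI.parse(base)
  uri.query = URI.encode_www_form(
    action: 'query',
    meta:   'siteinfo',
    format: 'xml',
    maxlag: -1 # negative value, per the maxlag documentation
  )
  uri.to_s
end

puts maxlag_test_url
```

Fetching the printed URL and checking both the HTTP status and the response body would show whether the server answers with a 503 or with a 200 carrying a maxlag error.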

This is an example of what the response might look like in such a situation:

    Jon-Staceys-iMac:wikipedia_misspelling_sampler jon$ ruby sampler.rb
    W, [2011-09-25T10:47:29.126224 #89538] WARN -- : Maxlag over threshold: <?xml version="1.0"?><api servedby="srv298"><error code="maxlag" info="Waiting for 10.0.6.42: 0 seconds lagged" /></api>. Retry in 10 seconds.
    W, [2011-09-25T10:47:39.433099 #89538] WARN -- : Maxlag over threshold: <?xml version="1.0"?><api servedby="srv255"><error code="maxlag" info="Waiting for 10.0.6.42: 0 seconds lagged" /></api>. Retry in 10 seconds.

The above example is working because I moved the following block of code from get_response() up to the make_api_request() RestClient.post block (I can issue a pull request if you want once I get the code pushed back to GitHub):

    if doc.elements["error"]
      code = doc.elements["error"].attributes["code"]
      info = doc.elements["error"].attributes["info"]
      if code == 'maxlag' and retry_count < @options[:retry_count]
        log.warn("Maxlag over threshold: #{response.body}.  Retry in #{@options[:retry_delay]} seconds.")
        sleep @options[:retry_delay]
        make_api_request(form_data, continue_xpath, retry_count + 1)
      end
      raise APIError.new(code, info)
    end
    if doc.elements["warnings"]
      warning("API warning: #{doc.elements["warnings"].children.map {|e| e.text}.join(", ")}")
    end

This solution isn't particularly elegant, but it gets the job done for the time being. Below is an example of what happened when I got a maxlag error before I modified the code. The 200 is the output of response.code, printed at the start of the RestClient.post block in make_api_request().

    Jon-Staceys-iMac:wikipedia_misspelling_sampler jon$ ruby sampler.rb
    200
    /Users/jon/Documents/School/Database Organization/Paper #3 - Wikipedia Project/Research/wikipedia_misspelling_sampler/mediawiki-gateway/lib/media_wiki/gateway.rb:671:in `get_response': API error: code 'maxlag', info 'Waiting for 10.0.6.36: 12 seconds lagged' (MediaWiki::APIError)
        from /Users/jon/Documents/School/Database Organization/Paper #3 - Wikipedia Project/Research/wikipedia_misspelling_sampler/mediawiki-gateway/lib/media_wiki/gateway.rb:640:in `block in make_api_request'
        from /Users/jon/.rvm/gems/ruby-1.9.2-p180/gems/rest-client-1.6.7/lib/restclient/request.rb:228:in `call'
        from /Users/jon/.rvm/gems/ruby-1.9.2-p180/gems/rest-client-1.6.7/lib/restclient/request.rb:228:in `process_result'
        from /Users/jon/.rvm/gems/ruby-1.9.2-p180/gems/rest-client-1.6.7/lib/restclient/request.rb:178:in `block in transmit'
        from /Users/jon/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/net/http.rb:627:in `start'
        from /Users/jon/.rvm/gems/ruby-1.9.2-p180/gems/rest-client-1.6.7/lib/restclient/request.rb:172:in `transmit'
        from /Users/jon/.rvm/gems/ruby-1.9.2-p180/gems/rest-client-1.6.7/lib/restclient/request.rb:64:in `execute'
        from /Users/jon/.rvm/gems/ruby-1.9.2-p180/gems/rest-client-1.6.7/lib/restclient/request.rb:33:in `execute'
        from /Users/jon/.rvm/gems/ruby-1.9.2-p180/gems/rest-client-1.6.7/lib/restclient.rb:72:in `post'

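
The APIError in the trace above originates from an `<error code="maxlag">` element inside a 200 response. As a rough sketch of how such a body could be checked before deciding to retry, the following parses a sample response with REXML. It assumes the gateway's `doc` is the `<api>` root element, which the posted snippet suggests but does not confirm; `api_error_from` is a hypothetical helper, not a gateway method.

```ruby
require 'rexml/document'

# Hypothetical helper: return [code, info] when the response body contains
# an <error> element directly under the root, or nil when it does not.
def api_error_from(body)
  root = REXML::Document.new(body).root
  error = root && root.elements['error']
  return nil unless error
  [error.attributes['code'], error.attributes['info']]
end

# Sample body copied from the log output earlier in this thread.
body = '<?xml version="1.0"?><api servedby="srv298">' \
       '<error code="maxlag" info="Waiting for 10.0.6.42: 0 seconds lagged" /></api>'

code, info = api_error_from(body)
puts "retryable: #{code == 'maxlag'} (#{info})"
```
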
jpatokal commented 13 years ago

Thanks! Submitted as a MediaWiki bug report; let's see how they respond. If this new behavior is intentional, it'll be time to change the gateway to suit.

https://bugzilla.wikimedia.org/show_bug.cgi?id=31156

jpatokal commented 13 years ago

So apparently returning a 200 OK is the new normal. X( Send me a pull request, and I'll merge it in.

jmstacey commented 13 years ago

Well, my fix doesn't seem to be working at the moment. The problem seems to be that after the retried call succeeds, control eventually returns to the original call, which then carries on. I need a way of calling the method and then "forgetting" about coming back to the current stack frame. I'll have to sit down and look at this at some point, but thought I'd share in case someone brighter than me has a graceful solution, perhaps one that makes better use of Ruby's exception facilities.
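
One possible shape for an exception-based approach, offered as a sketch rather than as the gateway's actual fix: let the maxlag error be raised as usual, then rescue it in a wrapper and use Ruby's `retry` keyword to re-run the request. The `APIError` class below is a stand-in for `MediaWiki::APIError`, and `with_maxlag_retry` and its keyword arguments are hypothetical names chosen to mirror the `retry_count` and `retry_delay` options mentioned above.

```ruby
# Stand-in for MediaWiki::APIError so the sketch is self-contained.
class APIError < StandardError
  attr_reader :code, :info

  def initialize(code, info)
    @code = code
    @info = info
    super("API error: code '#{code}', info '#{info}'")
  end
end

# Hypothetical wrapper: run the block, retrying while it raises a maxlag
# APIError. The block stands in for one request-and-parse cycle.
def with_maxlag_retry(retry_count: 3, retry_delay: 10)
  attempts = 0
  begin
    yield
  rescue APIError => e
    # Non-maxlag errors, or maxlag after too many attempts, propagate.
    raise unless e.code == 'maxlag' && attempts < retry_count
    attempts += 1
    warn "Maxlag over threshold: #{e.info}. Retry in #{retry_delay} seconds."
    sleep retry_delay
    retry # re-runs the begin block from the top
  end
end

# Usage: fail twice with maxlag, then succeed on the third call.
calls = 0
result = with_maxlag_retry(retry_count: 3, retry_delay: 0) do
  calls += 1
  raise APIError.new('maxlag', 'Waiting for 10.0.6.42: 0 seconds lagged') if calls < 3
  :ok
end
puts "#{result} after #{calls} calls"
```

Because `retry` restarts the `begin` block instead of nesting another call, there is no outer frame left to fall through to `raise` once the request succeeds. In the snippet posted earlier, the smallest change would likely be to `return` the value of the recursive `make_api_request` call so that execution does not continue to `raise APIError.new(code, info)`.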