jpatokal / mediawiki-gateway

Ruby framework for MediaWiki API manipulation

uses uncachable POST instead of cachable GET #24

cariaso closed 12 years ago

cariaso commented 13 years ago

Bots using this library hit my site much harder than their Perl or Python equivalents, because the library uses POST for requests that could be made with GET.

jpatokal commented 13 years ago

Interesting. I was under the impression that POST was required, but looking at the current documentation that doesn't seem to be the case, although many request types (login, edit, etc) do require POST. Will try to find the list and switch at least query operations to GET.

cariaso commented 13 years ago

Thanks for looking into it.

My site is http://www.snpedia.com and we get a LOT of bot traffic. Varnish caching saves the day, but it can only help with GETs, not POSTs.

http://pywikipediabot.sourceforge.net/ does GETs naturally.

http://search.cpan.org/~lifeguard/MediaWiki-Bot-v3.4.2/lib/MediaWiki/Bot.pm and http://search.cpan.org/~exobuzz/MediaWiki-API-0.27/lib/MediaWiki/API.pm have both been changed to use GET where possible.

I've started to see incoming traffic from your library, and figure the sooner this change gets out there, the sooner many MediaWiki sites will benefit.

arjes commented 13 years ago

I assume GET requests should be used where they logically fit, i.e. for information retrieval, and POST requests for login and editing.

I will see if I can make the necessary changes this week, but if you have a list it would make it a lot easier.

cariaso commented 13 years ago

Jools Wills seems to be well informed on this issue, as he did the fix for the Perl libraries: http://groups.google.com/group/perlwikibot/browse_thread/thread/3e0ea16509c57661/ec1387d33d3ccbff. His contact info is accessible from http://groups.google.com/groups/profile?enc_user=j95MUhMAAABtmWaKC64Pn-5PeJoqSokKhpCwAXzNFd5eTDaGmvSpDA

jpatokal commented 13 years ago

Simply using GET for all action=query requests and POST for everything else should solve 95% of the issue. I'm traveling at the moment, but will try to take a crack at this next week if you don't beat me to it.
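A minimal sketch of that rule, using only Ruby's standard library so it runs without extra gems. The helper names `http_method_for` and `build_request` and the example.org endpoint are hypothetical, chosen for illustration; they are not part of mediawiki-gateway. The requests are built but never sent.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: pick the HTTP verb from the API "action" parameter.
# Only action=query is treated as read-only; everything else stays on POST.
def http_method_for(params)
  params['action'].to_s == 'query' ? :get : :post
end

# Hypothetical helper: build (but do not send) a Net::HTTP request object.
def build_request(api_url, params)
  uri = URI.parse(api_url)
  if http_method_for(params) == :get
    # GET: parameters go in the query string, so a cache can key on the URL.
    uri.query = URI.encode_www_form(params)
    Net::HTTP::Get.new(uri)
  else
    # POST: parameters go in a form-encoded request body.
    request = Net::HTTP::Post.new(uri)
    request.set_form_data(params)
    request
  end
end

# Placeholder endpoint, not a real wiki.
query = build_request('http://example.org/w/api.php',
                      'action' => 'query', 'titles' => 'Main Page', 'format' => 'xml')
edit  = build_request('http://example.org/w/api.php',
                      'action' => 'edit', 'title' => 'Sandbox', 'text' => 'hello')
```

Keeping the decision in one small method means the set of GET-safe actions could be widened later without touching the request-building code.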

arjes commented 13 years ago

I was working on it last night in bed but couldn't get the specs to run at all. I had to recompile Ruby 1.9.2, but I will try again tonight.

I was planning on inspecting the action parameter as well.

However, I don't see an open-ended method in RestClient where the request type can be specified. http://rdoc.info/github/archiloque/rest-client/master/RestClient

The get and post methods are just wrappers for Request.execute(:method => method, :url => url, :headers => headers, &block), so it may be easier to just use that instead of RestClient.get/post.
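A rough sketch of what that could look like. The helper `execute_options` and the example.org URL are hypothetical and not taken from mediawiki-gateway; the sketch only builds the options hash, so it runs without the rest-client gem installed. It assumes rest-client's convention of reading GET query parameters from a `:params` entry inside `:headers`, and form data from `:payload`.

```ruby
# Hypothetical helper: build the options hash for RestClient::Request.execute,
# choosing the method from the API "action" parameter.
def execute_options(url, params, headers = {})
  if params['action'].to_s == 'query'
    # GET: rest-client turns the :params entry in :headers into a query string.
    { :method => :get, :url => url, :headers => headers.merge(:params => params) }
  else
    # POST: the parameters are sent as the form-encoded payload.
    { :method => :post, :url => url, :payload => params, :headers => headers }
  end
end

# Placeholder endpoint, not a real wiki.
get_opts  = execute_options('http://example.org/w/api.php',
                            'action' => 'query', 'titles' => 'Main Page')
post_opts = execute_options('http://example.org/w/api.php',
                            'action' => 'login', 'lgname' => 'bot')

# With the rest-client gem loaded, the request would then be sent with:
#   response = RestClient::Request.execute(get_opts)
```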

Just sharing what I have looked at. Again, I will try again tonight if I can get the specs running.

cariaso commented 12 years ago

Sadly, this issue remains unresolved. If anyone does address it, it might be wise to consider a related issue illustrated in https://rt.cpan.org/Public/Bug/Display.html?id=75296

Semantic MediaWiki now exposes two new actions in api.php named 'ask' and 'askargs'. It would be good if they could also be accessed via GET instead of POST, and perhaps the fix for this issue could anticipate that.

jpatokal commented 12 years ago

It's been a long time coming, but this should finally be fixed in 0.5.0: all action=query requests now use GET.

Re: those new Semantic Mediawiki actions, they're not supported by mediawiki-gateway at the moment, so the flavor of request to use is a bit of a moot issue.