Bot: check user permissions

kynikos commented 11 years ago

The bot should show a proper warning when attempting to edit a protected article without having the rights to do so (MediaWiki will return a proper error), and, most importantly, it should bypass it and continue processing the list of articles.

kynikos commented 11 years ago

See also #139

lahwaacz commented 10 years ago

Here is my unsuccessful attempt to make a working solution (did not make a pull request because it does not work, but the patch should be complete):

diff --git a/src/modules/Bot.js b/src/modules/Bot.js
index 903ba35..06dc383 100644
--- a/src/modules/Bot.js
+++ b/src/modules/Bot.js
@@ -303,8 +303,9 @@ WM.Bot = new function () {
         if (link.className.split(" ").indexOf("new") < 0) {
             var title = link.title;
             var duplicates = document.getElementById('WikiMonkeyBotDuplicates').checked;
+            var protection = WM.MW.isProtectedPage(title);

-            if (duplicates || WM.Bot.selections.visited.indexOf(title) == -1) {
+            if (protection == false && (duplicates || WM.Bot.selections.visited.indexOf(title) == -1)) {
                 WM.Bot.selections.visited.push(title);
                 var inverse = document.getElementById('WikiMonkeyBotInverse').checked;

diff --git a/src/modules/MW.js b/src/modules/MW.js
index 98c8e4f..7438145 100644
--- a/src/modules/MW.js
+++ b/src/modules/MW.js
@@ -358,6 +358,38 @@ WM.MW = new function () {
         return false;
     };

+    this.isProtectedPage = function (title) {
+        var groups = userInfo.query.userinfo.groups;
+        var protection = false;
+        var query = {prop: "info",
+                     inprop: "protection",
+                     titles: title};
+        var call = function (res, arg) {
+//            WM.Log.logInfo("prot_list type: " + typeof prot_list);
+//            WM.Log.logInfo("keys: " + Alib.Obj.getKeys(res.protection));
+//            WM.Log.logInfo("values: " + Alib.Obj.getValues(res.protection));
+            var protections = Alib.Obj.getValues(res.protection);
+            for (var prot in protections) {
+                prot = protections[prot]
+//                WM.Log.logInfo("type: " + prot.type);
+//                WM.Log.logInfo("level: " + prot.level);
+                if (prot.type == "edit" && groups.indexOf(prot.level) < 0) {
+                    WM.Log.logInfo("TRUE");
+                    protection = true;
+                    break;
+                }
+            }
+        }
+//        WM.Log.logInfo("Querying protection level for page '" + title + "'");
+//        WM.Log.logInfo("query string: '?format=json" + joinParams(query) + "'");
+        this.callQuery(query,
+                       call,
+                       null);
+//        WM.Log.logInfo("groups: " + groups);
+//        WM.Log.logInfo("protection: " + protection);
+        return protection
+    }
+
     this.getBacklinks = function (bltitle, blnamespace, call, callArgs) {
         var query = {action: "query",
                      list: "backlinks",

The problem is probably simple, but my knowledge of javascript is fairly limited (plus it has been a long day): the variable protection stays false...

I don't really understand the black magic around callAPIGet, callQuery and their callbacks, I spent most of the time trying to understand this :) so I will appreciate any pointers. Actually, TRUE is printed always the last, is the callback function called asynchronously or is it just the logging?

I hope this is at least helpful, it's currently the most annoying bug for me...

kynikos commented 10 years ago

It's fixed in 1.14.5, but with a try-catch approach, thus avoiding to add an extra query per processed page. Thank you so much for trying to fix it anyway, you've even gone to look into the Alib library! Actually many of the algorithms in WM must be designed asynchronously: every time you use a function that has to query the server you cannot return a value directly, but you have to pass the querying function a callback function, which will be called once the server has answered; the values you'd like to "return" will be passed as arguments for the callback function. This is for example why you won't see many loops in WM's code, as they must all be turned into recursions :) A simple example is in https://github.com/kynikos/wiki-monkey/blob/master/src/plugins/SimpleReplace.js#L128

    this.mainAuto = function (args, title, callBot, chainArgs) {
        var id = args[0];

        WM.MW.callQueryEdit(title,
                            WM.Plugins.SimpleReplace.mainAutoWrite,
                            [id, callBot]);
    };

    this.mainAutoWrite = function (title, source, timestamp, edittoken, args) {

WM.MW.callQueryEdit accepts (title, call, callArgs) and the callback function WM.Plugins.SimpleReplace.mainAutoWrite must match the arguments with which WM.MW.callQueryEdit calls the callback function, i.e. (title, source, timestamp, edittoken, args). In synchronous programming, you may see all this as:

    this.mainAuto = function (args, title, callBot, chainArgs) {
        var id = args[0];
        var title, source, timestamp, edittoken;

        // WM.MW.callQueryEdit would need only 'title', and return all values except 'args'
        // (This variable assignment method is supported only by Firefox, IIRC)
        [title, source, timestamp, edittoken] = WM.MW.callQueryEdit(title);
        WM.Plugins.SimpleReplace.mainAutoWrite(title, source, timestamp, edittoken, [id, callBot]);
    };

    this.mainAutoWrite = function (title, source, timestamp, edittoken, args) {

Finally, note that WM.MW.callQueryEdit doesn't call the passed callback function immediately, but it calls WM.MW.callQuery, which in turn calls WM.MW.callAPIGet, which in turn does the actual request, which eventually triggers the chain of all the callback functions. I know, it looks very convoluted at first, that's why programming asynchronously is avoided when possible ;) Don't hesitate to ask more questions (here or somewhere else), because if you'll want to try to fix some bugs in the future I'll be very glad to accept patches and pull requests!

lahwaacz commented 10 years ago

I see, with my approach it would be necessary to block the execution of the bot and pass some unblock function to the callback. The try-catch approach is much more elegant :)

Thanks for the lecture (and fixing the bug of course), sometimes I'll open a new thread for further questions. Now I need to figure out how to best debug userscripts, relying on the output in Wiki Monkey's log window is not a good idea...

kynikos commented 10 years ago

Um... the block/unblock observation doesn't make much sense ^^ All WM's API functions for querying the server are asynchronous, so they'll never "return" anything, and they'll never block the browser. You can indeed make synchronous requests, and they do freeze the browser until the server has answered, which is something you don't want. You could have made an asynchronous isProtectedPage function, but you couldn't have used it like var protection = WM.MW.isProtectedPage(title);, you should have split the function that calls it into two functions: the first part would end with something like WM.MW.isProtectedPage(title, call, callArgs); and the second function would be defined as var this.continuedProcess = function (returnedArgsFromIsProtectedPage, callArgs) {.

About debugging, I may work on a wiki page for that, however you can log debug messages to WM's log window through WM.Log.logDebug('My message'); (they are highlighted in light blue, so they're easy to distinguish from the normal messages). Then you have your browser's Javascript console for JS errors, and also for printing messages with console.log(), although I don't remember if it's supported by all browsers. If you want to introspect JSON-compatible objects, you can use JSON.stringify(). Sometimes using escape() can be useful too, and so on...

If you're working with the standalone version, you can do some changes directly there and reinstall the userscript, refreshing the page for any changes to take effect. I do the main work on the basic non-standalone version, and I use a special configuration file, https://github.com/kynikos/wiki-monkey/blob/develop/src/files/WikiMonkey-local.user.js , that reads the files locally, so once I've saved some changes I just reinstall WM, refresh the browser and see the results; no need to upload, push or commit anything.

lahwaacz commented 10 years ago

Hmm, splitting the function would mean practically rewriting the Bot.js module - the implemented solution now seems even more elegant :)

Somehow I was not able to make the local configuration work in Chromium/Tampermonkey, though the same configuration works absolutely fine in Firefox/Greasemonkey. As a side effect I found that WM is MUCH faster when running under GM (lower CPU usage and faster queries, also with TM it takes 3-5 seconds after the page is loaded until WM interface is shown, in GM it's <1 sec), plus there are some bugs in TM that make me want to not use it. I've been grumbling at Firefox for a long time, so perhaps it's time for a change once again...

P.S.: there are too many monkeys in this post :P

kynikos commented 10 years ago

Yeah, now that I think of it, the last time I tried to make Chromium read local files I just had to give up, but that was some time ago and I didn't know if the situation had improved or not... Now you've given me an answer ^^ Thanks for giving me some benchmarking info, it's been a while since I did some comparison tests... You may have fun also comparing GM with Scriptish (I've used the latter for a couple of years now, but GM may have caught up in the meantime). In general, Firefox gives me the feeling of being a little slower than Chromium per se, but I still prefer the former because, like in this very case, I find its extensions are in general better developed than Chromium's counterparts (when suitable counterparts are available at all), and I love customizing my browser ;)

EDIT: when mentioning debugging tools above, I forgot the obvious, ever-working alert(), especially useful when you want to break the execution of your code; since you said you're not well experienced with JS I thought it was worth to mention it explicitly.

lahwaacz commented 10 years ago

Thanks for the tips, Scriptish is really more advanced (though the difference is much lesser than between TM and GM). You mentioned that you need to reinstall the userscript each time you make some change, so you might be interested in my scriptish-local-hack.sh. Now I just need to reload the page ;)

kynikos commented 10 years ago

Eheh thank you for the script! However you may add a note in the header reminding that if you add/rename some files you do have to reinstall WM and re-run the script; then you should also remark that it will leave broken links if you rename/delete some files, unless you want to add a cleanup function too. Then even if you add link_from_path "$git_root/src/files it doesn't link to the local configuration file, because Scriptish renames it further than just making it lowercase:

skipping /home/dario/Archive/Development/wiki-monkey/src/files/WikiMonkey-local.user.js

And

There are some files that were not re-linked:
wiki-monkey-local.user.js

The script can be handy, you may want to rename it to something like "scriptish-dynamic-local-cache.sh" (I prefer descriptive names) and then request to pull it as well. Personally though, since the aforementioned synchronization issues may lead to sneaky bugs, I'll probably stick to my method: when developing WM I always keep the local config file open in a browser tab, so when I want to reinstall it I just refresh its tab (instantaneous because it's a local resource), click on Reinstall (or Tab, Enter) and reload the wiki testing article:)

kynikos / wiki-monkey

Bot: check user permissions #135