Closed desb42 closed 5 years ago
Hey, sorry for the delay on my side. I'm getting close to a launch at work and have been working late.
Unfortunately, I wasn't able to reproduce this. See my server log below.
My only guess is that you might have an older version of luaj_xowa.jar somehow? When you get a chance, try the following:
20190506_033419.120 page.async: url=en.wikipedia.org/wiki/Portal:Arts
20190506_033419.314 download pass: src='https://commons.wikimedia.org/w/api.php?action=query&format=xml&prop=imageinfo&iiprop=size|url&redirects&titles=File:Rembrandt_van_Rijn_-_Self-Portrait_-_Google_Art_Project.jpg' trg='mem/download.tmp'
20190506_033419.330 file.get: file=Rembrandt_van_Rijn_-_Self-Portrait_-_Google_Art_Project.jpg width=120 page=Portal:Arts
20190506_033419.425 download pass: src='https://upload.wikimedia.org/wikipedia/commons/thumb/b/bd/Rembrandt_van_Rijn_-_Self-Portrait_-_Google_Art_Project.jpg/120px-Rembrandt_van_Rijn_-_Self-Portrait_-_Google_Art_Project.jpg' trg='C:\xowa_dev\file\commons.wikimedia.org\thumb\b\d\c\c\Rembrandt_van_Rijn_-_Self-Portrait_-_Google_Art_Project.jpg\120px.jpg'
20190506_033419.555 download pass: src='https://commons.wikimedia.org/w/api.php?action=query&format=xml&prop=imageinfo&iiprop=size|url&redirects&titles=File:Las_Meninas%2C_by_Diego_Vel%C3%A1zquez%2C_from_Prado_in_Google_Earth.jpg' trg='mem/download.tmp'
20190506_033419.569 file.get: file=Las_Meninas,_by_Diego_Velázquez,_from_Prado_in_Google_Earth.jpg width=120 page=Portal:Arts
20190506_033419.594 download pass: src='https://upload.wikimedia.org/wikipedia/commons/thumb/3/31/Las_Meninas%2C_by_Diego_Vel%C3%A1zquez%2C_from_Prado_in_Google_Earth.jpg/120px-Las_Meninas%2C_by_Diego_Vel%C3%A1zquez%2C_from_Prado_in_Google_Earth.jpg' trg='C:\xowa_dev\file\commons.wikimedia.org\thumb\3\1\6\2\Las_Meninas,_by_Diego_Velázquez,_from_Prado_in_Google_Earth.jpg\120px.jpg'
20190506_033419.720 download pass: src='https://commons.wikimedia.org/w/api.php?action=query&format=xml&prop=imageinfo&iiprop=size|url&redirects&titles=File:Louis-Marie_Autissier%2C_Self-portrait_edit.jpg' trg='mem/download.tmp'
20190506_033419.734 file.get: file=Louis-Marie_Autissier,_Self-portrait_edit.jpg width=200 page=Portal:Arts
20190506_033419.760 download pass: src='https://upload.wikimedia.org/wikipedia/commons/thumb/7/7a/Louis-Marie_Autissier%2C_Self-portrait_edit.jpg/200px-Louis-Marie_Autissier%2C_Self-portrait_edit.jpg' trg='C:\xowa_dev\file\commons.wikimedia.org\thumb\7\a\6\0\Louis-Marie_Autissier,_Self-portrait_edit.jpg\200px.jpg'
20190506_033419.901 download pass: src='https://commons.wikimedia.org/w/api.php?action=query&format=xml&prop=imageinfo&iiprop=size|url&redirects&titles=File:Slonimski_Chaim_Zelig.jpg' trg='mem/download.tmp'
20190506_033419.916 file.get: file=Slonimski_Chaim_Zelig.jpg width=100 page=Portal:Arts
20190506_033420.010 download pass: src='https://upload.wikimedia.org/wikipedia/commons/thumb/8/8d/Slonimski_Chaim_Zelig.jpg/80px-Slonimski_Chaim_Zelig.jpg' trg='C:\xowa_dev\file\commons.wikimedia.org\thumb\8\d\c\b\Slonimski_Chaim_Zelig.jpg\80px.jpg'
20190506_033420.169 download pass: src='https://commons.wikimedia.org/w/api.php?action=query&format=xml&prop=imageinfo&iiprop=size|url&redirects&titles=File:Kane_Selfportrait.jpg' trg='mem/download.tmp'
20190506_033420.182 file.get: file=Kane_Selfportrait.jpg width=120 page=Portal:Arts
20190506_033420.205 download pass: src='https://upload.wikimedia.org/wikipedia/commons/thumb/6/67/Kane_Selfportrait.jpg/120px-Kane_Selfportrait.jpg' trg='C:\xowa_dev\file\commons.wikimedia.org\thumb\6\7\3\2\Kane_Selfportrait.jpg\120px.jpg'
20190506_033420.261 file.get: file=U.S._Army_Band_-_A_la_Nanita_Nana_edit.ogg width=220 page=Portal:Arts
20190506_033420.340 download pass: src='https://commons.wikimedia.org/w/api.php?action=query&format=xml&prop=imageinfo&iiprop=size|url&redirects&titles=File:Sergei_Prokofiev_circa_1918_over_Chair_Bain.jpg' trg='mem/download.tmp'
20190506_033420.355 file.get: file=Sergei_Prokofiev_circa_1918_over_Chair_Bain.jpg width=120 page=Portal:Arts
20190506_033420.378 download pass: src='https://upload.wikimedia.org/wikipedia/commons/thumb/0/03/Sergei_Prokofiev_circa_1918_over_Chair_Bain.jpg/120px-Sergei_Prokofiev_circa_1918_over_Chair_Bain.jpg' trg='C:\xowa_dev\file\commons.wikimedia.org\thumb\0\3\3\3\Sergei_Prokofiev_circa_1918_over_Chair_Bain.jpg\120px.jpg'
20190506_033420.438 redlink.redlink_bgn: page=Portal:Arts total_links=200
20190506_033420.485 redlink.redlink_end: redlinks_run=0
20190506_033423.438 page.load: url=en.wikipedia.org/wiki/Special:XowaDefaultTab
20190506_033423.438 page_load: loaded wikitext; page=Special:XowaDefaultTab wikitext_len=0
20190506_033423.458 page.async: url=en.wikipedia.org/wiki/Special:XowaDefaultTab
20190506_033423.458 redlink.redlink_bgn: page=Special:XowaDefaultTab total_links=0
20190506_033423.458 redlink.redlink_end: redlinks_run=0
I have just run another xowa_get_and_make. I note that the timestamp on the bin directory is 28/04/2019 23:36 (dd/mm/yyyy). The luaj_xowa.jar file does contain Match_state.class: I copied the jar file, renamed it as a .zip, and used File Explorer to look inside the zip. I once again get the error. The session log is session2.zip.
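The same jar check can be scripted rather than done by renaming the file to .zip. This is only a minimal sketch: the function name jar_entries_named is made up for illustration, and it matches on the file name alone because the package path of Match_state.class inside the jar is not given in this thread.

```python
import zipfile


def jar_entries_named(jar_path, class_file_name):
    """Return every archive entry whose file name equals class_file_name.

    A .jar is a zip archive, so the standard zipfile module can list it
    directly. Matching only on the last path component is an assumption
    made here because the class's package path is not known.
    """
    with zipfile.ZipFile(jar_path) as jar:
        return [
            name
            for name in jar.namelist()
            if name.rsplit("/", 1)[-1] == class_file_name
        ]
```

Calling jar_entries_named("luaj_xowa.jar", "Match_state.class") from the bin directory would return an empty list if the class is missing, or the full entry path(s) if it is present.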
Note that I get a lot of File:Blank.png entries (I did not see this in your log)
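To compare two session logs on this point, something like the following could tally how often each file is requested. It is a rough sketch that assumes the entries of interest use the same "file.get: file=... width=... page=..." layout as the log lines quoted above; whether the File:Blank.png entries actually appear in that form is not confirmed here. The names FILE_GET and count_file_gets are illustrative only.

```python
import re
from collections import Counter

# Layout taken from the log excerpt above:
#   20190506_033419.330 file.get: file=<name> width=<n> page=<page>
FILE_GET = re.compile(
    r"^\d{8}_\d{6}\.\d{3} file\.get: file=(\S+) width=(\d+) page=(.*)$"
)


def count_file_gets(lines):
    """Count file.get entries per file name; all other lines are ignored."""
    counts = Counter()
    for line in lines:
        match = FILE_GET.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts
```

Running count_file_gets over the lines of each session log and comparing the two counters would show which file names occur in one log but not the other.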
Thanks for the screenshot. I see my mistake. I actually updated my Windows version to be 2019-05. I think my Linux / build version is 2019-03. Let me pull them over tomorrow and see what the problem is.
Thanks.
I also note that MediaWiki have changed the portal pages on enwiki. They do not seem to take anywhere near as much CPU time as when I last looked (#424).
Yeah, these are much quicker in 2019-05 (as I inadvertently discovered above)
It turns out the problem is caused by a missing article from the dump. This is similar to #367.
Specifically, the following wikitext was causing the error:
{{Transclude selected recent additions | %sactor%s | %sart%s | %sarts%s | %scomic | %smuseum%s | %spainting%s | %ssculpture%s | months=12 | header={{Box-header colour|Did you know... }}|max=12}}
This was caused by en.wikipedia.org/wiki/Module:Selected_recent_additions, specifically the following lines:
local title = mw.title.new('Wikipedia:Recent additions' .. subpage)
local raw = title:getContent()
local itemPattern = '%*%s?%.%.%.[%S ]*'
local items = {}
dbg(subpage, raw, itemPattern);
for item in mw.ustring.gmatch(raw, itemPattern) do
The actual subpage was /2019/April
which generated a page of Wikipedia:Recent additions/2019/April
which didn't exist in the 2019-03 dump
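To illustrate the failure mode: in Scribunto, title:getContent() returns nil when the page does not exist, so the gmatch call receives nil instead of a string. The sketch below is a Python approximation, not the module's code and not XOWA's fix. The regex is a hand translation of the Lua pattern '%*%s?%.%.%.[%S ]*' (Lua and Python whitespace classes are not guaranteed to agree on every Unicode character), and extract_items is a hypothetical name.

```python
import re

# Approximate Python equivalent of the Lua pattern '%*%s?%.%.%.[%S ]*':
# a literal '*', an optional whitespace character, a literal '...',
# then any run of non-whitespace characters or spaces (so a match stops
# at the end of the line).
ITEM_PATTERN = re.compile(r"\*\s?\.\.\.[\S ]*")


def extract_items(raw):
    """Return the '* ...' items found in raw page text.

    raw is None when the page is missing (standing in for Lua's nil);
    in that case return an empty list instead of raising an error.
    """
    if raw is None:
        return []
    return ITEM_PATTERN.findall(raw)
```

With this guard, a missing Wikipedia:Recent additions/2019/April page would simply contribute no items rather than aborting the whole transclusion.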
Anyway, thanks for the write-up and sticking through with it above. Will mark closed in a few days unless there are other questions
It sure looks like another bot
Do you know of any way to find out which bots are used (regularly) on Wikipedia?
Not really. I've never looked into it before. I did a quick search now, and found these pages:
As for the page in question, I'm not sure if it's bot-created. They look like they've been redirected by the same user at different times of the day
Marking this as closed as the error is related to a Portal page which doesn't exist at the time of the dump. Relevant excerpt below.
The actual subpage was /2019/April which generated a page of Wikipedia:Recent additions/2019/April which didn't exist in the 2019-03 dump
With the recent changes to regex, I thought I would take a look at a Portal page. I built the latest version using xowa_get_and_make.sh. Running xowa-gui, the page en.wikipedia.org/wiki/Portal:Arts (2019-03-01) gives the error.
I also note that MediaWiki have changed the portal pages on enwiki. They do not seem to take anywhere near as much CPU time as when I last looked (#424). My own target is to go for a new download of the 2019-06-01 dump (when we get there).