I'm processing a recent English Wikipedia dump and getting "assign to undeclared variable" errors from modules that don't contain a require ('strict'); call. Here's stripped-down code to reproduce this:
from wikitextprocessor import Wtp
from wikitextprocessor.dumpparser import process_dump

wtp = Wtp(db_path="enwiki-20240201-wtp.db", lang_code="en", project="wikipedia")
# Namespaces: 0 = main, 10 = Template, 828 = Module
process_dump(wtp, "enwiki-20240201-pages-articles-multistream.xml.bz2", {0, 10, 828})
for page in wtp.get_all_pages():
    # Skip redirects; expand only the first non-redirect page
    if page.redirect_to is not None:
        continue
    wtp.start_page(page.title)
    wtp.expand(page.body)
    break
And here are the errors I'm seeing:
Anarchism: ERROR: LUA error in #invoke('Protection banner', 'main') parent ('Template:Pp-semi-indef', {}) at ['Anarchism', 'pp-semi-indef', '#invoke', '#invoke']
[string "Module:Effective protection level"]:63: attempt to index field 'TitleBlacklist' (a nil value)
Anarchism: WARNING: invalid attribute format '20x20px\n ' missing name at ['Anarchism', 'good article', 'Main other', 'ARGVAL-1', 'Top icon', '#tag', '#tag']
Anarchism: WARNING: invalid attribute format '\n ' missing name at ['Anarchism', 'good article', 'Main other', 'ARGVAL-1', 'Top icon', '#tag', '#tag']
Anarchism: WARNING: invalid attribute format 'This is a good article. Click here for more information.]]' missing name at ['Anarchism', 'good article', 'Main other', 'ARGVAL-1', 'Top icon', '#tag', '#tag']
Anarchism: ERROR: LUA error in #invoke('sidebar', 'sidebar', ' child = yes', ' contentclass = hlist\n', ' heading1 =', ' content1 =\n<section begin=Schools of thought />\n* [[Anarcha-feminism|Feminist]]\n* [[Green anarchism|Green]]\n** [[Anarcho-primitivism|Primitivist]]\n** [[Social ecology (Bookchin)|Social ecology]]\n** [[Total liberation]]\n* [[Individualist anarchism|Individualist]]\n** [[Egoist anarchism|Egoist]]\n** [[Market anarchism|Free-market]]\n** [[Anarcho-naturism|Naturist]]\n** [[Philosophical anarchism|Philosophical]]\n* [[Mutualism (economic theory)|Mutualism]]\n* [[Postcolonial anarchism|Postcolonial]]\n** [[African anarchism|African]]\n** [[Black anarchism|Black]]\n* [[Queer anarchism|Queer]]\n* [[Anarchism and religion|Religious]]\n** [[Christian anarchism|Christian]]\n** [[Jewish anarchism|Jewish]]\n* [[Social anarchism|Social]]\n** [[Collectivist anarchism|Collectivist]]\n*** [[Parecon]]\n** [[Anarcho-communism|Communist]]\n*** [[Magonism]]\n* [[Anarchism without adjectives|Without adjectives]]\n<section end=Schools of thought />\n', ' heading5 = Methodology', ' content5 =\n<section begin=Methodology />\n* [[Agorism]]\n* [[Illegalism]]\n* [[Insurrectionary anarchism|Insurrectionary]]\n** [[Communization]]\n** [[Expropriative anarchism|Expropriative]]\n* [[Anarcho-pacifism|Pacifist]]\n* [[Platformism]]\n** [[Especifismo]]\n* [[Relationship anarchy|Relationship]]\n* [[Anarcho-syndicalism|Syndicalist]]\n* [[Synthesis anarchism|Synthesis]]\n<section end=Methodology />') parent ('Template:Anarchism sidebar', {}) at ['Anarchism', 'anarchism sidebar', '#invoke', '#invoke', 'Lua:sidebar:collapsible()', 'frame:preprocess()', '#invoke', '#invoke']
Traceback (most recent call last):
File "path-to-site-packages/wikitextprocessor/luaexec.py", line 684, in call_lua_sandbox
ret: tuple[bool, str] = ctx.lua_invoke(
^^^^^^^^^^^^^^^
File "lupa/lua51.pyx", line 869, in lupa.lua51._LuaObject.__call__
File "lupa/lua51.pyx", line 1835, in lupa.lua51.call_lua
File "lupa/lua51.pyx", line 1861, in lupa.lua51.execute_lua_call
File "lupa/lua51.pyx", line 1743, in lupa.lua51.raise_lua_error
lupa.lua51.LuaError: [string "<python>"]:36: assign to undeclared variable 'string'
stack traceback:
[C]: in function 'error'
[string "strict"]:21: in function <[string "strict"]:19>
[string "<python>"]:36: in function <[string "<python>"]:19>
(tail call): ?
[string "_sandbox_phase2"]:142: in function <[string "_sandbox_phase2"]:121>
[C]: in function 'preprocess'
[string "_sandbox_phase2"]:23: in function <[string "_sandbox_phase2"]:11>
[string "_sandbox_phase2"]:36: in function <[string "_sandbox_phase2"]:32>
(tail call): ?
[string "Module:Arguments"]:207: in function 'mergeArgs'
[string "Module:Arguments"]:320: in function <[string "Module:Arguments"]:317>
(tail call): ?
[string "sidebar"]:412: in function <[string "sidebar"]:397>
[C]: in function 'pcall'
[string "_sandbox_phase2"]:172: in function <[string "_sandbox_phase2"]:121>
Anarchism: ERROR: LUA error in #invoke('list', 'horizontal') parent ('Template:Hlist', {1: '[[Global governance|Global]]', 2: '[[Local government|Local]]'}) at ['Anarchism', 'basic forms of government', 'Politics series sidebar', 'ARGVAL-list2', '#invoke', '#invoke', 'Lua:sidebar:sidebar()', 'frame:preprocess()', 'hlist', '#invoke', '#invoke']
Traceback (most recent call last):
File "path-to-site-packages/wikitextprocessor/luaexec.py", line 684, in call_lua_sandbox
ret: tuple[bool, str] = ctx.lua_invoke(
^^^^^^^^^^^^^^^
File "lupa/lua51.pyx", line 869, in lupa.lua51._LuaObject.__call__
File "lupa/lua51.pyx", line 1835, in lupa.lua51.call_lua
File "lupa/lua51.pyx", line 1861, in lupa.lua51.execute_lua_call
File "lupa/lua51.pyx", line 1743, in lupa.lua51.raise_lua_error
lupa.lua51.LuaError: [string "<python>"]:36: assign to undeclared variable 'string'
stack traceback:
[C]: in function 'error'
[string "strict"]:21: in function <[string "strict"]:19>
[string "<python>"]:36: in function <[string "<python>"]:19>
(tail call): ?
[string "_sandbox_phase2"]:142: in function <[string "_sandbox_phase2"]:121>
[C]: in function 'preprocess'
[string "_sandbox_phase2"]:23: in function <[string "_sandbox_phase2"]:11>
[string "_sandbox_phase2"]:36: in function <[string "_sandbox_phase2"]:32>
(tail call): ?
[string "Module:Arguments"]:207: in function 'mergeArgs'
[string "Module:Arguments"]:320: in function <[string "Module:Arguments"]:317>
(tail call): ?
[string "sidebar"]:122: in function 'move_hiding_templatestyles'
[string "sidebar"]:140: in function <[string "sidebar"]:136>
[C]: in function 'pcall'
[string "_sandbox_phase2"]:172: in function <[string "_sandbox_phase2"]:121>
Anarchism: ERROR: LUA error in #invoke('lang', 'lang_xx_italic', 'code=fr') parent ('Template:Lang-fr', {1: 'anarchiste'}) at ['Anarchism', 'lang-fr', '#invoke', '#invoke']
Loading module failed in #invoke: lang
[string "Module:Lang/data"]:647: variable 'special_tags_table' is not declared
Anarchism: ERROR: LUA error in #invoke('Lang', 'lang') parent ('Template:Lang', {1: 'fr', 2: '[[sans-culottes]]'}) at ['Anarchism', 'lang', '#invoke', '#invoke']
Loading module failed in #invoke: Lang
[string "Module:Lang/data"]:647: variable 'special_tags_table' is not declared
Anarchism: ERROR: LUA error in #invoke('citation/CS1', 'citation', 'CitationClass=book') parent ('Template:Cite book', {'title': 'The Desk Encyclopedia of World History', 'publisher': '[[Oxford University Press]]', 'year': '2006', 'isbn': '978-0-7394-7809-7', 'editor-last': 'Wright', 'editor-first': 'Edmund', 'location': 'New York', 'pages': '20–21'}) at ['Anarchism', 'Cite book', '#invoke', '#invoke']
[string "Module:Citation/CS1/Configuration"]:33: assign to undeclared variable 'uncategorized_namespaces_t'
Anarchism: ERROR: LUA error in #invoke('citation/CS1', 'citation', 'CitationClass=citation') parent ('Template:Citation', {'last': 'Fiala', 'first': 'Andrew', 'title': 'Anarchism', 'date': '2021', 'url': 'https://plato.stanford.edu/archives/win2021/entries/anarchism/', 'encyclopedia': 'The Stanford Encyclopedia of Philosophy', 'editor-last': 'Zalta', 'editor-first': 'Edward N.', 'access-date': '2023-06-17', 'edition': 'Winter 2021', 'publisher': 'Metaphysics Research Lab, Stanford University'}) at ['Anarchism', 'Citation', '#invoke', '#invoke']
[string "Module:Citation/CS1/Configuration"]:33: assign to undeclared variable 'uncategorized_namespaces_t'
Anarchism: ERROR: LUA error in #invoke('citation/CS1', 'citation', 'CitationClass=book') parent ('Template:Cite book', {'last': 'Bakunin', 'first': 'Mikhail', 'author-link': 'Mikhail Bakunin', 'title': 'Statism and Anarchy', 'title-link': 'Statism and Anarchy', 'year': '1990', 'orig-year': '1873', 'publisher': '[[Cambridge University Press]]', 'location': 'Cambridge, England', 'series': 'Cambridge Texts in the History of Political Thought', 'translator-last': 'Shatz', 'translator-first': 'Marshall', 'isbn': '978-0-521-36182-8', 'oclc': '20826465', 'lccn': '89077393', 'doi': '10.1017/CBO9781139168083', 'editor1-last': 'Shatz', 'editor1-first': 'Marshall'}) at ['Anarchism', 'cite book', '#invoke', '#invoke']
[string "Module:Citation/CS1/Configuration"]:33: assign to undeclared variable 'uncategorized_namespaces_t'
...
Snip, many repeats.
...
Anarchism: ERROR: LUA error in #invoke('citation/CS1', 'citation', 'CitationClass=book') parent ('Template:Cite book', {'last1': 'Levy', 'first1': 'Carl', 'last2': 'Adams', 'first2': 'Matthew S.', 'title': 'The Palgrave Handbook of Anarchism', 'date': '2019', 'publisher': '[[Palgrave Macmillan]]', 'doi': '10.1007/978-3-319-75620-2', 'isbn': '978-3-319-75620-2', 's2cid': '149333615', 'url': 'https://link.springer.com/book/10.1007/978-3-319-75620-2', 'language': 'en'}) at ['Anarchism', 'cite book', '#invoke', '#invoke']
[string "Module:Citation/CS1/Configuration"]:33: assign to undeclared variable 'uncategorized_namespaces_t'
Anarchism: ERROR: LUA error in #invoke('If empty', 'main') parent ('Template:If empty', {1: '', 2: '[[List of anarchist communities|Anarchist-related territories and autonomous zones]]'}) at ['Anarchism', 'anarchies', '#invoke', '#invoke', 'Lua:navbox:navbox()', 'frame:preprocess()', 'if empty', '#invoke', '#invoke']
Traceback (most recent call last):
File "path-to-site-packages/wikitextprocessor/luaexec.py", line 684, in call_lua_sandbox
ret: tuple[bool, str] = ctx.lua_invoke(
^^^^^^^^^^^^^^^
File "lupa/lua51.pyx", line 869, in lupa.lua51._LuaObject.__call__
File "lupa/lua51.pyx", line 1835, in lupa.lua51.call_lua
File "lupa/lua51.pyx", line 1861, in lupa.lua51.execute_lua_call
File "lupa/lua51.pyx", line 1743, in lupa.lua51.raise_lua_error
lupa.lua51.LuaError: [string "<python>"]:36: assign to undeclared variable 'string'
stack traceback:
[C]: in function 'error'
[string "strict"]:21: in function <[string "strict"]:19>
[string "<python>"]:36: in function <[string "<python>"]:19>
(tail call): ?
[string "_sandbox_phase2"]:142: in function <[string "_sandbox_phase2"]:121>
[C]: in function 'preprocess'
[string "_sandbox_phase2"]:23: in function <[string "_sandbox_phase2"]:11>
[string "Module:Arguments"]:254: in function <[string "Module:Arguments"]:232>
[string "navbox"]:552: in function <[string "navbox"]:543>
[C]: in function 'pcall'
[string "_sandbox_phase2"]:172: in function <[string "_sandbox_phase2"]:121>
I looked into one of these, namely the one for Citation/CS1/Configuration, since there were many of them. The errors look like this:
Anarchism: ERROR: LUA error in #invoke('citation/CS1', 'citation', 'CitationClass=book') parent ('Template:Cite book', {'title': 'The Desk Encyclopedia of World History', 'publisher': '[[Oxford University Press]]', 'year': '2006', 'isbn': '978-0-7394-7809-7', 'editor-last': 'Wright', 'editor-first': 'Edmund', 'location': 'New York', 'pages': '20–21'}) at ['Anarchism', 'Cite book', '#invoke', '#invoke']
[string "Module:Citation/CS1/Configuration"]:33: assign to undeclared variable 'uncategorized_namespaces_t'
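Because the same message repeats many times, a quick tally makes it easier to see which modules and variables are affected. The following is only a sketch: it assumes the log above has been captured into a string, and `tally_undeclared` is a hypothetical helper written for this report, not part of wikitextprocessor.

```python
import re
from collections import Counter

# Matches Lua messages such as:
#   [string "Module:Citation/CS1/Configuration"]:33: assign to undeclared variable 'x'
#   [string "Module:Lang/data"]:647: variable 'x' is not declared
UNDECLARED = re.compile(
    r'\[string "(?P<chunk>[^"]+)"\]:(?P<line>\d+): '
    r"(?:assign to undeclared variable|variable) '(?P<name>[^']+)'"
)


def tally_undeclared(log_text: str) -> Counter:
    """Count strict-mode errors per (Lua chunk name, variable name) pair."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = UNDECLARED.search(line)
        if match:
            counts[(match["chunk"], match["name"])] += 1
    return counts
```

Calling `tally_undeclared(log_text).most_common()` lists the most frequent (module, variable) pairs first; stack traceback lines are ignored because they do not carry an undeclared-variable message.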
My guess is that the problem comes from a require ('strict'); appearing in the importing module, https://en.wikipedia.org/wiki/Module:Citation/CS1, which loads https://en.wikipedia.org/wiki/Module:Citation/CS1/Configuration via loadData.
It seems the sandbox re-implements loadData here: https://github.com/tatuylonen/wikitextprocessor/blob/main/src/wikitextprocessor/lua/_sandbox_phase1.lua#L129, which calls into new_loader. My understanding of Lua is limited, but it seems like new_loader might not implement the same logic for loading the module in a new env as executeModule in https://github.com/wikimedia/mediawiki-extensions-Scribunto/blob/8d69dc173e33ae936ff4401d41ee5e6a1fd1ba67/includes/Engines/LuaCommon/lualib/mw.lua#L467 does.
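To make the guess concrete, here is a toy Python analogy of the suspected behaviour. It is not the actual Lua code from either project, and every name in it (`Env`, `configuration_module`, `load_in_callers_env`, `load_in_fresh_env`) is hypothetical. It only illustrates the difference between running a loaded module in the caller's strict environment and running it in a fresh one.

```python
class Env(dict):
    """Toy global table; `strict` mimics what require('strict') would turn on."""

    strict = False

    def __setitem__(self, name, value):
        # In strict mode, refuse to create a global that was never declared.
        if self.strict and name not in self:
            raise NameError(f"assign to undeclared variable '{name}'")
        super().__setitem__(name, value)


def configuration_module(env):
    # Stand-in for a data module that assigns a global without declaring it.
    # The value is a placeholder, not the real module's data.
    env["uncategorized_namespaces_t"] = {"placeholder": True}
    return env["uncategorized_namespaces_t"]


def load_in_callers_env(module, env):
    # Suspected behaviour: the loaded module inherits the caller's strict env.
    return module(env)


def load_in_fresh_env(module, env):
    # Expected behaviour: copy the globals into a new, non-strict env first.
    fresh = Env(env)
    return module(fresh)


caller_env = Env()
caller_env.strict = True  # the importing module asked for strict mode

try:
    load_in_callers_env(configuration_module, caller_env)
    leaked = False
except NameError as exc:
    leaked = True
    print(exc)

print(load_in_fresh_env(configuration_module, caller_env))
```

In this model the first load fails with the same kind of message as in the log, while the second succeeds and leaves the caller's environment untouched. Whether new_loader really shares the environment in this way is exactly the part I have not verified.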