earwig / mwparserfromhell

A Python parser for MediaWiki wikicode
https://mwparserfromhell.readthedocs.io/
MIT License

Exiting due to uncaught exception <class 'MemoryError'> #253

Open alchimista opened 4 years ago

alchimista commented 4 years ago

Running category_redirect.py on Toolforge with Python 3, I'm getting the following traceback:

  File "/data/project/shared/pywikibot/stable/scripts/category_redirect.py", line 526, in <module>
    main()
  File "/data/project/shared/pywikibot/stable/scripts/category_redirect.py", line 522, in main
    bot.run()
  File "/data/project/shared/pywikibot/stable/scripts/category_redirect.py", line 495, in run
    self.log_page.save(comment)
  File "/shared/pywikipedia/core/pywikibot/tools/__init__.py", line 1449, in wrapper
  File "/shared/pywikipedia/core/pywikibot/tools/__init__.py", line 1449, in wrapper
  File "/shared/pywikipedia/core/pywikibot/page/__init__.py", line 1299, in save
    if not force and not self.botMayEdit():
  File "/shared/pywikipedia/core/pywikibot/page/__init__.py", line 1158, in botMayEdit
    templates = self.templatesWithParams()
  File "/shared/pywikipedia/core/pywikibot/tools/__init__.py", line 1449, in wrapper
  File "/shared/pywikipedia/core/pywikibot/page/__init__.py", line 2360, in templatesWithParams
    templates = self.raw_extracted_templates
  File "/shared/pywikipedia/core/pywikibot/page/__init__.py", line 2335, in raw_extracted_templates
    self.text, True, True)
  File "/shared/pywikipedia/core/pywikibot/textlib.py", line 1657, in extract_templates_and_params
    return extract_templates_and_params_mwpfh(text, strip)
  File "/shared/pywikipedia/core/pywikibot/textlib.py", line 1675, in extract_templates_and_params_mwpfh
    code = mwparserfromhell.parse(text)
  File "/shared/pywikipedia/core/mwparserfromhell/utils.py", line 58, in parse_anything
    return Parser().parse(value, context, skip_style_tags)
  File "/shared/pywikipedia/core/mwparserfromhell/parser/__init__.py", line 93, in parse
    tokens = self._tokenizer.tokenize(text, context, skip_style_tags)
  File "/shared/pywikipedia/core/mwparserfromhell/parser/tokenizer.py", line 1459, in tokenize
    tokens = self._parse(context)
  File "/shared/pywikipedia/core/mwparserfromhell/parser/tokenizer.py", line 1337, in _parse
    self._parse_wikilink()
  File "/shared/pywikipedia/core/mwparserfromhell/parser/tokenizer.py", line 328, in _parse_wikilink
    link, extra, delta = self._really_parse_external_link(True)
  File "/shared/pywikipedia/core/mwparserfromhell/parser/tokenizer.py", line 453, in _really_parse_external_link
    self._parse_bracketed_uri_scheme()
  File "/shared/pywikipedia/core/mwparserfromhell/parser/tokenizer.py", line 387, in _parse_bracketed_uri_scheme
    self._fail_route()
  File "/shared/pywikipedia/core/mwparserfromhell/parser/tokenizer.py", line 162, in _fail_route
    self._memoize_bad_route()
  File "/shared/pywikipedia/core/mwparserfromhell/parser/tokenizer.py", line 153, in _memoize_bad_route
    self._bad_routes.add(self._stack_ident)
MemoryError
CRITICAL: Exiting due to uncaught exception <class 'MemoryError'>

Seems to be related to (if not a duplicate of) #248 and #247.

legoktm commented 4 years ago

A memory error means you've run out of memory. Try increasing the amount of memory allocated to your job on Toolforge and see if that makes a difference.
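Before picking a new limit, it may help to measure how much memory a single parse actually needs. Below is a minimal sketch using the standard-library tracemalloc module. The helper name peak_memory_mib is hypothetical and not part of mwparserfromhell or pywikibot. tracemalloc only sees allocations made through Python's own allocators, so it should cover the pure-Python tokenizer shown in the traceback but may under-report if the C tokenizer is in use.

```python
import tracemalloc


def peak_memory_mib(func, *args, **kwargs):
    """Run func(*args, **kwargs) and return (result, peak MiB traced).

    Hypothetical helper for illustration. The peak covers only memory
    allocated through Python's allocators while func was running.
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / (1024 * 1024)


if __name__ == "__main__":
    # Stand-in workload. To measure a real parse, pass
    # mwparserfromhell.parse and the page text instead (assumes the
    # library is installed and page_text holds the wikitext).
    _, peak = peak_memory_mib(lambda n: list(range(n)), 1_000_000)
    print(f"peak traced memory: {peak:.1f} MiB")
```

Running this against the page that triggers the error, for example with peak_memory_mib(mwparserfromhell.parse, page_text), would give a rough lower bound for the job's memory allocation. Tracing slows execution noticeably, so it is better suited to a one-off test than to the production job.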

ghost commented 4 years ago

Contributions toward fixing #209 would also help with memory issues!

alchimista commented 4 years ago

@legoktm I tried that before filing this bug, and even with 750 MB allocated it still occasionally runs out of memory. Somehow it has become memory-hungry.

legoktm commented 4 years ago

I'd suggest allocating more memory than that; I normally try around 2G.