HamLaertes opened 3 years ago
I think I've got the answer myself. The dump files do contain the wikitable information, just in a different form.
Adding the argument --html
may help extract the wikitables more directly. However, the code seems to have a bug when converting the wikitext to HTML.
It reports a KeyError as follows:
  File "/storage/miniconda3/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/storage/miniconda3/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/storage/fbzhu/yc/wikiextractor/wikiextractor/WikiExtractor.py", line 467, in extract_process
    Extractor(*job[:-1]).extract(out, html_safe)  # (id, urlbase, title, page)
  File "/storage/wikiextractor/wikiextractor/extract.py", line 857, in extract
    text = self.clean_text(text, html_safe=html_safe)
  File "/storage/wikiextractor/wikiextractor/extract.py", line 847, in clean_text
    text = compact(text, mark_headers=mark_headers)
  File "/storage/wikiextractor/wikiextractor/extract.py", line 256, in compact
    page.append(listItem[n] % line)
KeyError: '&'
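The traceback ends at page.append(listItem[n] % line), so the KeyError comes from the listItem dictionary lookup rather than from the % formatting: the marker character n is '&', which has no template in the dict. A minimal sketch of the mechanism and a defensive workaround (the dict contents and the render helper here are illustrative assumptions, not the upstream wikiextractor code):

```python
# Sketch of the failure mode in compact(): the names listItem, n, and line
# follow the traceback, but the dict contents below are an assumption based
# on how HTML list rendering typically works, not the exact upstream code.
listItem = {'*': '<li>%s</li>', '#': '<li>%s</li>',
            ';': '<dt>%s</dt>', ':': '<dd>%s</dd>'}

def render(n, line):
    # Upstream does listItem[n] % line directly, so an unexpected marker
    # such as '&' (e.g. an HTML entity at the start of a list line) raises
    # KeyError. Falling back to the raw text avoids crashing the worker.
    template = listItem.get(n)
    return template % line if template else line

print(render('*', 'first item'))  # -> <li>first item</li>
print(render('&', 'nbsp;'))       # falls back to plain text, no KeyError
```

Such a .get() fallback only papers over the symptom, of course; the real fix would be to make the list-marker parsing tolerate (or skip) entity-bearing lines.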
I am using the XML dump from 20 February 2021 and wikiextractor version 3.0.5.
Hello everyone. I downloaded the first file
enwiki-20210220-pages-articles1.xml-p1p41242.bz2
from the Wikipedia dump server. I successfully extracted the text by running the script. However, the extracted text seems to omit the table information in the wiki pages, i.e. the wikitables. Am I missing something, or do the dump files not contain table information at all? Thanks!
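For what it's worth, the dumps do store tables, but as raw wikitext rather than rendered HTML: a wikitable opens with {| and closes with |}. A quick way to confirm this against a page's wikitext (the sample markup below is illustrative):

```python
import re

# Raw wikitext as stored in the XML dump: tables use {| ... |} syntax.
sample = """{| class="wikitable"
|-
! Header 1 !! Header 2
|-
| cell A || cell B
|}"""

# Wikitables open with '{|' and close with '|}' at the start of a line.
tables = re.findall(r'^\{\|.*?^\|\}', sample, flags=re.DOTALL | re.MULTILINE)
print(len(tables))  # -> 1
```

Plain-text extraction simply drops this markup, which is why the tables appear to be missing from the output.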