itkach / mw2slob

A tool to convert MediaWiki content to dictionaries in slob format
GNU General Public License v3.0

TOCs with enterprise dumps #10

Closed: rtega closed this issue 2 years ago

rtega commented 2 years ago

Would it be possible to generate TOCs in the articles from the enterprise dumps? They are not included in these dumps. Or should the dumps themselves be fixed?

itkach commented 2 years ago

A table of contents is generated as of commit 38f166fc28146555cd13fc9a4cdd3c3b33317c24.

rtega commented 2 years ago

I upgraded mw2slob with

 pip install --upgrade git+https://github.com/itkach/mw2slob.git
Collecting git+https://github.com/itkach/mw2slob.git
  Cloning https://github.com/itkach/mw2slob.git to /tmp/pip-req-build-dxe4kagf
  Running command git clone -q https://github.com/itkach/mw2slob.git /tmp/pip-req-build-dxe4kagf
Requirement already satisfied, skipping upgrade: CouchDB in ./.local/lib/python3.8/site-packages (from mw2slob==1.0) (1.2)
Requirement already satisfied, skipping upgrade: Slob>=1.0 in ./env-slob/lib/python3.8/site-packages (from mw2slob==1.0) (1.0.2)
Requirement already satisfied, skipping upgrade: bs4 in ./env-slob/lib/python3.8/site-packages (from mw2slob==1.0) (0.0.1)
Requirement already satisfied, skipping upgrade: cssselect in /usr/lib/python3/dist-packages (from mw2slob==1.0) (1.1.0)
Requirement already satisfied, skipping upgrade: cssutils in /usr/lib/python3/dist-packages (from mw2slob==1.0) (1.0.2)
Requirement already satisfied, skipping upgrade: lxml in /usr/lib/python3/dist-packages (from mw2slob==1.0) (4.5.0)
Requirement already satisfied, skipping upgrade: PyICU>=1.5 in /usr/lib/python3/dist-packages (from Slob>=1.0->mw2slob==1.0) (2.4.2)
Requirement already satisfied, skipping upgrade: beautifulsoup4 in /usr/lib/python3/dist-packages (from bs4->mw2slob==1.0) (4.8.2)
Building wheels for collected packages: mw2slob
  Building wheel for mw2slob (setup.py) ... done
  Created wheel for mw2slob: filename=mw2slob-1.0-py3-none-any.whl size=1070430 sha256=f75a85275edebc137261e2534c2b8b1874c909e1a3d5f7325e83a51c12b9941b
  Stored in directory: /tmp/pip-ephem-wheel-cache-wajp1g5y/wheels/a7/c0/c7/14d7081dd8cd292d2e81763983c3c5c42a58987f511a94a707
Successfully built mw2slob
Installing collected packages: mw2slob
  Attempting uninstall: mw2slob
    Found existing installation: mw2slob 1.0
    Uninstalling mw2slob-1.0:
      Successfully uninstalled mw2slob-1.0
Successfully installed mw2slob-1.0

and generated the slob with

mw2slob dump --siteinfo vls.si.json ./vlswiki-NS0-20220420-ENTERPRISE-HTML.json.tar.gz -f toc common wiki

I'm not getting TOCs. Am I missing something?

itkach commented 2 years ago

Maybe you are getting it and just didn't notice :) The TOC is collapsed by default; tap the arrow before the title (or any part of the title area that is not a link) to expand it.

[Screenshot: Aard 2]

rtega commented 2 years ago

I was missing something then. Thanks a lot for the clarification!
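
For readers wondering how a TOC that is collapsed by default can be derived from article HTML that ships without one, here is a minimal sketch. It is not mw2slob's actual implementation (see the commit referenced above for that). It assumes section headings are `h2`/`h3` elements carrying an `id` attribute, and the names `HeadingCollector` and `build_collapsed_toc` are hypothetical. It relies on the fact that an HTML `<details>` element stays collapsed until the reader activates its `<summary>`; whether mw2slob or Aard 2 uses that element is not stated in this thread.

```python
from html import escape
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, id, text) for every h2/h3 heading that has an id."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # [level, id, text parts] while inside a heading

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "h3"):
            heading_id = dict(attrs).get("id")
            if heading_id:  # headings without an id cannot be linked to
                self._current = [int(tag[1]), heading_id, []]

    def handle_data(self, data):
        if self._current is not None:
            self._current[2].append(data)

    def handle_endtag(self, tag):
        if self._current is not None and tag in ("h2", "h3"):
            level, heading_id, parts = self._current
            self.headings.append((level, heading_id, "".join(parts).strip()))
            self._current = None


def build_collapsed_toc(article_html, summary="Contents"):
    """Return a <details> block linking to each heading, or "" if there are none."""
    collector = HeadingCollector()
    collector.feed(article_html)
    collector.close()
    if not collector.headings:
        return ""
    items = "".join(
        '<li class="toc-level-{}"><a href="#{}">{}</a></li>'.format(
            level, escape(heading_id, quote=True), escape(text)
        )
        for level, heading_id, text in collector.headings
    )
    # No "open" attribute, so the list stays hidden until the summary is tapped.
    return "<details><summary>{}</summary><ul>{}</ul></details>".format(
        escape(summary), items
    )
```

Calling `build_collapsed_toc(article_html)` on one article's HTML returns a string that could be placed next to the article title. Only the summary line is visible until it is tapped, which is why such a TOC is easy to overlook.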