WikiTeam / wikiteam

Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to the tiniest wikis. As of 2023, WikiTeam has preserved more than 350,000 wikis.
https://github.com/WikiTeam
GNU General Public License v3.0
705 stars 147 forks

Match single quotes too when scraping namespaces #448

Closed (yzqzss closed this 1 year ago)

yzqzss commented 1 year ago

MediaWiki 1.36 HTML uses single quotes around attribute values, as seen in https://wiki.othing.xyz

See also: https://github.com/elsiehupp/wikiteam3/pull/54
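
For illustration, here is a minimal sketch of a quote-agnostic pattern for scraping namespace `<option>` elements. It is not the actual diff from this change: the names `NAMESPACE_OPTION` and `scrape_namespaces` and both sample HTML strings are hypothetical, and the real markup of a given wiki may differ.

```python
import re

# Hypothetical pattern (not the project's actual regex): accepts both
# value="0" and value='0'. The backreference (?P=quote) requires the
# closing quote to be the same character as the opening one.
NAMESPACE_OPTION = re.compile(
    r"""<option\s[^>]*?value=(?P<quote>["'])(?P<id>\d+)(?P=quote)[^>]*>(?P<name>[^<]+)</option>"""
)


def scrape_namespaces(html):
    """Return {namespace id: namespace name} for every numeric <option> in html."""
    return {
        int(m.group("id")): m.group("name").strip()
        for m in NAMESPACE_OPTION.finditer(html)
    }


# Made-up samples imitating the two quoting styles.
OLD_HTML = (
    '<select id="namespace" name="namespace">'
    '<option value="all">all</option>'
    '<option value="0">(Main)</option>'
    '<option value="1">Talk</option>'
    "</select>"
)
NEW_HTML = (
    "<select id='namespace' name='namespace'>"
    "<option value='all'>all</option>"
    "<option value='0'>(Main)</option>"
    "<option value='1'>Talk</option>"
    "</select>"
)

if __name__ == "__main__":
    print(scrape_namespaces(OLD_HTML))
    print(scrape_namespaces(NEW_HTML))
```

Both samples should yield the same mapping, and the non-numeric `all` option is skipped because the pattern only accepts digits as the value.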

nemobis commented 1 year ago

I'm not sure whether the same problem exists in wikiteam/wikiteam. This fix simply updates the regex and should be generic.

Thanks.

Newer versions of MediaWiki use single quotes ('); older versions use double quotes (").

But why are people still using screenscraping for newer versions? :(