earwig / mwparserfromhell

A Python parser for MediaWiki wikicode
https://mwparserfromhell.readthedocs.io/
MIT License

Version 0.6.2 doesn't parse nested wikilinks #270

Closed: gamecat10 closed this issue 2 years ago

gamecat10 commented 3 years ago

v. 0.6:

>>> import mwparserfromhell as mw
>>> a = '[[File:test|[[test]]]]'
>>> type(mw.parse(a).nodes[0])
<class 'mwparserfromhell.nodes.wikilink.Wikilink'>

v. 0.6.2:

>>> import mwparserfromhell as mw
>>> a = '[[File:test|[[test]]]]'
>>> type(mw.parse(a).nodes[0])
<class 'mwparserfromhell.nodes.text.Text'>

earwig commented 3 years ago

Thanks. This is indeed a regression in 0.6.2. I'll work out a fix for this.
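
For anyone who wants to confirm whether their install is affected, the report above can be turned into a small check. This is only a sketch: the helper name `outer_link_is_parsed` is made up for illustration, and it takes the parse function as an argument (an assumption of this sketch, not something the library requires) so the same check can be run against different installed versions.

```python
# Sample taken from the original report: a File: link whose caption
# contains another wikilink.
SAMPLE = "[[File:test|[[test]]]]"


def outer_link_is_parsed(parse, text=SAMPLE):
    """Return True if the first parsed node is a Wikilink.

    `parse` is any callable that returns an object with a `.nodes` list,
    for example `mwparserfromhell.parse`. The node type is compared by
    class name so this helper needs no imports of its own.
    """
    nodes = parse(text).nodes
    return bool(nodes) and type(nodes[0]).__name__ == "Wikilink"
```

With the library installed, calling outer_link_is_parsed(mwparserfromhell.parse) should give True on 0.6 and False on 0.6.2, going by the two sessions in the report.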

demostanis commented 2 years ago

bump pls im having a headache

geohci commented 2 years ago

I wanted to check in on this too -- is there a sense of where the issue is / how fixable it is? I assume it's on the C side and not Python (which greatly limits my ability to help) but am happy to try to help if need be.

earwig commented 2 years ago

Thanks for offering to help! The parser has a Python reference implementation; I'm happy to accept PRs with fixes made only there, which I can then port to the C parser.

However, I'll have some time later this week to investigate, so hopefully that's not necessary.

geohci commented 2 years ago

Many thanks! I'll wait for your initial investigation then.

Unrelated but huge thanks for this work -- I've used mwparserfromhell for a while for all sorts of tasks. At the moment, I'm working on a Python package for describing what happens in a given Wikipedia diff (e.g., 3 templates changed, 2 categories added, etc.) and mwparserfromhell has been so powerful and useful -- this is a bug I've stumbled across but it's very minor in the grand scheme of things.

Jontpan commented 2 years ago

Hi, I also have this problem. Any updates?

earwig commented 2 years ago

This is fixed in v0.6.4, just released. Thanks for everyone's patience.
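
A project that depends on nested wikilinks could guard against the broken releases with a version check along these lines. This is a rough sketch with hypothetical helper names (`version_tuple`, `in_affected_range`). It assumes the bug is present from 0.6.2 up to, but not including, 0.6.4; the thread only confirms 0.6.2 as broken and 0.6.4 as fixed, so treating 0.6.3 as affected is an assumption.

```python
import re


def version_tuple(version):
    """Turn '0.6.2' into (0, 6, 2).

    Missing components count as 0, and parsing stops at the first
    component that does not start with a digit (e.g. 'dev0').
    """
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def in_affected_range(version):
    """Return True for versions assumed to have the nested-wikilink bug."""
    # Assumption: 0.6.2 <= version < 0.6.4 (0.6.3 is not confirmed in the thread).
    return (0, 6, 2) <= version_tuple(version) < (0, 6, 4)
```

In practice the argument would be the installed version string, for example mwparserfromhell.__version__.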

geohci commented 2 years ago

Thanks!!