molybdenum-99 / infoboxer

Wikipedia information extraction library
MIT License

Change approach to markup fail #55

Open zverok opened 8 years ago

zverok commented 8 years ago

Unfortunately, The Real Life™ shows that even old and reliable Wikipedia pages can have dumb markup fails. For example, the Libya country page recently contained [[Indian people|Indian] (note the missing closing bracket).

So Infoboxer's current approach, "fail completely on invalid markup", is not good enough. Invalid markup should still be parsed somehow; the correct approach should be "never fail, whatever the input" (maybe with an option to switch between the two).
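To make the two behaviours concrete, here is a rough sketch of a tiny wikilink parser with a switch between them. It is not Infoboxer's code or API: `parse_inline`, `MarkupError`, and the `strict` flag are hypothetical names made up for illustration, and the recovery rule (keep an unclosed `[[` as literal text) is just one possible choice.

```python
from typing import List, Tuple

Node = Tuple[str, ...]  # ("text", value) or ("link", target, label)


class MarkupError(ValueError):
    """Raised in strict mode when the markup is malformed (hypothetical)."""


def _add_text(nodes: List[Node], value: str) -> None:
    # Merge with a preceding text node so recovery does not fragment the output.
    if not value:
        return
    if nodes and nodes[-1][0] == "text":
        nodes[-1] = ("text", nodes[-1][1] + value)
    else:
        nodes.append(("text", value))


def parse_inline(text: str, strict: bool = False) -> List[Node]:
    """Split text into plain-text and [[target|label]] link nodes."""
    nodes: List[Node] = []
    pos = 0
    while pos < len(text):
        start = text.find("[[", pos)
        if start == -1:
            _add_text(nodes, text[pos:])
            break
        _add_text(nodes, text[pos:start])

        end = text.find("]]", start + 2)
        next_open = text.find("[[", start + 2)
        unclosed = end == -1 or (next_open != -1 and next_open < end)
        if unclosed:
            if strict:
                raise MarkupError(f"unclosed wikilink at offset {start}")
            # Lenient mode: treat the opening brackets as literal text and go on.
            _add_text(nodes, "[[")
            pos = start + 2
            continue

        target, _, label = text[start + 2:end].partition("|")
        nodes.append(("link", target, label or target))
        pos = end + 2
    return nodes
```

With the broken fragment from the Libya page, `parse_inline("the [[Indian people|Indian] minority")` gives back a single text node containing the input unchanged, while passing `strict=True` raises `MarkupError`. Well-formed links are parsed the same way in both modes.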