Closed xxyzz closed 4 months ago
Fixes tatuylonen/wiktextract#535
It works for now, but this regex will cause more errors in the future; only a real HTML parser could fix them.
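To illustrate the concern, here is a minimal sketch of how a regex can break on nested tags while a real parser does not. The pattern, the sample HTML, and the names `SPAN_RE`, `regex_span_text`, `SpanTextParser`, and `parser_span_text` are all hypothetical and are not taken from this PR; the parser is Python's standard-library `html.parser`, which may differ from whatever parser the project would actually adopt.

```python
import re
from html.parser import HTMLParser

# Hypothetical input: a <span> nested inside another <span>.
NESTED = "<span>outer <span>inner</span> tail</span>"

# Hypothetical pattern, NOT the regex used in this PR.
SPAN_RE = re.compile(r"<span>(.*?)</span>")


def regex_span_text(html: str) -> str:
    """Return the content of the first <span> as seen by the regex."""
    match = SPAN_RE.search(html)
    return match.group(1) if match else ""


class SpanTextParser(HTMLParser):
    """Collect all text that appears inside any <span>, tracking nesting depth."""

    def __init__(self) -> None:
        super().__init__()
        self.depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def parser_span_text(html: str) -> str:
    """Return the text inside <span> elements using a real HTML parser."""
    parser = SpanTextParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    # The non-greedy regex stops at the first closing tag and leaks markup.
    print(repr(regex_span_text(NESTED)))
    # The parser tracks nesting and returns only the text.
    print(repr(parser_span_text(NESTED)))
```

The regex stops at the first `</span>` it finds, so its result contains a stray opening tag and loses the trailing text, while the depth counter in the parser handles the nesting correctly.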
This seems like a slam dunk.
Probably negligible for simple patterns. Speaking of performance, the extraction time for the fr edition has dropped back to 40 minutes; maybe some commits after #238 improved the speed.