Handle inline transclusion differently in plaintext extraction #41

Open appledora opened 2 years ago

In GitLab by @geohci on Aug 30, 2022, 24:21

Example: for the en:Cabbage article, when transclusions are skipped, the second paragraph of the extracted plaintext is "A cabbage generally weighs between ." This is because the HTML is actually:

A cabbage generally weighs between <span about="#mwt15" typeof="mw:Transclusion" data-mw='{"parts":[{"template":{"target":{"wt":"convert","href":"./Template:Convert"},"params":{"1":{"wt":"500"},"2":{"wt":"to"},"3":{"wt":"1000"},"4":{"wt":"g"},"5":{"wt":"lbs"},"sigfig":{"wt":"1"}},"i":0}}]}' id="mwHw">500 to 1,000 grams (1 to 2 lb)</span>.

and the wikitext is:

A cabbage generally weighs between {{convert|500|to|1000|g|lbs|sigfig=1}}.

Maybe we could add an option that excludes a transclusion only when it occurs inside certain types of elements, rather than whenever it is the parent element?
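One possible reading of this option, sketched with only the Python standard library: drop a transclusion when its root element is one of a configurable set of tags, and keep its rendered text otherwise (for example an inline span inside a paragraph). The names `PlaintextExtractor`, `extract_plaintext`, `skip_tags`, and `BLOCK_TAGS` are hypothetical and are not part of mwparserfromhtml. The sketch assumes well-formed HTML and ignores transclusions that span several sibling elements.

```python
from html.parser import HTMLParser

# Hypothetical default: transclusions rooted at these tags are dropped.
BLOCK_TAGS = frozenset({"table", "div", "figure", "section", "ul", "ol", "dl", "p"})
# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = frozenset({"area", "base", "br", "col", "embed", "hr", "img",
                       "input", "link", "meta", "source", "track", "wbr"})


class PlaintextExtractor(HTMLParser):
    """Collect text, skipping transclusions whose root tag is in skip_tags."""

    def __init__(self, skip_tags=BLOCK_TAGS):
        super().__init__(convert_charrefs=True)
        self.skip_tags = skip_tags
        self.skip_depth = 0  # > 0 while inside a skipped transclusion
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            # Already skipping: track nesting so we know where the subtree ends.
            self.skip_depth += 1
            return
        typeof = dict(attrs).get("typeof") or ""
        if "mw:Transclusion" in typeof.split() and tag in self.skip_tags:
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def extract_plaintext(html, skip_tags=BLOCK_TAGS):
    parser = PlaintextExtractor(skip_tags)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# Simplified version of the Cabbage example (data-mw attribute omitted).
html = ('A cabbage generally weighs between <span about="#mwt15" '
        'typeof="mw:Transclusion" id="mwHw">500 to 1,000 grams (1 to 2 lb)</span>.')
print(extract_plaintext(html))                      # inline transclusion kept
print(extract_plaintext(html, skip_tags={"span"}))  # inline transclusion dropped
```

With the default `skip_tags`, the inline `{{convert}}` output stays in the sentence, while a transclusion rooted at a table (such as an infobox) would still be excluded.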
