jameslittle230 / stork

🔎 Impossibly fast web search, made for static sites.
https://stork-search.net
Apache License 2.0

Ruby text isn't handled properly #341

Open ElnuDev opened 1 year ago

ElnuDev commented 1 year ago

When indexing posts that use the <ruby> tag, for example かん <ruby>漢<rp>(</rp><rt>かん</rt><rp>)</rp></ruby><ruby>字<rp>(</rp><rt>じ</rt><rp>)</rp></ruby>, the ruby text isn't handled properly. All of the content of <rt> and <rp> should be ignored, as it isn't part of the actual text content, just annotation. Currently, Stork just expands the <rp> parentheses, so the example above becomes 漢(かん)字(じ). Since a word can be broken up across multiple <ruby> tags, this causes かん to not come up in search results for 漢字.
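
To illustrate the expected behavior, here is a minimal Python sketch of text extraction that drops everything inside <rt> and <rp>. It is not Stork's code (Stork is written in Rust), and the names `RubyStrippingExtractor` and `extract_base_text` are made up for this example. It only uses `html.parser` from the standard library and assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser


class RubyStrippingExtractor(HTMLParser):
    """Collects text content while skipping ruby annotations (<rt>, <rp>)."""

    SKIPPED_TAGS = {"rt", "rp"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside an <rt> or <rp> element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "ruby":
            # HTML allows omitting </rt> and </rp>, so closing <ruby>
            # also ends any annotation that is still open.
            self._skip_depth = 0

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def extract_base_text(html):
    """Return the text of `html` without ruby annotations (hypothetical helper)."""
    parser = RubyStrippingExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


example = (
    "<ruby>漢<rp>(</rp><rt>かん</rt><rp>)</rp></ruby>"
    "<ruby>字<rp>(</rp><rt>じ</rt><rp>)</rp></ruby>"
)
print(extract_base_text(example))
```

With this approach the two adjacent <ruby> elements yield the contiguous string 漢字 rather than 漢(かん)字(じ), so a query for 漢字 can match the indexed text.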