Open Moonbase59 opened 2 years ago
I have not come across any case where subscript appeared, and adding things without a need and a test case did not seem like a good idea. If you have an example, I can look into it.
Writing and testing a patch yourself is also an option. The approach should probably use a single regexp for both cases, to avoid going through the data twice.
In the German Wiktionary, https://de.wiktionary.org/wiki/H%E2%82%82O would be one example. For content, the entry https://de.wiktionary.org/wiki/Alkohol might also be useful (it contains C₂H₅OH).
In https://github.com/rdoeffinger/DictionaryPC/blob/509e1fa70a1c9f03a329fcc6df982eb7c341b5ea/src/com/hughes/android/dictionary/parser/wiktionary/AbstractWiktionaryParser.java#L67-L93 you replace superscript Arabic numerals with their Unicode equivalents. Why not subscript numerals, too? (Unicode range U+2080..U+2089)
This might help with line spacing issues (unless the replacement was intended for footnotes/endnotes only).
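To make the single-regexp suggestion concrete, here is a rough sketch of how both cases could be handled in one pass. It is not the code from AbstractWiktionaryParser.java; the class name `ScriptDigits`, the method `replaceScriptDigits`, and the assumption that the input contains plain `<sup>…</sup>` and `<sub>…</sub>` tags wrapping ASCII digits are all made up for illustration and would need adapting to the real parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of DictionaryPC.
final class ScriptDigits {
    // One pattern for both tags. The backreference \1 requires the closing
    // tag to match the opening one. Only ASCII digits are accepted.
    private static final Pattern SUP_SUB =
            Pattern.compile("<(sup|sub)>([0-9]+)</\\1>");

    // Superscript digits 0-9 are not contiguous in Unicode (1, 2 and 3 live
    // in Latin-1), so they are listed explicitly.
    private static final String SUPERSCRIPT =
            "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079";

    // Subscript digits 0-9 are contiguous, starting at U+2080.
    private static final char SUBSCRIPT_ZERO = '\u2080';

    static String replaceScriptDigits(String text) {
        Matcher m = SUP_SUB.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the unmatched text before this tag unchanged.
            out.append(text, last, m.start());
            boolean sup = m.group(1).equals("sup");
            String digits = m.group(2);
            for (int i = 0; i < digits.length(); i++) {
                int d = digits.charAt(i) - '0';
                out.append(sup ? SUPERSCRIPT.charAt(d)
                               : (char) (SUBSCRIPT_ZERO + d));
            }
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }
}
```

With this sketch, `ScriptDigits.replaceScriptDigits("C<sub>2</sub>H<sub>5</sub>OH")` would yield C₂H₅OH, and tags containing anything other than digits are left untouched, so footnote-style markup with other content is not affected.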