phfaist / pylatexenc

Simple LaTeX parser providing latex-to-unicode and unicode-to-latex conversion
https://pylatexenc.readthedocs.io
MIT License

convert ₀:$_0$ ... ₉:$_9$ #72

Open WolfgangFahl opened 2 years ago

WolfgangFahl commented 2 years ago
No known latex representation for character: U+2080 - ‘₀’
₀→₀
No known latex representation for character: U+2081 - ‘₁’
₁→₁
No known latex representation for character: U+2082 - ‘₂’
₂→₂
No known latex representation for character: U+2083 - ‘₃’
₃→₃
No known latex representation for character: U+2084 - ‘₄’
₄→₄
No known latex representation for character: U+2085 - ‘₅’
₅→₅
No known latex representation for character: U+2086 - ‘₆’
₆→₆
No known latex representation for character: U+2087 - ‘₇’
₇→₇
No known latex representation for character: U+2088 - ‘₈’
₈→₈
No known latex representation for character: U+2089 - ‘₉’
# code points 8320..8329 are U+2080..U+2089, the subscript digits ₀..₉
for code in range(8320,8330):
    s=s.replace(chr(code),f"$_{code-8320}$")
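The same substitution can be done in one pass with `str.translate`. This is a minimal standard-library sketch; the names `SUBSCRIPT_TO_LATEX` and `replace_subscript_digits` are made up for illustration and are not part of pylatexenc. It only performs the string substitution and does not call the library.

```python
# Code points 8320..8329 are U+2080..U+2089, the subscript digits ₀..₉.
# Keys are code points (ints), values are the replacement strings.
SUBSCRIPT_TO_LATEX = {0x2080 + n: f"$_{n}$" for n in range(10)}


def replace_subscript_digits(s: str) -> str:
    """Replace each subscript digit with inline math, e.g. ₂ becomes $_2$."""
    return s.translate(SUBSCRIPT_TO_LATEX)


if __name__ == "__main__":
    print(replace_subscript_digits("H₂O"))  # prints H$_2$O
```

A translation table avoids ten separate `replace` calls and keeps the mapping in one place if more characters need to be added later.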
WolfgangFahl commented 2 years ago

workaround

    # requires: from pylatexenc.latexencode import unicode_to_latex
    @staticmethod
    def uniCode2Latex(text:str)->str:
        '''
        converts unicode text to latex and
        replaces the subscript digits that pylatexenc does not know:
            ₀:$_0$ ... ₉:$_9$

        see https://github.com/phfaist/pylatexenc/issues/72

        Args:
            text(str): the string to fix

        Returns:
            str: the LaTeX representation of the text
        '''
        # U+2080..U+2089 (8320..8329) are the subscript digits ₀..₉
        for code in range(8320,8330):
            text=text.replace(chr(code),f"$_{code-8320}$")
        return unicode_to_latex(text)

unit test

    def testUnicode2LatexWorkaround(self):
        '''
        test the uniCode2Latex conversion workaround
        '''
        debug=True
        for code in range(8320,8330):
            uc=chr(code)
            latex=QueryResultDocumentation.uniCode2Latex(uc)
            if debug:
                print(f"{uc}→{latex}")
            #self.assertTrue(latex.startswith("$_"))
        unicode="À votre santé!"
        latex=QueryResultDocumentation.uniCode2Latex(unicode)
        if debug:
            print(f"{unicode}→{latex}")
        self.assertEqual("\\`A votre sant\\'e!",latex)
WolfgangFahl commented 2 years ago

result:

Starting test testUnicode2LatexWorkaround, debug=False ...
₀→\$\_0\$
₁→\$\_1\$
₂→\$\_2\$
₃→\$\_3\$
₄→\$\_4\$
₅→\$\_5\$
₆→\$\_6\$
₇→\$\_7\$
₈→\$\_8\$
₉→\$\_9\$
À votre santé!→\`A votre sant\'e!
test testUnicode2LatexWorkaround, debug=False took   0.0 s
----------------------------------------------------------------------
Ran 1 test in 0.002s
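
Note that in the output above the inserted `$` and `_` come back escaped (`\$\_0\$`), because the substitution runs before the text is passed to the converter. One possible way around this is to convert only the parts of the string that contain no subscript digits and to emit the math for the digits untouched. The following is a sketch under that assumption, not the library's own mechanism: `convert_with_subscripts` and its `convert` parameter are hypothetical names, and runs of digits are grouped as `$_{12}$` rather than `$_1$$_2$`.

```python
import re
from typing import Callable

# U+2080..U+2089 are the subscript digits ₀..₉; the capturing group makes
# re.split keep the matched runs in the result list.
SUBSCRIPT_RUN = re.compile("([\u2080-\u2089]+)")


def convert_with_subscripts(text: str, convert: Callable[[str], str]) -> str:
    """Convert text with `convert`, emitting subscript digits as raw math.

    `convert` is any unicode-to-LaTeX function (hypothetical parameter).
    It is applied only to the parts without subscript digits, so it never
    sees the inserted `$`, `_` and braces and cannot escape them.
    """
    out = []
    for i, part in enumerate(SUBSCRIPT_RUN.split(text)):
        if i % 2 == 1:
            # odd indices hold the captured runs of subscript digits
            digits = "".join(str(ord(ch) - 0x2080) for ch in part)
            out.append("$_{" + digits + "}$")
        elif part:
            # even indices hold ordinary text; skip empty fragments
            out.append(convert(part))
    return "".join(out)


if __name__ == "__main__":
    # Stand-in converter that only escapes $ and _, mimicking the escaping
    # seen in the test output; it is not the pylatexenc function.
    def demo(s: str) -> str:
        return s.replace("$", r"\$").replace("_", r"\_")

    print(convert_with_subscripts("H₂O costs $5", demo))  # prints H$_{2}$O costs \$5
```

With pylatexenc installed, the intended usage would be to pass `unicode_to_latex` from `pylatexenc.latexencode` as `convert`, so that accents such as those in "À votre santé!" are still handled by the library.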