Closed RossWilliamson closed 3 years ago
Hi, thanks for the issue. You can already achieve the desired result by specifying custom text conversions (see https://pylatexenc.readthedocs.io/en/latest/latex2text/):
from pylatexenc import latex2text

l2t_context = latex2text.get_default_latex_context_db()
l2t_context.add_context_category('preserve-custom-macros', prepend=True, macros=[
    latex2text.MacroTextSpec('ref', simplify_repl=r'\ref{%(1)s}')
],)
l2t = latex2text.LatexNodes2Text(latex_context=l2t_context)

latex = r'\emph{For the definition of $\alpha$, see also:} \ref{eq:a} \& \ref{eq:b}'
converted = l2t.latex_to_text(latex)
print(converted)
# outputs → For the definition of α, see also: \ref{eq:a} & \ref{eq:b}
I'm closing this issue, feel free to reopen if I'm missing anything.
Thanks! I was wondering how to do the same with latexencode rather than latex2text. I have a string that contains a deliberate "\ref" which I need to preserve. I tried the following:
import re
from pylatexenc import latexencode

cr = [
    latexencode.UnicodeToLatexConversionRule(latexencode.RULE_REGEX, [
        # matches only the macro name, not its braced argument
        (re.compile(r'\\ref'), r'\\ref'),
    ], replacement_latex_protection='none'),
    'defaults'
]
u_to_l = latexencode.UnicodeToLatexEncoder(conversion_rules=cr)
u_to_l.unicode_to_latex(r'\ref{sec:pp:qq}')
but it returns \ref\{sec:pp:qq\}, i.e. it escapes the curly brackets, which is not wanted.
Try:
import re
from pylatexenc import latexencode

cr = [
    latexencode.UnicodeToLatexConversionRule(latexencode.RULE_REGEX, [
        (re.compile(r'\\ref\{([^\}]+)\}'), r'\\ref{\1}'),
    ], replacement_latex_protection='none'),
    'defaults'
]
u_to_l = latexencode.UnicodeToLatexEncoder(conversion_rules=cr)
print( u_to_l.unicode_to_latex(r'See \ref{sec:pp:qq} for α=β') )
# prints: See \ref{sec:pp:qq} for \ensuremath{\alpha}=\ensuremath{\beta}
Also, using this regular expression rule, no escaping will happen within the argument of the \ref macro.
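To see why the second pattern behaves differently from the first attempt, here is a small sketch using only Python's `re` module (no pylatexenc involved). It assumes the replacement string is applied as an ordinary regex template, as with `Match.expand()`; the variable names are just for illustration.

```python
import re

# Pattern from the first attempt: matches only the macro name.
name_only = re.compile(r'\\ref')
# Pattern from the suggested rule: matches the macro plus its braced argument.
with_arg = re.compile(r'\\ref\{([^\}]+)\}')

text = r'See \ref{sec:pp:qq} for details'

m1 = name_only.search(text)
m2 = with_arg.search(text)

# With the first pattern the match stops right after "\ref", so the braces
# are left over for whatever rule runs next (here: the default escaping rules).
rest_after_name = text[m1.end():]
print(m1.group(0))       # \ref
print(rest_after_name)   # {sec:pp:qq} for details

# With the second pattern the whole "\ref{...}" is consumed by one match,
# and the template r'\\ref{\1}' reproduces it unchanged.
expanded = m2.expand(r'\\ref{\1}')
print(m2.group(0))       # \ref{sec:pp:qq}
print(expanded)          # \ref{sec:pp:qq}
```

Since the braces and the label are part of the match itself, they are never handed to the later rules, which is consistent with the remark above about no escaping happening inside the argument.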
It would be good to have an exception list when doing the conversion. For example, I would like to keep \ref as \ref so that I can put label markers in prior to running LaTeX. Right now that gets printed just as \ref.