I have modified the `base_pattern` regex to include `(?:\[(?:.*?)\])*`, a non-greedy match for zero or more bracketed groups, which serve as optional arguments in LaTeX.
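As a rough illustration of what the added fragment accepts, here is a minimal sketch using only the standard `re` module. The full `base_pattern` is not shown in this description, so the surrounding pattern and the helper name `find_command` are assumptions made for the example, not the project's code.

```python
import re

# The fragment quoted above: zero or more [...] groups, each matched lazily.
OPTIONAL_ARGS = r"(?:\[(?:.*?)\])*"


def find_command(name, text):
    """Return the first `\\name[...]{...}` occurrence in text, or None.

    Hypothetical helper: it only handles a flat (non-nested) {...} argument.
    """
    pattern = r"\\" + re.escape(name) + OPTIONAL_ARGS + r"\{[^{}]*\}"
    match = re.search(pattern, text)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(find_command("textbf", r"\textbf{bold}"))
    print(find_command("includegraphics", r"See \includegraphics[width=5cm]{fig.png}."))
    print(find_command("cmd", r"\cmd[a][b]{c}"))
```

One thing to keep in mind: `.*?` is lazy but can still backtrack across a `]`, so an input like `\cmd[a] text [b]{c}` is matched as a whole. If that ever matters, `[^\]]*` inside the brackets would be a stricter choice.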
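I have also added:

    import regex  # third-party module; the stdlib re has no (?R) recursion

    def extract_text_inside_curly_braces(text):
        """Extract text inside of {} from command string"""
        # (?R) recurses into the whole pattern, so nested braces stay balanced
        pattern = r"\{((?:[^{}]|(?R))*)\}"
        match = regex.search(pattern, text)
        if match:
            return match.group(1)
        else:
            return ''
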
This function extracts the text from nested or non-nested commands when `keep_text` is set to true.
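For anyone who wants to sanity-check the expected results without installing `regex`, here is a small stdlib-only sketch that walks the string with a depth counter. It is not the implementation in this PR, and the name `extract_braced_text` is made up for the example. It should agree with the recursive pattern on balanced input, but on unbalanced input it simply returns an empty string, which may differ from what the regex search does.

```python
def extract_braced_text(text):
    """Return the contents of the first balanced {...} group, or ''.

    Hypothetical cross-check for the recursive-regex version; it tracks
    brace depth instead of using recursion in the pattern.
    """
    start = text.find("{")
    if start == -1:
        return ""
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: return what lies between.
                return text[start + 1:i]
    return ""  # opening brace was never closed


if __name__ == "__main__":
    print(extract_braced_text(r"\textbf{bold}"))
    print(extract_braced_text(r"\textbf{a \emph{b} c}"))
```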
Tests to ensure proper functionality have also been added!
Thank you very much @dylduhamel!
When I saw the pattern r'(?:\[(?:.*?)\])*\{((?:[^{}]+|\{(?1)\})*)\}(?:\[(?:.*?)\])*' I couldn't resist thinking: Isn't regex a lovely intuitive language? 😝
This change resolves #48.