newcommands parsing is broken

newcommands parsing is broken: any newcommand replacement that is applied to text containing brackets will stop at the first closing bracket, e.g.

\newcommand{\normg}[1]{\lvert\lvert {#1} \rvert\rvert}
\normg{x = \sum_{i}^{P} f(d_i)}

will result in

\normg{\lvert\lvert x = {\sum_{i} \rvert\rvert f(d_i)}^{P}
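
For comparison, substituting the whole argument for #1 by hand gives what I would expect the expansion to be:

```latex
\lvert\lvert {x = \sum_{i}^{P} f(d_i)} \rvert\rvert
```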

I believe the failure is due to the limitations of the regex, but I'm not very competent with regex. I believe what is required is to somehow match the brackets. Further, for nested newcommands it is unclear how many levels should be evaluated. In html, latex, and ipynb (as of hplgit/doconce/pull/136) it is fine to leave newcommands in place, so the question is how to handle this for other formats.
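
To illustrate the kind of limitation I mean, here is a minimal sketch. The patterns are hypothetical and not taken from doconce; they only show that neither a lazy nor a greedy "up to a closing bracket" regex can pair brackets correctly.

```python
import re

text = r'\normg{x = \sum_{i}^{P} f(d_i)}'

# Hypothetical lazy pattern: the argument is "everything up to the next }".
lazy = re.compile(r'\\normg\{(.*?)\}')
m = lazy.match(text)
captured = m.group(1)      # stops at the } that closes {i
rest = text[m.end():]      # the remainder of the real argument is left behind
print(captured)
print(rest)

# Hypothetical greedy pattern: the argument is "everything up to the last }".
greedy = re.compile(r'\\normg\{(.*)\}')
greedy_captured = greedy.match(r'\normg{a} + \normg{b}').group(1)
print(greedy_captured)     # swallows both commands in one capture
```

The lazy version cuts the argument short as soon as it contains a bracket, and the greedy version merges two separate commands on the same line, so some form of bracket counting seems unavoidable.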

I had started on a quick fix for non-nested newcommands with only 1 argument, which is

def recursive_bracket_parser(s, i):
    """ Inspired by <https://stackoverflow.com/a/14952529/4000607>"""
    while i < len(s):
        # braces preceded by a single backslash (\{ or \}) are escaped and skipped
        if s[i] == '{' and (i < 1 or s[i-1] != '\\'):
            i = recursive_bracket_parser(s, i+1)
        elif s[i] == '}' and (i < 1 or s[i-1] != '\\'):
            return i+1
        else:
            # process whatever is at s[i]
            i += 1
    return i

and an example mirror substitute in expand_newcommands.py

import re

text = r'\normg{x = \sum_{i}^{P} f(d_i)}'  # example input from above

# (regex pattern, replacement template, number of arguments)
newcommands_test = [
    (r'\\normg', r'\lvert\lvert {NEWCOMMANDARG} \rvert\rvert', 1),
    (r'\\normf', r'\normg{NEWCOMMANDARG}_{NEWCOMMANDARG}', 2),
]

for pattern, replacement, nargs in newcommands_test:
    # 1 Split on the command name; pieces[0] is the text before the first
    #   match (possibly empty), so it is copied through unchanged
    pieces = re.split(pattern, text)
    result = pieces[0]
    # 2 process the text following each match
    for match in pieces[1:]:
        args = []
        for idx in range(nargs):
            # assumes the argument starts right at match[0] == '{'
            end_arg = recursive_bracket_parser(match, 1)
            args.append(match[0:end_arg])
            match = match[end_arg:]

        tmp = replacement
        for arg in args:
            # str.replace keeps backslashes in arg (e.g. \sum) literal,
            # whereas re.subn would try to interpret them as escapes
            tmp = tmp.replace('{NEWCOMMANDARG}', arg, 1)
        result += tmp + match
    print(result)

I can continue with this line of work, but I think just including the newcommands in ipython, latex, and html is fine for my uses of doconce. Further I'm not sure if this is the right path to go down, but perhaps it could help with issues others have.
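
In case it helps anyone who picks this up, below is a rough sketch of where this could go. It is not doconce code: `match_brace`, `expand_once`, `expand_all`, the `commands` dictionary, and the `max_passes` limit are all names and choices I made up for illustration. It counts bracket depth instead of recursing, supports several arguments via #1, #2, ..., and handles nested newcommands by repeating the expansion until nothing changes or an arbitrary pass limit is reached.

```python
import re


def match_brace(s, start):
    """Return the index just past the '}' that closes the '{' at s[start]."""
    depth = 0
    i = start
    while i < len(s):
        c = s[i]
        if c == '\\':
            i += 2  # skip the backslash and the character it escapes
            continue
        if c == '{':
            depth += 1
        elif c == '}':
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError('unbalanced brackets')


def expand_once(text, name, template, nargs):
    """Expand each use of the command `name` in text, one level deep."""
    # The lookahead keeps e.g. normg from matching inside a longer name.
    pattern = re.compile(re.escape('\\' + name) + r'(?![A-Za-z])')
    out = []
    pos = 0
    while True:
        m = pattern.search(text, pos)
        if m is None:
            out.append(text[pos:])
            return ''.join(out)
        out.append(text[pos:m.start()])
        i = m.end()
        args = []
        while len(args) < nargs and i < len(text) and text[i] == '{':
            end = match_brace(text, i)
            args.append(text[i + 1:end - 1])  # argument without outer brackets
            i = end
        if len(args) < nargs:
            # Too few arguments: leave the command untouched and move on.
            out.append(m.group(0))
            pos = m.end()
            continue
        # A function as repl means backslashes in the arguments stay literal.
        out.append(re.sub(r'#(\d)', lambda mo: args[int(mo.group(1)) - 1],
                          template))
        pos = i


def expand_all(text, commands, max_passes=10):
    """Repeat the expansion until the text stops changing (nested commands)."""
    for _ in range(max_passes):
        new = text
        for name, (template, nargs) in commands.items():
            new = expand_once(new, name, template, nargs)
        if new == text:
            break
        text = new
    return text


commands = {
    'normg': (r'\lvert\lvert {#1} \rvert\rvert', 1),
    'normf': (r'\normg{#1}_{#2}', 2),
}
print(expand_all(r'\normg{x = \sum_{i}^{P} f(d_i)}', commands))
print(expand_all(r'\normf{a}{b}', commands))
```

The pass limit is only a guard against a newcommand that expands to itself; how many levels should really be evaluated is still the open question from above.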

hplgit / doconce

newcommands parsing is broken #137