dcwatson / bbcode

A pure python bbcode parser and formatter.
BSD 2-Clause "Simplified" License

Formatters should only transform their contents and not the contents of their children #23

Closed by zaygraveyard 8 years ago

zaygraveyard commented 8 years ago

How to reproduce

import bbcode

def render(bbcode_text):
    parser = bbcode.Parser()
    parser.add_simple_formatter('left', '<div class="bb-left">%(value)s</div>')
    parser.add_simple_formatter(
        'code',
        '<code>%(value)s</code>',
        same_tag_closes=True,
        render_embedded=False,
        transform_newlines=False,
        escape_html=False,
        replace_links=False,
        replace_cosmetic=False,
        strip=True,
        swallow_trailing_newline=True
    )
    return parser.format(bbcode_text)

print(render('[left]a\nb[code]c\nd[/code]\ne\nf[/left]'))

Expected output

<div class="bb-left">a<br>b<code>c\nd</code><br>e<br>f</div>

Actual output

<div class="bb-left">a<br>b<code>c<br>d</code><br>e<br>f</div>

Note the <br> inside the <code>c<br>d</code> instead of the \n.

Version

BBCode 1.0.22, Python 2.7.12

dcwatson commented 8 years ago

This is definitely an issue, but is going to take some significant refactoring to fix. The problem is that the tags are basically rendered inside-out, so the outside tag (marked with transform_newlines) has no knowledge of the inner tags, just the final rendered markup. I have some ideas for how to fix it (by creating a tree of render blocks), but it may be some time before I get around to doing it.
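
The inside-out behavior described above can be illustrated with a small standalone sketch. It does not use the bbcode library: the Text and Tag classes and both render functions are hypothetical names invented for this illustration, and render_tree is only one possible reading of the "tree of render blocks" idea, not the library's implementation.

```python
class Text(object):
    """A run of plain text between tags (hypothetical helper)."""
    def __init__(self, value):
        self.value = value


class Tag(object):
    """A tag with a format string, child nodes, and a newline option (hypothetical helper)."""
    def __init__(self, template, children, transform_newlines=True):
        self.template = template
        self.children = children
        self.transform_newlines = transform_newlines


def render_inside_out(node):
    # Children are rendered first; the parent then transforms the final
    # markup as one string, so it cannot tell its own text from child output.
    if isinstance(node, Text):
        return node.value
    inner = "".join(render_inside_out(child) for child in node.children)
    if node.transform_newlines:
        inner = inner.replace("\n", "<br>")
    return node.template % {"value": inner}


def render_tree(node, transform_newlines=False):
    # Each text leaf is transformed using the option of its nearest
    # enclosing tag only; already-rendered children are left untouched.
    if isinstance(node, Text):
        if transform_newlines:
            return node.value.replace("\n", "<br>")
        return node.value
    inner = "".join(render_tree(child, node.transform_newlines)
                    for child in node.children)
    return node.template % {"value": inner}


# Tree for: [left]a\nb[code]c\nd[/code]\ne\nf[/left]
doc = Tag('<div class="bb-left">%(value)s</div>', [
    Text("a\nb"),
    Tag("<code>%(value)s</code>", [Text("c\nd")], transform_newlines=False),
    Text("\ne\nf"),
])

print(repr(render_inside_out(doc)))  # newline inside <code> becomes <br>
print(repr(render_tree(doc)))        # newline inside <code> is preserved
```

With this input, render_inside_out reproduces the reported output, while render_tree keeps the newline inside the code block because the outer tag never sees the inner tag's text.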

zaygraveyard commented 8 years ago

Thank you for the quick response. Let me know if I can help out.

After looking at the code, I see that this problem can be fixed (at least for now) by moving these 4 lines inside the else branch. Am I correct?

dcwatson commented 8 years ago

Unfortunately, it's not that simple. In that scenario, newlines wouldn't be converted for tags that render embedded tags. So text inside a tag, but outside any inner tags, would not have newlines converted.

dcwatson commented 8 years ago

It's a bit clever for my taste, but this should do the trick for now. It relies on normalizing all newlines to \n, then replacing any that should be transformed to <br /> with \r, so that string stripping still works as expected. Then any \r are replaced with <br /> at the very end.

Let me know if this works for you, and if so I'll push out a new release.
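
For readers following along, here is a minimal sketch of the placeholder technique described in this comment. It is an illustration under assumptions, not the actual patch: the function names normalize, mark, and finalize and the two constants are hypothetical and are not part of the bbcode API.

```python
NEWLINE_MARKUP = "<br />"
PLACEHOLDER = "\r"  # free to use once all input newlines are normalized to "\n"


def normalize(text):
    # Step 1: collapse every newline style to "\n" so "\r" no longer occurs.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def mark(text, transform_newlines=True):
    # Step 2: newlines that should become markup are swapped for "\r".
    # "\r" still counts as whitespace, so str.strip() keeps working.
    if transform_newlines:
        return text.replace("\n", PLACEHOLDER)
    return text


def finalize(rendered):
    # Step 3: at the very end, every remaining placeholder becomes markup.
    return rendered.replace(PLACEHOLDER, NEWLINE_MARKUP)


# Simulates [left]a\nb[code]c\nd[/code]\ne\nf[/left] with Windows line endings.
before, code_body, after = normalize("a\r\nb"), normalize("c\r\nd"), normalize("\r\ne\r\nf")

# The code tag has transform_newlines=False and strip=True.
code_html = "<code>%s</code>" % mark(code_body, transform_newlines=False).strip()

# The outer tag marks only its own text; the inner markup passes through as-is.
left_html = '<div class="bb-left">%s%s%s</div>' % (mark(before), code_html, mark(after))

result = finalize(left_html)
print(repr(result))

# Stripping still removes marked newlines, which literal "<br />" would not allow.
stripped = mark("\nquoted\n").strip()
print(repr(stripped))
```

Because the outer tag only ever inserts "\r" into its own text, the "\n" inside the code block survives the final replacement untouched.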

zaygraveyard commented 8 years ago

Thank you, I'll try it out and let you know.

zaygraveyard commented 8 years ago

It works great, thank you!

dcwatson commented 8 years ago

Posted a new version to https://pypi.python.org/pypi/bbcode

zaygraveyard commented 8 years ago

Thank you