Psycojoker / ipython-beautifulsoup

Pretty HTML/XML rendering with syntax highlighting for BeautifulSoup objects in IPython notebook and qtconsole.

SEC: XSS: JS/CSS Injection #2

Open westurner opened 10 years ago

westurner commented 10 years ago

CSS styles from displayed pages can override the IPython notebook interface styles, e.g. with a page like http://downforeveryoneorjustme.com.

Short of using IFrames, I'm not sure whether it's possible to avoid this without something like:

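from bs4 import BeautifulSoup

def cleaned_beautifulsoup_copy(soup):
    # Re-parse the serialized soup so the original object is left untouched.
    copy = BeautifulSoup(str(soup), 'html.parser')
    # Drop every <script> element.
    for node in copy('script'):
        node.extract()
    # <here
    # Drop every <style> element as well.
    for node in copy('style'):
        node.extract()
    # /here>
    return copy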

EDIT: Also JS.

westurner commented 10 years ago

[Also] a shortcut for rendering just the prettified HTML version (and/or a conditional contextual setting for defaulting to said behavior) would be outstanding. What an excellent teaching tool.
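As a rough illustration of what such a source-only shortcut might look like (a sketch, not part of the extension): the helper name `render_source_only` is hypothetical, and it assumes the prettified markup is already available as a string. Because everything is escaped, the browser displays the markup as text and never interprets its tags, styles, or scripts.

```python
import html


def render_source_only(markup):
    """Return HTML that shows `markup` as text instead of rendering it.

    Hypothetical helper for illustration; not part of ipython-beautifulsoup.
    """
    # html.escape converts <, >, & and quotes to entities, so nothing in
    # `markup` can be parsed as a tag by the notebook page.
    return "<pre>" + html.escape(markup) + "</pre>"


print(render_source_only("<b>x</b>"))
```

Syntax highlighting would have to be layered on top of this; the sketch only covers the escaping step.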

Psycojoker commented 10 years ago

I'm not closing this one after the pull request since the problem is still here. Iframe looks like a good idea, maybe I'll explore it.

westurner commented 10 years ago

Good call. On further review, it looks like IFrames may introduce additional (though possibly lesser) concerns.

For reference, this is a Cross-Site Scripting (XSS) concern:

CWE-79: Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')

As the warning in configure_ipython_beautifulsoup explains, removing ('extracting') explicit <script> and <style> tags doesn't address:

Approaches

Whitelisting

The HTML tags and attributes could be processed through a whitelist with something like https://pypi.python.org/pypi/bleach, but:

IFrames

IFrames may be the 'safest' bet, though IFrames do introduce additional domain concerns.
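One way the IFrame approach could be sketched, using only the standard library: the untrusted markup is escaped and placed in the `srcdoc` attribute of an `<iframe>` whose `sandbox` attribute is left empty, which applies all sandbox restrictions (scripts do not run, and the content gets a unique origin). Styles inside the frame do not reach the surrounding notebook page. The function name `sandboxed_iframe` and the `height` parameter are assumptions for illustration, not something the extension provides.

```python
import html


def sandboxed_iframe(markup, height=300):
    """Wrap untrusted `markup` in a sandboxed <iframe> (illustrative sketch).

    Hypothetical helper; not part of ipython-beautifulsoup.
    """
    # Escape quotes too, so the markup cannot break out of the srcdoc attribute.
    escaped = html.escape(markup, quote=True)
    # An empty `sandbox` attribute enables every restriction.
    return (
        '<iframe sandbox srcdoc="{}" width="100%" height="{}"></iframe>'
        .format(escaped, int(height))
    )


untrusted = '<style>body{color:red}</style><p onclick="x()">hi</p>'
print(sandboxed_iframe(untrusted))
```

In a notebook, the returned string could be passed to `IPython.display.HTML` for display. This does not resolve the domain concerns mentioned above; it only keeps the page's CSS and JS from applying to the notebook interface itself.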