JavaScript Injection using event attributes

Qendolin commented 2 years ago

Script injection is still very easy to achieve. As mentioned in https://github.com/JiLiZART/BBob/issues/66#issuecomment-653926541 you can inject JavaScript via on* attributes. [aaa onclick=alert('Hacked')]Click Me[/aaa]

I don't think the solution is to blacklist certain attributes. IMO it shouldn't even be possible to set any attributes unless explicitly enabled in a preset. Even a style attribute can be used maliciously.

Furthermore it is not save to allow the creation of an arbitrary html tag. For example some websites might use custom elements which themselves might be exploitable.

I'ld strongly prefer it this library was safe by default.

JiLiZART commented 2 years ago

This library has a layered structure. Raw @bbob/parser can be used to transform bbcodes to AST, or @bbob/core for plugin support. I don't include whole html5 attributes validation and tags spec cause the main goal of this library is to be tiny and raw so you can use it to create your own solutions.

Anyway library was safe by default is a good point. I will try to investigate how I can achieve this without bloating this library.

Qendolin commented 2 years ago

I think the biggest problem is creating HTML as a string. Using the DOM is a lot safer as it prevents any kind of escape attempt.

For my site I've implemented a custom renderNode like this:

const renderNode = (node, { stripTags = false }) => {
    if (!node) return document.createTextNode('');
    const type = typeof node;

    if (type === 'string' || type === 'number') {
        return document.createTextNode(node);
    }

    if (Array.isArray(node)) {
        return renderNodes(node, { stripTags });
    }

    if (type === 'object') {
        if (stripTags === true) {
            return renderNodes(node.content, { stripTags });
        }

        const element = document.createElement(node.tag);
        for (const name in node.attrs) {
            try {
                element.setAttribute(name, node.attrs[name]);
            } catch (ignored) {
                // Ignore invalid attribute names
            }
        }

        if (node.content === null) {
            return element;
        }

        element.append(...renderNodes(node.content));
        return element;
    }

    return document.createTextNode('');
};

const renderNodes = (nodes, { stripTags = false }) => {
    return nodes.flatMap((node) => renderNode(node, { stripTags }));
};

const toHTML = (source, plugins, options) =>
    core(plugins).process(source, {
        ...options,
        render: (nodes, opts) => {
            // Document fragment is just a container element, so that the function only returns a single element
            const frag = document.createDocumentFragment();
            frag.append(...renderNodes(nodes, opts));
            return frag;
        }
    }).html;

To make attributes and tag names whitlist-based I've also created my own processor

/**
 * @param name the property name
 * @param value the property value, will be escaped
 * @returns The css style definition
 */
const renderStyle = (name, value) => {
    value = value.replaceAll(`"`, ``).replaceAll(`'`, '').replaceAll(`;`, '');
    return `${name}: ${value};`;
};

// Contains allowed styles
const styleMap = {
    color: (val) => renderStyle('color', val)
}

/**
 * @param attrs 
 * @returns The constructed style attribute
 */
const getStyleFromAttrs = (attrs) => {
    let style = '';
    for (const name in styleMap) {
        if (name in attrs) {
            style += styleMap[name](attrs[name]);
        }
    }
    return { style };
};

/**
 * @param attrs 
 * @returns A style with the color of the unique attribute
 */
const getColorStyle = (attrs) => {
    return getStyleFromAttrs({
        color: getUniqAttr(attrs) ?? 'inherit'
    });
};

/**
 * @param node 
 * @param render the render function
 * @returns the url from the unique attribute or the node content
 */
const getUrlHref = (node, render) => {
    let url = getUniqAttr(node.attrs);
    url ??= render(node.content, { stripTags: true })?.textContent ?? '';
    return url;
};

/**
 * @param node 
 * @param render the render function
 * @returns the url from the node cpntent
 */
const getImgSrc = (node, render) => {
    if (typeof node.content == 'string') return node.content;
    return render(node.content, { stripTags: true })?.textContent ?? '';
};

/**
 * Creates a trusted node.
 * Tag must be a trusted tag
 * Attrs must only contain trusted attributes
 */
const toNode = (tag, attrs, content) => ({
    tag,
    attrs,
    content,
    trusted: true
});

export const definitions = {
    b: (node) => toNode('b', {}, node.content),
    i: (node) => toNode('i', {}, node.content),
    u: (node) => toNode('span', { style: `text-decoration: underline;` }, node.content),
    s: (node) => toNode('s', {}, node.content),
    br: (node) => toNode('br', {}, node.content),
    url: (node, { render }) =>
        toNode(
            'a',
            {
                href: getUrlHref(node, render),
                // Also an important privacy improvement
                rel: 'noreferrer noopener nofollow',
                target: '_blank'
            },
            node.content
        ),
    img: (node, { render }) =>
        toNode(
            'img',
            {
                src: getImgSrc(node, render),
                // For privacy
                referrerpolicy: 'no-referrer',
                // For cross origin images
                crossorigin: 'anonymous',
                width: node.attrs.width,
                height: node.attrs.height
            },
            []
        ),
    quote: (node) => toNode('blockquote', {}, node.content),
    style: (node) => toNode('span', getStyleFromAttrs(node.attrs), node.content),
    color: (node) => toNode('span', getColorStyle(node.attrs), node.content),
    // ... and more (I left some out for brevity)
};

function process(tags, tree, core, options) {
    tree.walk((node) => {
        if (isTagNode(node)) {
            // Create a trusted node from whitelisted tags
            if (tags[node.tag]) {
                return tags[node.tag](node, core, options);
            }
            if (node.trusted === true) {
                // trusted nodes can be any tag or contain attributes that are not whitelisted
                return node;
            }
            // Convert all non-whitelisted tags to spans and remove all attributes
            return {
                tag: 'span',
                attrs: {},
                content: node.content,
                trusted: true
            };
        }
        return node;
    });
}

The most important differences are:

It converts any unknown tag to a <span> element
It removes all attributes that are not used by the tag definition
It escapes user specified values
Even if user specified values aren't escaped properly they can't cause harm because we are using the DOM to create the elements instead of string concatenation
Creating elements using the DOM is faster than setting innerHTML
It improves user privacy by setting rel and referrerpolicy to appropriate values

JiLiZART / BBob

JavaScript Injection using event attributes #148