apostrophecms / sanitize-html

Clean up user-submitted HTML, preserving whitelisted elements and whitelisted attributes on a per-element basis. Built on htmlparser2 for speed and tolerance
MIT License
3.68k stars 349 forks source link

Fix auto-adding escaped closing tags #607

Closed dliebner closed 1 year ago

dliebner commented 1 year ago

Summary

Closes #464 Adds tagAllowed() helper function.

What are the specific steps to test this change?

Existing tests may need to be reviewed. I don't do a lot of PRs, sorry.

What kind of change does this PR introduce?

(Check at least one)

Make sure the PR fulfills these requirements:

If adding a new feature without an already open issue, it's best to open a feature request issue first and wait for approval before working on it.

Other information:

BoDonkey commented 1 year ago

Hi @dliebner Thanks for giving this a go! As a first step, can you at least describe how your new function should work? Does the tagAllowed() take a boolean, or...? You can use the original example to provide the input and expected output. "<p>here is a string with a <wacky> tag.</p>" Thanks! Bob

dliebner commented 1 year ago

tagAllowed() takes a tag name and checks it against options.allowedTags and returns true if the tag is allowed and false if it is not.

const wackyText = sanitizeHtml('<p>here\'s a string with a <wacky> tag.', {
    disallowedTagsMode: 'escape',
});

console.log( wackyText ); // expected: "<p>here's a string with a &lt;wacky&gt; tag.</p>"
BoDonkey commented 1 year ago

Hi @dliebner, I'm still not quite getting it. I wrote the following test:

it('Should allow unclosed tags', function() {
    assert.equal(
      sanitizeHtml('<p>Text with a <wacky> tag.</p>', {
        allowedTags: [ 'wacky' ],
        disallowedTagsMode: 'escape'
      }), '<p>Text with a <wacky> tag.</p>'
    );
  });

This test fails with the output being: &lt;p&gt;Text with a <wacky> tag.</wacky>&lt;/p&gt; While I expected: &lt;p&gt;Text with a &lt;wacky&gt; tag.&lt;/p&gt; based on the original issue. Any additional feedback? Thanks, Bob

dliebner commented 1 year ago

In your test, you set your allowedTags to ['wacky'], meaning only the wacky tag is allowed. All other tags will be escaped using HTML entities because disallowedTagsMode is set to escape. The opening <p> is correctly escaped as it is not allowed. The opening <wacky> tag is correctly preserved as it is allowed, and it is correctly auto-closed as it is allowed. The closing </p> tag is correctly escaped because it is not allowed (and is literally included in the input, as opposed to being only implied by the presence of an opening tag).