apostrophecms / sanitize-html

Clean up user-submitted HTML, preserving whitelisted elements and whitelisted attributes on a per-element basis. Built on htmlparser2 for speed and tolerance
MIT License
3.79k stars 353 forks source link

value of `undefined` returns unexpected `"undefined"` #502

Closed shellscape closed 2 years ago

shellscape commented 2 years ago

To Reproduce

Step by step instructions to reproduce the behavior:

var sanitizeHtml = require("sanitize-html")

sanitizeHtml(undefined)
// > "undefined"

Expected behavior

I expected undefined to be returned

Describe the bug

Since undefined itself is not a string and has to toString, it appears to be a bug that it's being turned into a string containing "undefined". In most cases I'd wager that passing undefined means that a field, property, or value has been intentionally set to undefined and should be pass-through as there's no value there to operate on. There seems to be an assumption that the intent is to stringify undefined which is not a common pattern in JS.

JSON.stringify(undefined) returns plain undefined as well, and I would consider that to be the defacto standard.

Details

Version of Node.js: 14.15.3 Server Operating System: Amazon Linux / MacOS Additional context: Screenshots

boutell commented 2 years ago

There is a change already scheduled for release such that this would gracefully return the empty string. It won't return undefined as a string is the expected output. (Also the expected input! But we made it more tolerant.)

On Sat, Oct 9, 2021 at 11:02 AM Andrew Powell @.***> wrote:

To Reproduce

Step by step instructions to reproduce the behavior:

var sanitizeHtml = require("sanitize-html") sanitizeHtml(undefined)// > "undefined"

Expected behavior

I expected undefined to be returned Describe the bug

Since undefined itself is not a string and has to toString, it appears to be a bug that it's being turned into a string containing "undefined". In most cases I'd wager that passing undefined means that a field, property, or value has been intentionally set to undefined and should be pass-through as there's no value there to operate on. There seems to be an assumption that the intent is to stringify undefined which is not a common pattern in JS.

JSON.stringify(undefined) returns plain undefined as well, and I would consider that to be the defacto standard. Details

Version of Node.js: 14.15.3 Server Operating System: Amazon Linux / MacOS Additional context: Screenshots

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/apostrophecms/sanitize-html/issues/502, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAH27JCQXDAD2FQKEEE7I3UGBKPZANCNFSM5FVMDMYQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

--

THOMAS BOUTELL | CHIEF TECHNOLOGY OFFICER APOSTROPHECMS | apostrophecms.com | he/him/his

shellscape commented 2 years ago

it would have been cool to discuss my use case and the evidence that I provided for an ecosystem normal behavior for the package, before closing the issue.

boutell commented 2 years ago

Hi Shellscape, I don't mean to be inconsiderate of your use case. However this module already returns a string, and code that assumes a string will always be the result could actually crash if we changed that behavior. So for bc reasons this is not something we are really free to change. Whereas it is not difficult for you to wrap sanitize-html in a function that checks for undefined and returns undefined.

As for ecosystem normal behavior, the "validator" module is heavily used, and if you pass it undefined it throws an error, it doesn't return undefined:

https://www.npmjs.com/package/validator