flavorjones / loofah

Ruby library for HTML/XML transformation and sanitization
MIT License

pass encode_special_chars to to_s #270

gamesover closed this issue 1 year ago

gamesover commented 1 year ago
2.5.1 :002 > Loofah.fragment("Ruby & Rails").to_text(encode_special_chars: false)
 => "Ruby & Rails"

to_text accepts encode_special_chars, as shown above. I would like to be able to pass encode_special_chars to to_s as well:

unsafe_html = "ohai! <div>ruby & rails</div> <script>but script is not</script>"

doc = Loofah.html5_fragment(unsafe_html).scrub!(:prune)
doc.to_s(encode_special_chars: false)    # desired => "ohai! <div>ruby & rails</div> "

The current result is "ohai! <div>ruby &amp; rails</div> ", which is not what I want.

flavorjones commented 1 year ago

@gamesover Thanks for asking about this.

The intention of #to_text is to render plain text output. The intention of #to_s is to render markup.

"Plain text" may be rendered in a web browser, in which case entities like &amp; should be used (this is the default behavior), or it may be rendered in a non-web context like an SMS message, which is the reason encode_special_chars: false exists as a feature.

In the case of "markup", we emit HTML markup that meets the HTML spec, so &amp; is the correct thing to do. The string <div>ruby & rails</div> is invalid HTML. A browser will handle the parse error, and render it as ruby &amp; rails, but it is still invalid HTML.

Can you say a bit more about your use case? Why do you want this behavior?
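
As a side illustration of the markup vs. plain-text distinction described above, here is a minimal sketch that uses only Ruby's standard library (CGI.escapeHTML and CGI.unescapeHTML from cgi/escape), not Loofah itself. The variable names are made up for the example, and this is not how Loofah implements either method; it only shows the two forms of the same string.

```ruby
require "cgi/escape" # Ruby standard library: CGI.escapeHTML / CGI.unescapeHTML

# Hypothetical sample text, taken from the example in this thread.
plain = "ruby & rails"

# Markup context: special characters are written as entities,
# so the ampersand becomes &amp; inside the element.
markup = "<div>#{CGI.escapeHTML(plain)}</div>"
puts markup    # <div>ruby &amp; rails</div>

# Non-web context (e.g. an SMS body): decode the entities
# to get the literal characters back.
decoded = CGI.unescapeHTML("ruby &amp; rails")
puts decoded   # ruby & rails
```

If the goal is a string for a non-HTML destination, decoding text content this way (or using to_text with encode_special_chars: false) is one possible route; decoding a whole markup string would turn it back into the invalid HTML discussed above.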