JuliaWeb / Gumbo.jl

Julia wrapper around Google's gumbo C library for parsing HTML

properly escape xml entities and overhaul printing #80

Closed pfitzseb closed 4 years ago

pfitzseb commented 4 years ago

This PR makes sure that we never emit unescaped XML special characters (<, >, and & in text content; additionally " and ' in attribute values), except inside script or style tags.
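To make the escaping rule concrete, here is a minimal sketch in Python. It is not the code from this PR or from Gumbo.jl; the names escape_text, escape_attribute, render_text, and RAW_TEXT_TAGS are hypothetical and chosen only for illustration, and the numeric reference used for the single quote is one of several valid choices.

```python
# Illustrative sketch only: hypothetical helper names, not Gumbo.jl's API.

# Elements whose text content is emitted verbatim (no escaping).
RAW_TEXT_TAGS = {"script", "style"}


def escape_text(s: str) -> str:
    """Escape the characters that are special in text content."""
    # '&' is replaced first so the ampersands introduced by the
    # later replacements are not escaped a second time.
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def escape_attribute(s: str) -> str:
    """Escape text for use inside a quoted attribute value."""
    # Attributes need the text rules plus both quote characters.
    return escape_text(s).replace('"', "&quot;").replace("'", "&#39;")


def render_text(parent_tag: str, text: str) -> str:
    """Return text as it would be printed inside parent_tag."""
    if parent_tag in RAW_TEXT_TAGS:
        return text  # script/style content is left untouched
    return escape_text(text)
```

With these helpers, render_text("p", "a<b") yields a&lt;b, while render_text("script", "if (a < b) {}") returns its input unchanged.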

The printing overhaul also simplifies the code a bit and lets us always respect significant whitespace (in pre, textarea, script, and style tags), whether or not pretty printing is enabled.
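One way to picture the whitespace rule is a recursive printer that only adds indentation and newlines outside whitespace-sensitive elements. The Python sketch below is an assumption about the general technique, not the PR's implementation; Element, render, and PRESERVE_WS are hypothetical names, attributes are left out, and escaping is omitted to keep the example short.

```python
# Illustrative sketch only: a toy tree printer, not Gumbo.jl's code.
from dataclasses import dataclass, field
from typing import List, Union

# Elements whose whitespace must be reproduced exactly.
PRESERVE_WS = {"pre", "textarea", "script", "style"}


@dataclass
class Element:
    tag: str
    children: List[Union["Element", str]] = field(default_factory=list)


def render(node, pretty=False, depth=0, preserve=False):
    # Layout whitespace is added only when pretty printing AND
    # we are not inside a whitespace-preserving element.
    layout = pretty and not preserve
    indent = "  " * depth if layout else ""
    newline = "\n" if layout else ""

    if isinstance(node, str):
        if preserve or not pretty:
            return node  # emit text verbatim
        return f"{indent}{node.strip()}{newline}"

    if preserve or node.tag in PRESERVE_WS:
        # Children are printed verbatim, with no layout whitespace.
        inner = "".join(render(c, pretty, depth + 1, True) for c in node.children)
        return f"{indent}<{node.tag}>{inner}</{node.tag}>{newline}"

    inner = "".join(render(c, pretty, depth + 1, False) for c in node.children)
    return f"{indent}<{node.tag}>{newline}{inner}{indent}</{node.tag}>{newline}"


doc = Element("div", [Element("pre", ["  a\n    b\n"]), Element("p", ["hello"])])
compact = render(doc)
pretty = render(doc, pretty=True)
```

In both compact and pretty, the pre element's content appears exactly as written; only the surrounding div and p pick up indentation in the pretty form.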

I've also enabled short-form printing for tags that aren't allowed to have content.
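As an illustration of short-form printing, the Python sketch below emits a self-closing tag for HTML void elements and a normal open/close pair for everything else. The exact output syntax (for example <br/>) and the names VOID_ELEMENTS and render_tag are assumptions made for this example, not something taken from the PR; the set lists the void elements defined by the HTML standard.

```python
# Illustrative sketch only: hypothetical names, assumed output syntax.
from html import escape

# Void elements per the HTML standard: they can never have content.
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
})


def render_tag(tag, attrs=None):
    """Render an element without children, using short form for void tags."""
    attrs = attrs or {}
    # escape(..., quote=True) also escapes both quote characters.
    attr_text = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in attrs.items())
    if tag in VOID_ELEMENTS:
        return f"<{tag}{attr_text}/>"
    return f"<{tag}{attr_text}></{tag}>"
```

Under these assumptions, render_tag("br") gives <br/>, while render_tag("div") still gives <div></div> because div may have content.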

These changes are technically breaking, I think.

pfitzseb commented 4 years ago

AppVeyor fails because we don't have any config in the repo.

porterjamesj commented 4 years ago

Looks good, thanks for cleaning this up! That printing code was some of the worst in the codebase, both in terms of the code itself and the output. I removed the AppVeyor hook, so that shouldn't happen in the future.