TidyEx corrects and cleans up HTML content by fixing markup errors.
It provides Elixir/Erlang bindings for htacg's tidy-html5, "the granddaddy of HTML tools, with support for modern standards" (http://www.html-tidy.org).
The binding is implemented as a C-Node following the excellent example in Overbryd's package nodex. If you want to learn how to set up bindings to C/C++, you should definitely check it out.
C-Nodes are external OS processes that communicate with the Erlang VM through Erlang message passing. That way you can implement native code and call into it from Elixir in a safe, predictable way: if the external process crashes, the Erlang VM stays unaffected.
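The message-passing idea can be sketched from the Elixir side as a plain request/reply exchange with a registered name on another node. This is only an illustration: the module `CNodeSketch`, the registered name, and the `{pid, ref, request}` / `{ref, reply}` message shapes are assumptions made for this example, not the protocol that TidyEx or nodex actually use.

```elixir
defmodule CNodeSketch do
  # Hypothetical helper, not part of TidyEx or nodex.
  # Sends a request to a process registered as `name` on `node` and waits
  # for a reply tagged with the same reference.
  def call(node, name, request, timeout \\ 1_000) do
    ref = make_ref()

    # The {name, node} destination form is standard Erlang messaging.
    # The message shape is an assumption chosen for this sketch.
    send({name, node}, {self(), ref, request})

    receive do
      {^ref, reply} -> {:ok, reply}
    after
      # If the external process is down, the caller only sees a timeout.
      timeout -> {:error, :timeout}
    end
  end
end
```

Because the caller only waits for a message, a crashed or missing external process shows up as `{:error, :timeout}` rather than taking the VM down with it.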
For more examples, please check out the tests.
test "can parse broken html" do
  result = TidyEx.parse("<div>Hello<span>World")
  assert result == "<div>Hello<span>World</span></div>"
end

test "can clean and repair broken html" do
  result = TidyEx.clean_and_repair("<div>Hello<span>World")
  assert result == "<div>Hello<span>World</span></div>"
end

test "can run diagnostics on invalid html" do
  result = TidyEx.run_diagnostics("<pp>Hello World</p>")
  assert result == "line 1 column 1 - Error: <pp> is not recognized!\nThis document has errors that must be fixed before\nusing HTML Tidy to generate a tidied up version."
end
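Since `TidyEx.run_diagnostics/1` returns the report as one string, callers may want it in structured form. The following is a minimal sketch, assuming every diagnostic line follows the `line N column M - Level: message` shape seen in the test above. The module `DiagnosticsSketch` is hypothetical and not part of TidyEx.

```elixir
defmodule DiagnosticsSketch do
  # Hypothetical helper, not part of TidyEx.
  # Turns a diagnostics report into a list of maps, one per line that
  # matches "line N column M - Level: message". Other lines (such as the
  # closing summary text) are skipped.
  def parse(report) when is_binary(report) do
    report
    |> String.split("\n", trim: true)
    |> Enum.flat_map(fn line ->
      case Regex.run(~r/^line (\d+) column (\d+) - (\w+): (.*)$/, line) do
        [_full, row, col, level, message] ->
          [
            %{
              line: String.to_integer(row),
              column: String.to_integer(col),
              level: level,
              message: message
            }
          ]

        nil ->
          []
      end
    end)
  end
end
```

In an application, the input would come from `TidyEx.run_diagnostics/1`, for example `"<pp>Hello World</p>" |> TidyEx.run_diagnostics() |> DiagnosticsSketch.parse()`.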
Available on Hex.
def deps do
  [
    {:tidy_ex, "~> 0.1.0-dev"}
  ]
end
Building the binding requires:
cmake 3.x
erlang-dev
erlang-xmerl
erlang-parsetools
mix deps.get
mix compile
mix test
git clone git@github.com:f34nk/tidy_ex.git
cd tidy_ex
All binding targets are added as submodules in the target/ folder.
git submodule update --init --recursive --remote
mix deps.get
mix compile
mix test
mix test.target
Cleanup
mix clean
See CHANGELOG.
Broom by faisalovers from the Noun Project