lkarthee opened this issue 2 years ago
Hi @lkarthee, feel free to send PRs for those changes. I think aligning with pygments can be positive. This is mostly a community project, so if it can be made more useful, contributions are welcome!
Hi @lkarthee, thank you so much for providing your feedback. ❤️
This project has been experimental, as it was developed with the single objective of implementing an HTML lexer for makeup. That's one of the reasons I decided to follow HTML5 syntax, as it seemed easier at first. As @josevalim said, any contribution is gladly welcome, and all three of your concerns (i.e., aligning with Pygments, weakening the syntax, and deactivating data-group-ids via flags) are quite interesting features for the project IMHO.
Thank you @josevalim and @javiergarea - I appreciate your effort in maintaining this project.
I am using a fork with changes I made for a project (a bootstrap library for Phoenix components). I am fixing some bugs with attribute highlighting; I will send a PR once those changes are stable (maybe in a week).
For now I am using the following changes:
def not_keywords_stringify(tokens) do
  not_keywords_stringify(tokens, {0, []}, [])
end

# Collects the tokens that follow an opening "<" or "</" up to the tag name
# and retags the first :string or :keyword token as :name_tag.
def skip_whitespace(tokens, token) do
  queue =
    Enum.reduce_while(tokens, [token], fn t, acc ->
      case t do
        {:string, tup, list} ->
          {:halt, acc ++ [{:name_tag, tup, list}]}

        {:keyword, tup, list} ->
          {:halt, acc ++ [{:name_tag, tup, list}]}

        _ ->
          {:cont, acc ++ [t]}
      end
    end)

  {_, tokens} = Enum.split(tokens, length(queue) - 1)
  {queue, tokens}
end

def not_keywords_stringify(
      [{:punctuation, _, "<"} = token | tokens],
      {id, []},
      result
    ) do
  {queue, tokens} = skip_whitespace(tokens, token)
  not_keywords_stringify(tokens, {id + 1, []}, result ++ queue)
end

def not_keywords_stringify(
      [{:punctuation, _, "<"} = token | tokens],
      {id, orig_queue},
      result
    ) do
  {queue, tokens} = skip_whitespace(tokens, token)
  not_keywords_stringify(tokens, {id + 1, []}, result ++ orig_queue ++ queue)
end

def not_keywords_stringify(
      [{:punctuation, _, "</"} = token | tokens],
      {id, []},
      result
    ) do
  {queue, tokens} = skip_whitespace(tokens, token)
  not_keywords_stringify(tokens, {id + 1, []}, result ++ queue)
end

def not_keywords_stringify(
      [{:punctuation, _, "</"} = token | tokens],
      {id, orig_queue},
      result
    ) do
  {queue, tokens} = skip_whitespace(tokens, token)
  not_keywords_stringify(tokens, {id + 1, []}, result ++ orig_queue ++ queue)
end

# On ">", retag any :string token that starts the queued attribute region or
# follows whitespace as :name_attribute; everything else is kept as-is.
def not_keywords_stringify(
      [{:punctuation, _, ">"} = token | tokens],
      {id, queue},
      result
    ) do
  {queue, _} =
    Enum.reduce(queue, {[], nil}, fn {type, mid, data} = curr, {acc, prev} ->
      prev_type =
        case prev do
          nil -> nil
          {prev_type, _, _} -> prev_type
        end

      cond do
        prev_type == nil and type == :string ->
          {acc ++ [{:name_attribute, mid, data}], curr}

        prev_type == :whitespace and type == :string ->
          {acc ++ [{:name_attribute, mid, data}], curr}

        true ->
          {acc ++ [curr], curr}
      end
    end)

  not_keywords_stringify(tokens, {id, []}, result ++ queue ++ [token])
end

def not_keywords_stringify([token | tokens], {id, queue}, result) do
  not_keywords_stringify(tokens, {id, queue ++ [token]}, result)
end

def not_keywords_stringify([], {_id, queue}, result),
  do: result ++ queue

@impl Makeup.Lexer
def postprocess(tokens, _opts \\ []) do
  tokens
  |> char_stringify()
  |> commentify()
  |> keyword_stringify()
  |> attributify()
  |> element_stringify()
  |> not_keywords_stringify()
end
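For illustration, here is roughly what this pass does to a hand-written token stream for `<div class="container">`. The empty metadata maps and the exact token shapes are assumptions made for the example, not output captured from the actual lexer:

```elixir
tokens = [
  {:punctuation, %{}, "<"},
  {:keyword, %{}, "div"},
  {:whitespace, %{}, " "},
  {:string, %{}, "class"},
  {:punctuation, %{}, "="},
  {:string, %{}, "\"container\""},
  {:punctuation, %{}, ">"}
]

not_keywords_stringify(tokens)
# => [
#      {:punctuation, %{}, "<"},
#      {:name_tag, %{}, "div"},          # :keyword retagged as :name_tag
#      {:whitespace, %{}, " "},
#      {:name_attribute, %{}, "class"},  # :string retagged as :name_attribute
#      {:punctuation, %{}, "="},
#      {:string, %{}, "\"container\""},
#      {:punctuation, %{}, ">"}
#    ]
```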
makeup_html produces different tokens than Pygments does. Makeup styles recognised HTML tags with "k" (keyword) rather than with "nt" (name_tag). It styles unrecognised HTML tags with "s" (string) rather than with "nt" (name_tag), and it styles unrecognised attributes with "s" (string) rather than "na" (name_attribute).
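For reference, the short class names above are the standard Pygments CSS abbreviations for these token types. The sketch below restates the mapping and the requested change; the map literal is purely illustrative and not part of makeup_html's API:

```elixir
# Pygments-style short CSS classes for the token types discussed above.
short_classes = %{
  keyword: "k",
  string: "s",
  name_tag: "nt",
  name_attribute: "na"
}

# Change requested in this issue:
#   recognised tags:         :keyword ("k") -> :name_tag ("nt")
#   unrecognised tags:       :string  ("s") -> :name_tag ("nt")
#   unrecognised attributes: :string  ("s") -> :name_attribute ("na")
short_classes
```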
Here are my queries after using makeup_html: