kostya / lexbor

Fast HTML5 Parser with CSS selectors. This is the successor of myhtml and is expected to be faster and use less memory.
MIT License

tag_name is randomized on text node #9

Closed Dan-Do closed 3 years ago

Dan-Do commented 3 years ago
def walk(node, level = 0)
  puts "#{" " * level * 2}<#{node.tag_name}> ~ #{node.inspect}"
  node.children.each { |child| walk(child, level + 1) }
end

tpl_html = <<-TEMPLATE
<section id="test" class="{{ this.state.theme }}" data-index="{{= index }}" click="test">
  {{@ var is_today = data.date === view.today; var result = (function(){ return "compare a < b"; }()) }}
  <div class="{{ data.class }} {{ is_today ? 'on' : 'off' }}" click="test">
    <div class="title" style="font-size: 2em">{{ data.title.toUpperCase() }}</div>
    <div class="content {{ index % 2 ? 'odd' : 'even' }}">{{# data.content }}</div>
    <div class="footer">{{ view.parseFooter(data) }}</div>
  </div>
</section>
TEMPLATE

parser = Lexbor::Parser.new(tpl_html)
walk(parser.root!)

Here is the result:

<html> ~ Lexbor::Node(:html)
  <head> ~ Lexbor::Node(:head)
  <body> ~ Lexbor::Node(:body)
    <section> ~ Lexbor::Node(:section, {"id" => "test", "class" => "{{ this.state.theme }}", "data-index" => "{{= index }}", "click" => "test"})
      <isindex> ~ Lexbor::Node(:_text, "
  {{@ var is_today = data.dat...")
      <div> ~ Lexbor::Node(:div, {"class" => "{{ data.class }} {{ is_today ?...", "click" => "test"})
        <!doctype> ~ Lexbor::Node(:_text, "
    ")
        <div> ~ Lexbor::Node(:div, {"class" => "title", "style" => "font-size: 2em"})
          <blockquote> ~ Lexbor::Node(:_text, "{{ data.title.toUpperCase() }}")
        <!doctype> ~ Lexbor::Node(:_text, "
    ")
        <div> ~ Lexbor::Node(:div, {"class" => "content {{ index % 2 ? 'odd' :..."})
          <article> ~ Lexbor::Node(:_text, "{{# data.content }}")
        <!doctype> ~ Lexbor::Node(:_text, "
    ")
        <div> ~ Lexbor::Node(:div, {"class" => "footer"})
          <big> ~ Lexbor::Node(:_text, "{{ view.parseFooter(data) }}")
        <#document> ~ Lexbor::Node(:_text, "
  ")
      <#end-of-file> ~ Lexbor::Node(:_text, "
")

On text nodes, tag_sym is always :_text, but tag_name returns a seemingly random value. In the output above it shows up as <isindex>, <!doctype>, <blockquote>, <article>, <big>, <#document> and <#end-of-file>. I suggest returning nil for text nodes instead, similar to undefined in JavaScript.

kostya commented 3 years ago

Fixed in 2.6.7.
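
For anyone stuck on a version older than 2.6.7, a possible workaround is to check tag_sym before trusting tag_name, since the report above says tag_sym is reliably :_text on text nodes. The sketch below is only an illustration: safe_tag_name and walk_safe are hypothetical helper names, not part of the library, and the only node methods they use are the ones that appear in the report (tag_sym, tag_name, children, inspect).

```ruby
# Hypothetical helper (not part of lexbor): returns nil for text nodes,
# whose tag_name was unreliable before the fix, and the tag name otherwise.
def safe_tag_name(node)
  node.tag_sym == :_text ? nil : node.tag_name
end

# Same traversal as walk in the report, but prints "(text)" instead of a
# bogus tag name when the node is a text node.
def walk_safe(node, level = 0)
  name = safe_tag_name(node)
  label = name ? "<#{name}>" : "(text)"
  puts "#{" " * (level * 2)}#{label} ~ #{node.inspect}"
  node.children.each { |child| walk_safe(child, level + 1) }
end
```

With the parser from the report, this would be called as walk_safe(parser.root!) in place of walk(parser.root!).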