serpapi / nokolexbor

High-performance HTML5 parser for Ruby based on Lexbor, with support for both CSS selectors and XPath.

`<template>` tags in cloned node get messed up #6

Closed jaredcwhite closed 1 year ago

jaredcwhite commented 1 year ago

Thanks for the great library!

I discovered one issue: when cloning a node that contains one or more <template> tags in its tree, the template elements don't clone properly. The data is still there, but there's more than one DocumentFragment under the cloned template, and the element gets serialized as <template></template>. I came up with a workaround: create a new template element, pull the right document fragment out of the bad one, and swap the bad element for the new one:

Example item_node HTML:

<li :class="{ 'high-light': item.name == 'xyz' }">
  <blockquote>
    <span v-text="index"></span>: Item! <strong><span v-text="text"></span> <span v-html="item.name"></span></strong>
  </blockquote>
  <ul>
    <template v-for="(subitem, index2) in item.subitems" :key="[subitem.name, index2]">
      <li :class="{ 'high-light': bigCount(count) }"><span v-text="index"></span> <span v-text="index2"></span>: <span v-text="subitem.name"></span></li>
    </template>
  </ul>
</li>

Trying to clone that node:

new_node = item_node.clone # any template elements will be "empty" now

# workaround:
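new_node.css("template").each do |bad_tmpl|
  # the last child of the bad template is the fragment that still holds the content
  frag = bad_tmpl.children.last
  new_tmpl = item_node.document.create_element("template")
  # carry the attributes (v-for, :key, ...) over to the fresh template
  bad_tmpl.attributes.each do |k, v|
    new_tmpl[k] = v
  end
  # put the recovered content into the new template's own fragment
  new_tmpl.children[0].children = frag
  bad_tmpl.swap(new_tmpl)
end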
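
As background on why <template> is a special case for cloning: per the HTML standard, a template's parsed markup lives in a separate content DocumentFragment rather than in its ordinary child list, so a clone has to copy that fragment explicitly. The sketch below is a toy model in plain Ruby, assuming hypothetical ToyNode, ToyText, and ToyTemplate classes that are not part of Nokolexbor or Lexbor. It only illustrates how a clone that ignores the content fragment serializes to <template></template>, and it does not claim to describe the actual bug in the library.

```ruby
# Toy model only: these classes are hypothetical and are NOT Nokolexbor or Lexbor types.
class ToyNode
  attr_reader :name, :children

  def initialize(name, children = [])
    @name = name
    @children = children
  end

  # Deep copy that follows the ordinary child list.
  def deep_clone
    self.class.new(name, children.map(&:deep_clone))
  end

  def to_html
    "<#{name}>#{children.map(&:to_html).join}</#{name}>"
  end
end

class ToyText < ToyNode
  def initialize(text)
    super('#text')
    @text = text
  end

  def deep_clone
    ToyText.new(@text)
  end

  def to_html
    @text
  end
end

# A template keeps its parsed markup in a separate content fragment,
# so its ordinary child list stays empty.
class ToyTemplate < ToyNode
  attr_reader :content

  def initialize(content = [])
    super('template')
    @content = content
  end

  # Naive clone: copies only the (empty) child list and drops the content.
  def naive_clone
    ToyTemplate.new
  end

  # Template-aware clone: copies the content fragment as well.
  def deep_clone
    ToyTemplate.new(content.map(&:deep_clone))
  end

  def to_html
    "<template>#{content.map(&:to_html).join}</template>"
  end
end

tmpl = ToyTemplate.new([ToyNode.new('li', [ToyText.new('subitem')])])
puts tmpl.naive_clone.to_html # prints <template></template>
puts tmpl.deep_clone.to_html  # prints <template><li>subitem</li></template>
```

The workaround above does by hand what the template-aware clone does in this model: it builds a fresh template and moves the surviving content into it.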


lexborisov commented 1 year ago

Hi @jaredcwhite

This is my mistake. I'll try to fix this soon.

Thanks for the report!

zyc9012 commented 1 year ago

Thank you @jaredcwhite. I have released 0.4.0 and this issue should be fixed.

jaredcwhite commented 1 year ago

Thank you @zyc9012! I've confirmed 0.4.0 fixes the issue and I could delete the workaround.