msva / lua-htmlparser

An HTML parser for lua.

Wrap a text node in to current or parent node #49

Closed raylua2566 closed 3 years ago

raylua2566 commented 7 years ago

Feature: wrap a text node into the current or parent node.

TODO: some special characters (... etc.) are not handled yet. Create an ElementNode instance method to handle them?

msva commented 7 years ago

Could you please: 1) describe the use case for this? 2) provide test code that demonstrates this use case (so it fails on current master but passes with your PR)? 3) strip the Chinese comments (keeping only the English ones)?

Well, the third one is purely cosmetic and has no effect on how the library works, but I'm asking about the first two because I'm missing the end purpose of these changes (a real-world example of when they would be useful).

raylua2566 commented 7 years ago

Please forgive my grammar mistakes.

From README.md

Limitations

  • Textnodes are no separate tree elements; in local root = htmlparser.parse("<p>line1<br />line2</p>"), root.nodes[1]:getcontent() is "line1<br />line2", while root.nodes[1].nodes[1].name is "br"

Now

  • My PR wraps the text line1 and line2 into nodes named text; after that, root.nodes[1].nodes[1].name is "text" and #root.nodes[1].nodes is 3
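To make the proposed tree shape concrete, here is a minimal, self-contained sketch of the wrapping idea. It is not the PR's code and does not use lua-htmlparser; `wrap_text_nodes` is a hypothetical helper that only handles flat content with simple tags such as `<br />`.

```lua
-- Hypothetical helper, NOT part of lua-htmlparser or of this PR.
-- Splits a node's inner content into child entries and wraps every run of
-- plain text in a child named "text". Assumes flat content (no nesting).
local function wrap_text_nodes(content)
  local children = {}
  local pos = 1
  while pos <= #content do
    -- Locate the next tag and capture its name.
    local tag_start, tag_end, name = content:find("<(%w+)[^>]*>", pos)
    -- Everything before that tag (or up to the end of the string) is text.
    local text = content:sub(pos, (tag_start or #content + 1) - 1)
    if text ~= "" then
      children[#children + 1] = { name = "text", content = text }
    end
    if not tag_start then break end
    children[#children + 1] = { name = name }
    pos = tag_end + 1
  end
  return children
end

-- The inner content of the <p> element from the README example.
local children = wrap_text_nodes("line1<br />line2")
for i, child in ipairs(children) do
  print(i, child.name, child.content or "")
end
```

With this input the sketch produces three children (text, br, text), which mirrors the child count of 3 described above.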

real-world example, when it would be useful

Why does the test case fail?

  • In the test file "tst/init.lua", the test_order() case fails at line 292, because the text 1 is implicitly treated as <text>1</text>, and the same goes for the texts 2, 3, ... 10, so the :not(n) result is 14 instead of 4

Discussion

  • The original string might already contain <text>some text</text> ^_^

msva commented 5 years ago

Ping?

Sorry for disappearing for so long, I was having a lot of personal issues :-/

Shall we discuss how to take this idea further?

msva commented 3 years ago

I'll close it for now, since it is incompatible with the current code base.

If you (or somebody else) want to continue working on this, feel free to open a new PR.