taoqf / node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.
MIT License
1.12k stars, 112 forks

The greater than sign of the attribute value on the tag is treated as a closing tag #51

Closed piter902 closed 3 years ago

piter902 commented 4 years ago

`

`

taoqf commented 4 years ago

I'm sorry, I could not fix this. The regexp is here

taoqf commented 4 years ago

PRs are welcome.

apttx commented 4 years ago

Having the same issue, but I'm not very good with regex :( Does this help?

taoqf commented 4 years ago

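@apttx Sorry, my English is not good. I'm guessing you are using Vue. What I'm curious about is: if you are using Vue, why would you also need this library? To me they don't seem to belong to the same school. My earlier reply wasn't clear either: this is a regex problem, but not one that simple regex syntax can solve. If I rewrote the parsing process, the first thing I would have to face is efficiency; I can't guarantee that performance wouldn't drop a lot once the parsing is correct. The reason this library is fast is not that its algorithm is especially clever, but that it skips some processing that most cases don't need. So unless it is a very serious bug (and not something that can be worked around), I don't recommend changing this library. Over the versions I have maintained, the current one is already considerably slower than it used to be. If you really need to handle complex cases, I suggest using another, similar library; you will give up some efficiency, but it may solve many of your problems. Thanks again for raising this!!!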

apttx commented 4 years ago

@taoqf No worries mate. I understand that efficiency is a concern, but if something isn't parsed correctly, then the parser is entirely useless. It's possible to bypass this problem in certain situations, though. I'm not using vue, just regular old JS to scrape CSS properties off w3c and MDN. I'll have a look for other parser libraries, this one was just the first one on npm with a fairly large user base. I'm interested in HTML parsing, so I might come back to this and try to help out where I can :)

JounQin commented 3 years ago

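Personally, I feel correctness should take priority over performance. Or perhaps two dists could be provided: one that stays performance-first and one with improved robustness?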

danyan commented 3 years ago

<view class="col-3 stock-rise" :class="[item.zdf > 0 ? 'red' : item.zdf < 0 ? 'green' : '']"> This case still fails to parse.
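To make the failure concrete, here is a minimal sketch using the tag from the comment above. Both patterns are hypothetical illustrations, not the library's actual regexp: `naiveTag` assumes the attribute part simply runs up to the first `>`, while `quoteAwareTag` treats quoted attribute values as opaque units so a `>` inside quotes does not end the tag.

```typescript
// Hypothetical naive pattern (NOT the library's actual regexp): the attribute
// part stops at the first '>', even when that '>' sits inside a quoted value.
const naiveTag = /<([a-zA-Z][\w-]*)([^>]*)>/;

// Quote-aware variant: the attribute part is a run of double-quoted strings,
// single-quoted strings, or single characters that are neither a quote nor '>'.
const quoteAwareTag = /<([a-zA-Z][\w-]*)((?:"[^"]*"|'[^']*'|[^'">])*)>/;

const html = `<view class="col-3 stock-rise" :class="[item.zdf > 0 ? 'red' : item.zdf < 0 ? 'green' : '']">`;

const naiveMatch = naiveTag.exec(html);
const quoteAwareMatch = quoteAwareTag.exec(html);

// The naive match is cut off at the '>' inside the :class value.
console.log(naiveMatch?.[0]);
// The quote-aware match covers the whole tag.
console.log(quoteAwareMatch?.[0] === html);
```

The three alternatives in `quoteAwareTag` start with different characters, so the pattern should not backtrack heavily on well-formed tags. Whether it is fast enough for this library would need to be benchmarked.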

taoqf commented 3 years ago

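Sorry, I can't fix this greater-than/less-than case right now. I think it would need regex balancing groups, but I tried that before and couldn't write the regex, haha. PRs welcome.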
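For anyone considering a PR: balancing groups are a .NET regex feature and are not available in JavaScript's `RegExp`. Since quotes in attribute values do not nest, one alternative is a small hand-written scan that tracks whether it is currently inside a quoted value. The sketch below assumes that approach; `findTagEnd` is a hypothetical helper, not part of node-html-parser, and it ignores comments, `<script>` bodies, and other special cases.

```typescript
// Hypothetical helper (not library API): returns the index of the '>' that
// closes the tag beginning at `start`, skipping any '>' inside quoted
// attribute values. Returns -1 if no closing '>' is found.
function findTagEnd(html: string, start: number): number {
  let quote: string | null = null; // the quote character we are inside, if any
  for (let i = start; i < html.length; i++) {
    const ch = html.charAt(i);
    if (quote !== null) {
      if (ch === quote) quote = null; // matching quote closes the value
    } else if (ch === '"' || ch === "'") {
      quote = ch; // entering a quoted attribute value
    } else if (ch === ">") {
      return i; // first '>' outside quotes ends the tag
    }
  }
  return -1;
}

// Usage: extract the opening tag even though its value contains '>'.
const sample = `<a title="a > b">text</a>`;
const end = findTagEnd(sample, 0);
console.log(sample.slice(0, end + 1));
```

The scan is a single pass over the tag, so its cost grows linearly with the tag length. That is only an expectation, not a measured result.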