zzzprojects / html-agility-pack

Html Agility Pack (HAP) is a free and open-source HTML parser written in C# to read/write DOM and supports plain XPATH or XSLT. It is a .NET code library that allows you to parse "out of the web" HTML files.
https://html-agility-pack.net
MIT License

Can not validate p tag is closed certainly. #210

Closed Aldin-Xia closed 6 years ago

Aldin-Xia commented 6 years ago

Hi @JonathanMagnan, here is an issue I ran into while using HAP 1.8.4:

request.CustomCode = "<div>root div 1<div>child div<p> 1-1</div></div><div>root div 2</div>";
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(request.CustomCode);
HtmlDocument.DisableBehavaiorTagP = true;
var htmlNodes = htmlDoc.DocumentNode.Descendants().Where(s => s.NodeType == HtmlNodeType.Element);
foreach (var node in htmlNodes)
{
    if (!node.Closed)
    {
        // do something
        break;
    }
}

Look at the if (!node.Closed) block: when node is the "p" element, I expected Closed == false, but it is true. The p tag is not closed explicitly in the original HTML fragment.

When I print the node.InnerHtml, the value is root div 1<div>child div<p> 1-1</div>.

So, that makes me confused.

THANKS

JonathanMagnan commented 6 years ago

Hello @Aldin-Xia ,

Thank you for reporting. We will look at it as soon as we can (we are currently at the end of a big development effort).

Best Regards,

Jonathan



Aldin-Xia commented 6 years ago

Thank you for replying despite your busy schedule, @JonathanMagnan. I'll try to solve it another way. If I find an effective approach, I will be happy to share it.

Best Regards