zzzprojects / html-agility-pack

Html Agility Pack (HAP) is a free and open-source HTML parser written in C# that reads and writes the DOM and supports plain XPath or XSLT. It is a .NET code library that allows you to parse "out of the web" HTML files.
https://html-agility-pack.net
MIT License

Modifying a `#text` node name causes a `StackOverflowException` #548

Closed: BrianPainter88 closed this issue 6 months ago

BrianPainter88 commented 6 months ago


1. Description

My application loops through certain nodes and removes any namespace that may be attached to the node name. After upgrading to version 1.11.48, a StackOverflowException is thrown once the HtmlNode.Name property has been updated and either the HtmlNode.InnerHtml or the HtmlNode.InnerText property is then accessed.

2. Exception

Exception of type 'System.StackOverflowException' was thrown.

3. Fiddle or Project

https://dotnetfiddle.net/Q1LM01

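using System;
using HtmlAgilityPack;

public class Program
{
    public static void Main()
    {
        var document = new HtmlDocument();
        document.LoadHtml("a"); // the document contains only a text node

        foreach (var node in document.DocumentNode.ChildNodes)
        {
            if (node.Name == "#text")
            {
                // Assigning the name back to itself is enough to reproduce the problem.
                node.Name = node.Name;
            }
        }

        foreach (var node in document.DocumentNode.ChildNodes)
        {
            // Reading InnerHtml (or InnerText) here throws the StackOverflowException.
            Console.WriteLine(node.InnerHtml);
        }
    }
}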
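
For anyone stuck on an affected version, here is a rough sketch of a possible workaround for the namespace-removal loop from the description. It assumes that only element nodes ever need renaming, so text nodes are skipped and their Name setter is never called. The helper `StripNamespacePrefix` and the sample markup are hypothetical and only there for illustration; I have not confirmed that this avoids the exception in every case.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public class Program
{
    // Hypothetical helper: returns the part of the name after the first ':'.
    // Names without a prefix are returned unchanged.
    private static string StripNamespacePrefix(string name)
    {
        int colon = name.IndexOf(':');
        return colon >= 0 ? name.Substring(colon + 1) : name;
    }

    public static void Main()
    {
        var document = new HtmlDocument();
        document.LoadHtml("<x:p>a</x:p>"); // sample markup, assumed for illustration

        // Snapshot the descendants so the loop does not depend on lazy enumeration.
        foreach (var node in document.DocumentNode.Descendants().ToList())
        {
            // Assumption: skipping non-element nodes (such as #text) sidesteps the issue.
            if (node.NodeType != HtmlNodeType.Element)
            {
                continue;
            }

            string stripped = StripNamespacePrefix(node.Name);
            if (stripped != node.Name)
            {
                node.Name = stripped;
            }
        }

        Console.WriteLine(document.DocumentNode.OuterHtml);
    }
}
```

Filtering on `NodeType` rather than comparing the name against "#text" keeps the check independent of how text nodes happen to be named.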


JonathanMagnan commented 6 months ago

Hello @BrianPainter88,

Thank you for reporting this. The issue has been fixed for HtmlTextNode.

A new version should be released early this week.

Best Regards,

Jon

JonathanMagnan commented 6 months ago

Hello @BrianPainter88,

A new version has been released today.

Could you let me know if everything is now working with the latest version?

Best Regards,

Jon