Describe the bug
HtmlKit alters attribute values when they contain HTML entities.
We expect a returned attribute value to be literally equal to the text in the input stream.
Platform (please complete the following information):
OS: Windows & Linux
.NET Framework: .NET 6.0 & .NET Framework 4.8
HtmlKit Version: 1.1.0
To Reproduce
Steps to reproduce the behavior:
Create a new Console application
Add the following HTML file to the project as index.html and mark it as Copy if newer:
Copy and paste the following code into the Program.cs file:
using System;
using System.IO;
using HtmlKit;

namespace HtmlKitTestProject
{
    internal class Program
    {
        static void Main(string[] args)
        {
            using var stream = new FileStream("index.html", FileMode.Open, FileAccess.Read);
            using var reader = new StreamReader(stream);
            var tokenizer = new HtmlTokenizer(reader);
            HtmlToken token;

            while (tokenizer.ReadNextToken(out token))
            {
                switch (token.Kind)
                {
                    case HtmlTokenKind.Tag:
                        var tag = (HtmlTagToken)token;

                        // Only <a> tags are of interest; skip everything else.
                        if (tag.Id != HtmlTagId.A)
                            continue;

                        // Print each attribute as the tokenizer reports it.
                        foreach (var attribute in tag.Attributes)
                        {
                            if (attribute.Value != null)
                                Console.WriteLine(" {0}={1}", attribute.Name, $"{attribute.Value}");
                            else
                                Console.WriteLine(" {0}", attribute.Name);
                        }
                        break;
                }
            }

            Console.ReadLine();
        }
    }
}
Run the project and check the output:
Expected behavior
The HTML file contains attributes whose values include HTML entities:
A returned attribute value should be literally equal to the text in the input stream, but as the output shows, the entities are converted to their decoded form.
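To make the difference between the raw and the decoded form concrete, here is a minimal sketch that does not use HtmlKit at all. It relies on System.Net.WebUtility from the base class library as a stand-in for the decoding step, and the attribute value is a hypothetical example, not the one from the HTML file in this report. The class and method names (EntityDemo, Decode, Reencode) are made up for illustration.

```csharp
using System;
using System.Net;

public static class EntityDemo
{
    // Decodes HTML entities, similar in effect to what the report observes
    // in attribute.Value (WebUtility is a stand-in, not HtmlKit's own code).
    public static string Decode(string raw) => WebUtility.HtmlDecode(raw);

    // Re-encodes a decoded value. This is only a possible workaround:
    // it yields an escaped form, not necessarily the original spelling.
    public static string Reencode(string decoded) => WebUtility.HtmlEncode(decoded);

    public static void Main()
    {
        // Hypothetical raw attribute text as it could appear in an HTML file.
        const string raw = "?a=1&amp;b=2";

        string decoded = Decode(raw);
        Console.WriteLine(decoded);            // prints ?a=1&b=2

        string reencoded = Reencode(decoded);
        Console.WriteLine(reencoded);          // prints ?a=1&amp;b=2
        Console.WriteLine(reencoded == raw);   // prints True
    }
}
```

Re-encoding happens to round-trip for this value, but it would not in general: an input written as &#38; would also decode to &, and encoding it again would give &amp;, so the original text of the input stream cannot be recovered this way.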