Open marcselman opened 9 years ago
It's a bug. There were unwanted nulls after the first packet from the web server if its size was less than 4096.
Here is a slightly messy test that illustrates the bug.
using CsQuery.HtmlParser;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using NUnit.Framework;
using System;
using System.IO;
using System.Text;
using Assert = NUnit.Framework.Assert;

namespace CsQuery.Tests.Issues
{
    [TestFixture, TestClass]
    public class Issue187 : CsQueryTest
    {
        [Test, TestMethod]
        public void Issue187Test()
        {
            using (var mockStream = new Issue187MockStream())
            {
                var factory = new ElementFactory();
                var dom = factory.Parse(mockStream, Encoding.UTF8);
                // The parsed document should round-trip without extra characters.
                Assert.AreEqual(Issue187MockStream.HTML, dom.FirstChild.OuterHTML);
            }
        }
    }

    // Simulates a web response that arrives in two short reads.
    public class Issue187MockStream : Stream
    {
        public const string HTML = @"<html><head></head><body><a href=""http://test.example.com"">Test</a></body></html>";

        // Assumes `count` is large enough to hold each half of the document.
        public override int Read(byte[] buffer, int offset, int count)
        {
            byte[] bytes = Encoding.UTF8.GetBytes(HTML);
            int splitPosition = bytes.Length / 2;
            int length;
            if (Position == 0)
            {
                // First read: deliver only the first half.
                length = splitPosition;
                Array.Copy(bytes, 0, buffer, offset, length);
            }
            else if (Position == splitPosition)
            {
                // Second read: deliver the remainder.
                length = bytes.Length - splitPosition;
                Array.Copy(bytes, splitPosition, buffer, offset, length);
            }
            else
            {
                // End of stream.
                length = 0;
            }
            Position += length;
            return length;
        }

        public override bool CanRead { get { return true; } }
        public override bool CanSeek { get { return false; } }
        public override bool CanWrite { get { return false; } }
        public override long Position { get; set; }
        public override void Flush() { }
        public override long Length { get { throw new NotImplementedException(); } }
        public override long Seek(long offset, SeekOrigin origin) { throw new NotImplementedException(); }
        public override void SetLength(long value) { throw new NotImplementedException(); }
        public override void Write(byte[] buffer, int offset, int count) { throw new NotImplementedException(); }
    }
}
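For readers unfamiliar with this failure mode, here is a minimal sketch of how a short read can produce trailing nulls. It is an illustration only: it assumes the consumer decodes its whole 4096-byte buffer instead of just the bytes that `Read` reported, which has not been confirmed against CsQuery's source. `ChunkedStream`, `ShortReadDemo`, `ReadIgnoringCount`, and `ReadHonoringCount` are hypothetical names, and the input is assumed to be ASCII so that no multi-byte character is split across reads.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stream that returns at most `chunk` bytes per Read call,
// the way a network stream may return less than the requested count.
public sealed class ChunkedStream : MemoryStream
{
    private readonly int _chunk;

    public ChunkedStream(byte[] data, int chunk) : base(data)
    {
        _chunk = chunk;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        return base.Read(buffer, offset, Math.Min(count, _chunk));
    }
}

public static class ShortReadDemo
{
    // Buggy pattern: ignores how many bytes Read actually returned.
    public static string ReadIgnoringCount(Stream stream)
    {
        var sb = new StringBuilder();
        var buffer = new byte[4096];
        while (stream.Read(buffer, 0, buffer.Length) > 0)
        {
            // Decodes all 4096 bytes, so the unused tail becomes '\0' characters.
            sb.Append(Encoding.UTF8.GetString(buffer));
            Array.Clear(buffer, 0, buffer.Length);
        }
        return sb.ToString();
    }

    // Correct pattern: decodes only the bytes that were read.
    public static string ReadHonoringCount(Stream stream)
    {
        var sb = new StringBuilder();
        var buffer = new byte[4096];
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            sb.Append(Encoding.UTF8.GetString(buffer, 0, read));
        }
        return sb.ToString();
    }
}
```

Feeding the same bytes through both methods with a small chunk size shows the difference: the first result contains null characters after each short read, while the second matches the input.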
Hi,
I noticed some weird characters popping up in the HTML when using CQ.CreateFromUrl. Here is an example:
When you execute the above example (in LINQPad, for example), you'll notice in the output:
I have no idea where the weird characters come from. I don't see them in the HTML source when loading it in the browser or in Sublime Text. If I load the page in C# into a string and then load the string into a CQ object, it works without problems.
Do you have any idea what this could be? Thanks.
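For reference, the string-based workaround mentioned above might look roughly like this. It is a sketch, not the exact code from the original post: `UrlWorkaround`, `ParseHtml`, and `LoadViaString` are hypothetical names, the URL is left to the caller, and the page is assumed to be UTF-8 when the server does not declare a charset.

```csharp
using System.Net;
using System.Text;
using CsQuery;

public static class UrlWorkaround
{
    // Parse HTML that is already in memory as a string.
    public static CQ ParseHtml(string html)
    {
        return CQ.Create(html);
    }

    // Download the page into a string first, then hand the string to CsQuery,
    // instead of letting CQ.CreateFromUrl read the response stream itself.
    public static CQ LoadViaString(string url)
    {
        using (var client = new WebClient())
        {
            // Assumption: fall back to UTF-8 if no charset is declared.
            client.Encoding = Encoding.UTF8;
            string html = client.DownloadString(url);
            return ParseHtml(html);
        }
    }
}
```

Because the whole response is decoded before parsing, the parser never sees a partially filled read buffer.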