Closed sharathm89 closed 6 years ago
Hello @sharathm89 ,
Unfortunately, the server returns the following error:
{StatusCode: 500, ReasonPhrase: 'Internal Server Error', Version: 1.1, Content: System.Net.Http.StreamContent, Headers:
{
x-frame-options: DENY
X-UA-Compatible: IE=Edge
X-Iinfo: 8-41929732-41929787 SNNN RT(1523536424411 339) q(0 0 0 -1) r(1 1) U11
X-CDN: Incapsula
Transfer-Encoding: chunked
Cache-Control: private
Date: Thu, 12 Apr 2018 12:33:42 GMT
Server:
Content-Type: text/html; charset=utf-8
}}
However, the non-async version works fine.
HtmlAgilityPack.HtmlDocument doc = null;
string url = "https://www.finedininglovers.com/recipes/appetizer/vegan-dishes-white-asparagus/";
HtmlWeb web = new HtmlAgilityPack.HtmlWeb();
doc = web.Load(url);
var html = doc.DocumentNode.OuterHtml;
So you can use it in the meantime while we investigate the issue.
Best Regards,
Jonathan
thanks @JonathanMagnan
Hello @sharathm89 ,
Version v1.8.1 has been released.
You should no longer have the issue with the Async method.
Best Regards,
Jonathan
@JonathanMagnan The issue still exists with the latest v1.8.1. Below is the code I tested; the URL is included in it.
Async throws: An error occurred while sending the request.
Non-async throws: The server committed a protocol violation. Section=ResponseHeader Detail=CR must be followed by LF
It used to work earlier, but I guess it started failing after the version upgrade.
using System;
using System.Threading.Tasks;
using HtmlAgilityPack;

class Program
{
    const string url = "https://www.finedininglovers.com/recipes/appetizer/vegan-dishes-white-asparagus/";

    static void Main(string[] args)
    {
        // Async path: block on the task so the exception surfaces here.
        try
        {
            GetHtmlDocumentAsync().GetAwaiter().GetResult();
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message); // An error occurred while sending the request.
        }

        // Non-async path.
        try
        {
            GetHtmlDocument();
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message); // The server committed a protocol violation. Section=ResponseHeader Detail=CR must be followed by LF
        }

        Console.ReadLine();
    }

    public static async Task<HtmlDocument> GetHtmlDocumentAsync()
    {
        HtmlWeb web = new HtmlWeb();
        return await web.LoadFromWebAsync(url);
    }

    public static HtmlDocument GetHtmlDocument()
    {
        HtmlWeb web = new HtmlWeb();
        return web.Load(url);
    }
}
Hello @sharathm89 ,
Thank you for the additional info.
We will continue to look at it.
Best Regards,
Jonathan
thanks @JonathanMagnan
Hello @sharathm89 ,
We tried your code, but everything is working on our side ;(
Could you try it again and let us know what we are missing?
Best Regards,
Jonathan
@JonathanMagnan I tried the same code, but the error only happens sometimes. About 3 hours after reporting the issue I tried again and it worked, but when I tried 2 days ago I got the same error. I tried just now and it is working.
So it is not occurring every time.
Hello @sharathm89 ,
That is probably due to some bot detection that bans an IP that has made too many requests in a very short time.
There is nothing we can do at this moment for such an error ;(
Best Regards,
Jonathan
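Since the failure is intermittent, one possible caller-side mitigation is to retry the request after a growing delay. The following is only a minimal sketch under assumptions: `RetryHelper` and `WithBackoffAsync` are hypothetical names that are not part of HtmlAgilityPack, the attempt count and delays are illustrative, and retrying will not help if the site keeps the IP banned.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class RetryHelper
{
    // Runs the operation and, if it throws, waits and tries again.
    // The delay doubles after each failed attempt; the last failure is rethrown.
    public static async Task<T> WithBackoffAsync<T>(
        Func<Task<T>> operation,
        int maxAttempts = 3,
        TimeSpan? initialDelay = null)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        // Assumed default delay; tune it for the site being requested.
        TimeSpan delay = initialDelay ?? TimeSpan.FromSeconds(2);

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Wait before the next attempt, then double the delay.
                await Task.Delay(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

With the code from this thread, it could be called as `await RetryHelper.WithBackoffAsync(() => new HtmlWeb().LoadFromWebAsync(url));`. Catching every exception type is a simplification; a real caller would probably want to retry only on network-related errors.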
@JonathanMagnan Probably. In that case, I'll close the issue.
I am trying to scrape this link but am unable to do so.
It throws an exception with the message
Error downloading Html
I tried setting
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7
as the UserAgent, but it is still not working.