sethia4u / abot

Automatically exported from code.google.com/p/abot
Apache License 2.0

Consider using CsQuery as the parser #33

Closed by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
Consider using CsQuery as the parser. It claims to be x times faster than 
HAP (HtmlAgilityPack).

https://github.com/jamietre/CsQuery

Original issue reported on code.google.com by sjdir...@gmail.com on 19 Nov 2012 at 3:23

GoogleCodeExporter commented 9 years ago
Also consider Fizzler...

http://blog.outsharked.com/2012/06/csquery-performance-vs-fizzler.html

Original comment by sjdir...@gmail.com on 30 Nov 2012 at 3:25

GoogleCodeExporter commented 9 years ago

Original comment by sjdir...@gmail.com on 31 Dec 2012 at 6:03

GoogleCodeExporter commented 9 years ago

Original comment by sjdir...@gmail.com on 3 Feb 2013 at 8:37

GoogleCodeExporter commented 9 years ago
Added a CsQuery object to CrawledPage, along with an IHyperLinkParser 
implementation for CsQuery named CsQueryHyperLinkParser 
(CsQueryHyperLinkParser.cs). The new configuration options 
ShouldLoadCsQueryForEachCrawledPage and 
ShouldLoadHtmlAgilityPackForEachCrawledPage can each be set to true or false; 
setting one to false avoids the overhead of loading that expensive object for 
every crawled page.

Original comment by sjdir...@gmail.com on 4 Feb 2013 at 10:28
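
The idea behind the two flags in the last comment can be sketched roughly as follows. This is not Abot's code: Abot is a C# library, and the Python below only illustrates gating expensive parse work on configuration flags. Every name here (CrawlConfig, CrawledPage, build_crawled_page, the snake_case flag names) is a hypothetical stand-in, and the standard-library html.parser takes the place of both CsQuery and HtmlAgilityPack.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import List, Optional


@dataclass
class CrawlConfig:
    # Hypothetical mirrors of ShouldLoadCsQueryForEachCrawledPage and
    # ShouldLoadHtmlAgilityPackForEachCrawledPage from the comment above.
    should_load_csquery_for_each_crawled_page: bool = False
    should_load_html_agility_pack_for_each_crawled_page: bool = False


class ParsedDocument(HTMLParser):
    """Stand-in for an expensive parsed document; records href values of <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def parse(html: str) -> ParsedDocument:
    doc = ParsedDocument()
    doc.feed(html)
    doc.close()
    return doc


@dataclass
class CrawledPage:
    raw_content: str
    # Each parsed document stays None unless its flag is enabled.
    csquery_document: Optional[ParsedDocument] = None
    html_agility_pack_document: Optional[ParsedDocument] = None


def build_crawled_page(raw_content: str, config: CrawlConfig) -> CrawledPage:
    """Parse the page only with the parsers the configuration asks for."""
    page = CrawledPage(raw_content=raw_content)
    if config.should_load_csquery_for_each_crawled_page:
        page.csquery_document = parse(raw_content)
    if config.should_load_html_agility_pack_for_each_crawled_page:
        page.html_agility_pack_document = parse(raw_content)
    return page
```

With only the first flag enabled, build_crawled_page populates csquery_document and leaves html_agility_pack_document unset, so a crawl that needs one parser does not pay for the other.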