Closed rpaczkow closed 13 years ago
Apologies for taking so long to look into this...
The problem is due to the number of errors Tidy encounters while parsing your HTML. By default, if any errors are found (as opposed to warnings), Tidy will not produce any output, and it will give up parsing after 6 errors.
To override these defaults, try this:
// Requires: using System.IO; (for File)
using (TidyManaged.Document tdoc = TidyManaged.Document.FromFile(@"my.html"))
{
    tdoc.ShowWarnings = true;
    tdoc.Quiet = true;
    // Raise the error limit so Tidy does not give up after the default 6 errors.
    tdoc.MaximumErrors = int.MaxValue;
    // Produce output even when errors (not just warnings) are found.
    tdoc.ForceOutput = true;
    tdoc.InputCharacterEncoding = TidyManaged.EncodingType.Utf8;
    tdoc.OutputCharacterEncoding = TidyManaged.EncodingType.Utf8;
    tdoc.OutputXhtml = true;
    tdoc.CleanAndRepair();
    string s = tdoc.Save();
    // File.WriteAllText overwrites an existing file, so no explicit delete is needed.
    File.WriteAllText(@"fixed.html", s);
}
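If it helps, the same settings can be wrapped in a small helper that fails loudly instead of silently writing an empty file. This is only a sketch: `TidyRepair` and `RepairToXhtml` are hypothetical names (not part of TidyManaged), it uses only the TidyManaged calls shown in the reply above, and it assumes .NET 4 or later for `string.IsNullOrWhiteSpace`.

```csharp
using System;
using System.IO;
using TidyManaged;

static class TidyRepair
{
    // Hypothetical helper (not part of TidyManaged): repairs the HTML file at
    // inputPath and returns the result as XHTML. Throws if Tidy returns no
    // output, so an empty result is reported rather than written to disk.
    public static string RepairToXhtml(string inputPath)
    {
        using (Document tdoc = Document.FromFile(inputPath))
        {
            // Same settings as in the reply above.
            tdoc.ShowWarnings = true;
            tdoc.Quiet = true;
            tdoc.MaximumErrors = int.MaxValue;
            tdoc.ForceOutput = true;
            tdoc.InputCharacterEncoding = EncodingType.Utf8;
            tdoc.OutputCharacterEncoding = EncodingType.Utf8;
            tdoc.OutputXhtml = true;
            tdoc.CleanAndRepair();

            string result = tdoc.Save();
            if (string.IsNullOrWhiteSpace(result))
                throw new InvalidOperationException(
                    "Tidy produced no output for " + inputPath);
            return result;
        }
    }

    static void Main()
    {
        // File names taken from the thread; WriteAllText overwrites fixed.html if it exists.
        File.WriteAllText(@"fixed.html", RepairToXhtml(@"my.html"));
    }
}
```

With `ForceOutput` enabled the exception should rarely trigger, but it makes the original symptom (an empty fixed.html) visible as an error if the options are ever changed.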
Hi! I am trying to fix the HTML from http://stooq.com/notowania/?kat=g2. After saving the page to my hard disk, I use the code below to fix the errors and save the result to disk, but I get an empty fixed.html file.
using (TidyManaged.Document tdoc = TidyManaged.Document.FromFile(@"my.html")) { tdoc.ShowWarnings = true; tdoc.Quiet = true; tdoc.OutputXhtml = true;