Open mandx opened 12 years ago
That's great advice - I'll make that change today. Thx :)
I initially chose lxml for its speed (http://blog.dispatched.ch/2010/08/16/beautifulsoup-vs-lxml-performance/), but I can't see this app causing any major performance issues.
From what I can see, Beautiful Soup might be a better choice. Do you agree?
Use an HTML-compliant parser: In some doctypes (including HTML5) the `meta` tags do not need to be closed (as in `<meta ... />`), but the XML parser fails to read these tags. Also, some HTML entities are not recognized, like `&raquo;` (»).
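A minimal sketch of the two failure modes described above. It uses only the Python standard library as a stand-in: `xml.etree.ElementTree` plays the role of the strict XML parser and `html.parser.HTMLParser` plays the role of an HTML-compliant one. The app's actual lxml code is not shown in this thread, so the sample markup and the names `SNIPPET`, `parses_as_xml`, `TitleCollector` and `parse_as_html` are all hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample page: an unclosed <meta> tag plus the &raquo; entity.
SNIPPET = (
    '<html><head><meta charset="utf-8">'
    "<title>Home &raquo; News</title></head></html>"
)


def parses_as_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Raised for mismatched tags and for entities XML does not define.
        return False


class TitleCollector(HTMLParser):
    """Lenient HTML parser that records the <title> text and <meta> attributes."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &raquo; in text data.
        super().__init__(convert_charrefs=True)
        self.in_title = False
        self.title = ""
        self.meta_attrs = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            self.meta_attrs.append(dict(attrs))

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def parse_as_html(markup):
    """Parse markup leniently and return (title, list of meta attribute dicts)."""
    collector = TitleCollector()
    collector.feed(markup)
    collector.close()
    return collector.title, collector.meta_attrs


if __name__ == "__main__":
    print("XML parser accepts page:", parses_as_xml(SNIPPET))
    print("HTML parser result:", parse_as_html(SNIPPET))
```

The strict parser rejects the page for either problem on its own, while the lenient one reads both the `meta` attributes and the decoded entity. Beautiful Soup or lxml's HTML mode should behave like the lenient case here, but that is an assumption to verify against the real pages.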