Better way to parse HTML?

rcoh / gradsearch

gradsearch (re:search) is a website to connect students with professors who study their research interests.

www.gradschoolsearch.org


Better way to parse HTML? #13

Open rambhask opened 12 years ago

rambhask commented 12 years ago

I was reading (http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html) that using regular expressions to parse HTML is not efficient. Perhaps we could use lxml or some other existing library?
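As a rough illustration of what a parser-based approach could look like, here is a minimal sketch that pulls links out of a page without any regular expressions. It uses Python's standard-library `html.parser` rather than lxml so that it runs without extra dependencies; lxml would express the same idea more compactly. The names `LinkExtractor` and `extract_links` and the sample markup are hypothetical and not taken from the gradsearch code.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased, so <A HREF=...> matches too.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while we are inside a link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in the given HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Hypothetical markup: mixed case, mixed quoting, and nested tags.
    sample = (
        '<p>Faculty: <a class="x" href="/prof/1">Ada <b>Lovelace</b></a> '
        "and <A HREF='/prof/2'>Alan Turing</A></p>"
    )
    for href, text in extract_links(sample):
        print(href, text)
```

Attribute order, quoting style, tag case, and nested markup inside the link are all handled by the parser here, which is where a hand-written regular expression tends to break.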

rcoh commented 12 years ago

Very cool. We should use that.

On Tue, Aug 28, 2012 at 8:26 AM, rambhask <notifications@github.com> wrote:

Was reading (http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html) that using Regular Expressions to parse HTML is not efficient. Perhaps using lxml or some sort of existing library?
