Closed cedricbonhomme closed 9 years ago
The BeautifulSoup documentation (http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser) recommends the lxml parser. Since lxml is already a dependency of pyaggr3g470r, there is no reason not to use it.
Example in crawler.py: replace
#!python description = BeautifulSoup(description, "html.parser").decode()
with
#!python description = BeautifulSoup(description, "lxml").decode()
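For anyone trying this outside the project, the swap above can be sketched as a small standalone function. This is only an illustration: the function name `clean_description` is hypothetical and does not exist in crawler.py, and the fallback to `html.parser` when lxml is not installed is an added assumption, not something the fix itself does.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def clean_description(description, parser="lxml"):
    """Re-serialize an HTML fragment with the given parser.

    Hypothetical helper for illustration; crawler.py does this inline.
    """
    try:
        soup = BeautifulSoup(description, parser)
    except FeatureNotFound:
        # Assumption: if the requested parser (e.g. lxml) is not installed,
        # fall back to the parser bundled with the standard library.
        soup = BeautifulSoup(description, "html.parser")
    return soup.decode()


if __name__ == "__main__":
    # An unclosed <p> tag is repaired by the parser when the tree is built.
    print(clean_description("<p>Hello"))
```

Note that the two parsers do not serialize a fragment identically: lxml wraps it in `<html>` and `<body>` elements, while `html.parser` leaves it as a bare fragment. Code that consumes the description downstream may need to account for that.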
Using lxml parser instead of html.parser, fixes #4.
→ <<cset 67952f5b3358>>
Original comment by: Cédric Bonhomme