SuperCh-SE-NCSU / ProjectScraping

web-crawling in Python

Published papers #68

Closed: dragonfly90 closed this issue 9 years ago

dragonfly90 commented 9 years ago

A well-known book on Python scripting, covering topics such as regular expressions: Langtangen, Hans Petter. Python scripting for computational science. Vol. 3. Berlin, Heidelberg and New York: Springer, 2006.

alannsp commented 9 years ago

Thelwall, Mike. "A web crawler design for data mining." Journal of Information Science 27.5 (2001): 319-325. Processing the text of web pages to extract information can be expensive in terms of processor time. Consequently, a distributed design is proposed to make effective use of idle computing resources and to help information scientists avoid the need for dedicated equipment.

alannsp commented 9 years ago

Castillo, Carlos. "Effective web crawling." ACM SIGIR Forum. Vol. 39. No. 1. ACM, 2005. This thesis studies Web crawling at several different levels, ranging from the long-term goal of crawling important pages first, to the short-term goal of using the network connectivity efficiently, including implementation issues that are essential for crawling in practice.

dragonfly90 commented 9 years ago

Grehan, Rick. "Pillars of Python: Web.py Web framework." InfoWorld IDG Retrieved January (2013). Compares six web frameworks, including web.py and Django.

alannsp commented 9 years ago

Add those to the README.