Mondego crawler4py issues - Githubissues

Mondego / crawler4py

A web crawler in Python

21 stars 17 forks

issues

Adding prints for better debugging and adding comments. Making output writing independent of link extraction.

#12 deyarchit closed 8 years ago
4
What's the designed stop situation? & Question about the shelve

#11 Chin-I opened 8 years ago
1
feature: ability to configure custom persistence object

#10 kfatehi closed 8 years ago
1
Catch more thread exceptions

#9 kfatehi closed 8 years ago
3
Fixing the validation for UserAgentString

#8 deyarchit closed 8 years ago
1
Cannot Access "Persistent.shelve.db"

#7 ckhajavi closed 8 years ago
1
Added sql

#6 john-ko closed 8 years ago
1
cast politeness to float to prevent zeroing out

#5 kfatehi closed 8 years ago
0
Crawler does not respect politeness across all threads

#4 kfatehi closed 8 years ago
7
Lower case extensions in parsed.path.lower() in sampleConfig.py

#3 john-ko closed 8 years ago
0
Declare roboturl so its absence doesn't throw later

#2 kfatehi closed 8 years ago
0
Giving options to remove JS and CSS content of the page to extract text

#1 neerajcse closed 9 years ago
1