Closed by marshonhuckleberry 4 years ago
It is good to keep a low profile when scraping. I also wanted to ask about adding delays after each downloaded link to avoid detection.
Yes, it checks whether the file exists and downloads it only if it is missing. You can force it to download a file every time it sees a link by adding over_write=True to the global config.
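To illustrate the behaviour described above, here is a minimal sketch of a "download only if missing" check. It is not pywebcopy's actual code: the function name should_download and its file_path parameter are hypothetical, and only the over_write flag name comes from this thread.

```python
import os


def should_download(file_path, over_write=False):
    """Decide whether a linked file needs to be fetched.

    Hypothetical helper: returns True when over_write is set,
    or when nothing exists yet at file_path.
    """
    return over_write or not os.path.exists(file_path)
```

With the default over_write=False, a file that is already on disk is skipped, which saves bandwidth and avoids requesting the same resource repeatedly.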
Delays could be implemented easily by overriding the get method of pywebcopy.SESSION.
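A rough sketch of that idea follows. It assumes pywebcopy.SESSION is a requests-style session object whose get method can be replaced on the instance; that has not been checked against a specific pywebcopy release. The helper name add_delay and its parameters are made up for this example.

```python
import random
import time


def add_delay(session, min_seconds=1.0, max_seconds=3.0, sleep=time.sleep):
    """Wrap session.get so each call pauses for a random interval first.

    `session` is any object with a `get` method (assumed here to be
    requests-like). `sleep` is injectable so the pause can be tested.
    """
    original_get = session.get

    def delayed_get(*args, **kwargs):
        # Pause before every request to keep the request rate low.
        sleep(random.uniform(min_seconds, max_seconds))
        return original_get(*args, **kwargs)

    # Replace the method on this instance only.
    session.get = delayed_get
    return session


# Assumed usage, untested against pywebcopy itself:
# import pywebcopy
# add_delay(pywebcopy.SESSION, min_seconds=2, max_seconds=5)
```

A random interval is used rather than a fixed one because evenly spaced requests can themselves look automated. A fixed delay works the same way if min_seconds and max_seconds are equal.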
Will the program check whether a file already exists, or will it download it anyway and replace the existing copy? This is very important because it affects scraping time, bandwidth usage, and spider detection: some websites detect scraping when the same files are downloaded again and again.