logic-language / bitinfochartscraper

Scrapes bitinfocharts data into csv files
16 stars 16 forks

tips on not getting banned #1

Open tzekid opened 3 years ago

tzekid commented 3 years ago

Use a requests Session to avoid getting your IP banned. The website mostly looks for a valid cookie in the request headers.
See this Stack Overflow topic for an example.
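
A minimal sketch of the session approach, assuming the `requests` library. The helper names `make_session` and `fetch` and the User-Agent string are hypothetical and not taken from the scraper. A `Session` stores cookies the server sets and sends them back on later requests:

```python
import requests

# Hypothetical User-Agent string, for illustration only.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"


def make_session() -> requests.Session:
    """Create a session that reuses cookies and headers across requests."""
    session = requests.Session()
    session.headers.update({"User-Agent": USER_AGENT})
    return session


def fetch(session: requests.Session, url: str) -> str:
    """Fetch one page; cookies set by the server are kept on the session."""
    response = session.get(url, timeout=30)
    response.raise_for_status()
    return response.text
```

Calling `fetch` repeatedly with the same `session` object means cookies received on the first response are sent with every later request.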

Also add a delay between requests (even if this makes things take a bit longer, it's better than getting banned).
You could add it to the synchronous part of the script; it's slow anyway.

e.g.:

import random
import time

# Wait a random whole number of seconds (0 to 3) before the next request.
time.sleep(random.randint(0, 3))

chandrashan commented 3 years ago

Great advice. I've now added a variable delay of between 0 and 1 s to the synchronous code.
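
For illustration, a variable delay of this kind could look roughly like the sketch below. It is not the code in the repository: `polite_sleep`, `scrape_all` and the `fetch` callable are hypothetical names, and the 0 to 1 s range is only the default.

```python
import random
import time


def polite_sleep(min_delay: float = 0.0, max_delay: float = 1.0) -> float:
    """Sleep for a random time between min_delay and max_delay seconds."""
    delay = random.uniform(min_delay, max_delay)
    time.sleep(delay)
    return delay


def scrape_all(urls, fetch, max_delay: float = 1.0):
    """Call fetch(url) for each URL, pausing randomly between requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # No need to wait before the very first request.
            polite_sleep(0.0, max_delay)
        results.append(fetch(url))
    return results
```

Using a fractional delay from `random.uniform` rather than whole seconds makes the request timing less regular.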

Will do a little research on how to use sessions with async code, as I'm still learning that. From what I can see I'm already using sessions there, which I thought would persist the cookies anyway?
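
A rough sketch of the async case, assuming the async part uses `aiohttp` (this thread does not confirm which client is used). An `aiohttp.ClientSession` keeps a cookie jar by default, so cookies persist as long as every request goes through the same session object. The function names here are hypothetical:

```python
import asyncio
import random


async def fetch(session, url, max_delay=1.0):
    """Wait a random 0..max_delay seconds, then GET url via the shared session."""
    await asyncio.sleep(random.uniform(0, max_delay))
    async with session.get(url) as response:
        response.raise_for_status()
        return await response.text()


async def fetch_all(session, urls, max_delay=1.0):
    """Fetch URLs one after another, reusing one session and its cookie jar."""
    return [await fetch(session, url, max_delay) for url in urls]


async def main(urls):
    # Assumption: aiohttp is the async HTTP client. It is imported here so the
    # helpers above work with any session-like object.
    import aiohttp

    async with aiohttp.ClientSession() as session:
        return await fetch_all(session, urls)
```

The point is to create the session once and pass it around. Opening a new `ClientSession` for every request would start each one with an empty cookie jar.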

klaus-duan commented 1 month ago

Excuse me, do you have new scraper code for bitinfocharts.com? I really need it, and I'm willing to pay for it.

https://bitinfocharts.com/comparison/top100cap-price-btc-mom7.html#alltime