bisguzar / twitter-scraper

Scrape the Twitter Frontend API without authentication.
MIT License
3.88k stars, 600 forks

requests.exceptions.SSLError: HTTPSConnectionPool(host='twitter.com', port=443) #135

Closed: icmpnorequest closed this issue 4 years ago

icmpnorequest commented 4 years ago

Hi team,

I ran into this error while scraping tweets. The code and the error are below:

from twitter_scraper import get_tweets

for tweet in get_tweets('realDonaldTrump', pages=5):
    print(tweet['text'])

Error:

requests.exceptions.SSLError: HTTPSConnectionPool(host='twitter.com', port=443): Max retries exceeded with url: /i/profiles/show/realDonaldTrump/timeline/tweets?include_available_features=1&include_entities=1&include_new_items_bar=true (Caused by SSLError(SSLError("bad handshake: SysCallError(54, 'ECONNRESET')")))

Any ideas on how to solve this?

bisguzar commented 4 years ago

It looks like a connection problem. Can you report whether it is still raising the error?

icmpnorequest commented 4 years ago

Hi @bisguzar,

Yes, I still get the same error. Any ideas on how to solve it?

bisguzar commented 4 years ago

It's all about the connection. Twitter might be blocking you if you are scraping a lot in a short time.

"server refuses your connection (you're sending too many requests from same ip address in short period of time)" - djra (Stack Overflow)
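If rate limiting is the cause, spacing out retries is one common mitigation. The following is a minimal sketch, not part of twitter-scraper: `retry_with_backoff` and its parameters are hypothetical names, and it retries on `OSError` by default because the Requests exceptions (including `requests.exceptions.SSLError`) derive from it.

```python
import time


def retry_with_backoff(fetch, attempts=4, base_delay=1.0,
                       retry_on=(OSError,), sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each failure.

    fetch    -- zero-argument callable that performs the request
    retry_on -- exception types that trigger a retry (assumed default: OSError)
    sleep    -- injectable so the delays can be observed without waiting
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...


# Hypothetical usage with the call from this issue:
#   tweets = retry_with_backoff(
#       lambda: list(get_tweets('realDonaldTrump', pages=5)))
```

Backing off only helps when the block is temporary. If the handshake fails on every attempt regardless of the delay, the cause is more likely somewhere on the network path.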

icmpnorequest commented 4 years ago

Hi @bisguzar,

I tried changing my IP address using an IP proxy pool, but it still doesn't work. Twitter uses browser fingerprinting to track users, so I was wondering whether the fingerprinting technique is what is causing my connection to be blocked?

Do you think I should try more IP addresses to solve the connection error? Thanks in advance for your response.

bisguzar commented 4 years ago

Honestly, I'm not knowledgeable about Twitter's algorithms, so it's hard for me to say anything. What do you think about changing the User-Agent header and trying on a new device with a new, clean connection?
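For reference, changing the User-Agent could look roughly like the sketch below. It uses the standard library's `urllib.request` rather than Requests so that it runs without third-party packages; `BROWSER_UA` and `build_request` are hypothetical names, and the browser string is only an example value. With Requests, the equivalent would be `session.headers.update({"User-Agent": ...})` on a `requests.Session`.

```python
import urllib.request

# Hypothetical browser-like value; any current browser string could be used.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"
)


def build_request(url, user_agent=BROWSER_UA):
    """Build (but do not send) a GET request carrying a custom User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request("https://twitter.com/")
# urllib stores header names capitalized, hence "User-agent" here.
print(req.get_header("User-agent"))
```

Note that HTTP headers are only sent after the TLS handshake completes, so a "bad handshake" error like the one reported above occurs before the server ever sees the User-Agent.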

icmpnorequest commented 4 years ago

Hi @bisguzar,

I'm still getting the ConnectionError. However, I can download the JSON data from the interface manually and parse tweets from it; it only fails when using Requests. Maybe Twitter can detect requests made by Requests.

icmpnorequest commented 4 years ago

I finally fixed this issue by switching to another proxy. The connection error on port 443 was caused by my Shadowsocks server.
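The exact proxy setup is not given in this thread, so the following is only a sketch of how switching proxies can be expressed for Requests. `proxy_mapping`, `PROXY_POOL`, and `next_proxies` are hypothetical names, and both addresses are placeholder assumptions (1080 is a commonly used local SOCKS5 port). The dict shape is the one Requests accepts for its `proxies` argument.

```python
from itertools import cycle


def proxy_mapping(proxy_url):
    """Return the dict shape that Requests accepts for `proxies`."""
    return {"http": proxy_url, "https": proxy_url}


# Hypothetical pool; both addresses are placeholders, not from the thread.
PROXY_POOL = cycle([
    "socks5h://127.0.0.1:1080",  # assumed local SOCKS5 endpoint
    "http://127.0.0.1:8080",     # assumed local HTTP proxy
])


def next_proxies():
    """Rotate to the next proxy in the pool."""
    return proxy_mapping(next(PROXY_POOL))


# With Requests installed (SOCKS URLs also need the requests[socks] extra):
#   requests.get("https://twitter.com/", proxies=next_proxies(), timeout=10)
```

Requests also reads the `HTTP_PROXY` and `HTTPS_PROXY` environment variables by default, which may be the simplest way to route a library that creates its own session through a different proxy.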

bisguzar commented 4 years ago

I'm happy to hear that!

h4m5t commented 3 years ago

I finally fixed this issue by switching to another proxy. The connection error on port 443 was caused by my Shadowsocks server.

Can you tell me how you solved it? Thanks!