i am getting 404 block

ihabpalamino commented 1 year ago

Describe the bug

How to reproduce

it was working fine but not anymore

Expected behaviour

My code:

from datetime import datetime
import json
import re

import pandas as pd
import snscrape.modules.twitter as sntwitter
from flask import Flask, request, jsonify

app = Flask(__name__)


@app.route('/scrape-tweets2', methods=['POST'])
def scrape_tweets():
    # Form fields sent by the client
    Username = request.form.get('username')
    SINCE = request.form.get('since')
    UNTIL = request.form.get('until')
    PLATFORM_NAME = request.form.get('plateform')

    # Only add a date filter when both bounds are given
    if SINCE and UNTIL:
        since_date = datetime.strptime(SINCE, "%Y-%m-%d")
        until_date = datetime.strptime(UNTIL, "%Y-%m-%d")
        date_range = f" since:{since_date.strftime('%Y-%m-%d')} until:{until_date.strftime('%Y-%m-%d')}"
    else:
        date_range = ""

    scraper = sntwitter.TwitterSearchScraper(f"(from:{Username}){date_range}")
    tweets = []
    for i, tweet in enumerate(scraper.get_items()):
        # tweet.media holds media objects, not strings, so check the type
        if tweet.media is not None and any(isinstance(medium, sntwitter.Video) for medium in tweet.media):
            view_count = tweet.viewCount
        else:
            view_count = "Not a video tweet"

        data = {
            "id_post": tweet.id,
            "Date": tweet.date.strftime("%Y-%m-%d"),
            "Heure": tweet.date.strftime("%H:%M:%S"),
            "content": tweet.content,
            "username": tweet.user.username,
            "likecount": tweet.likeCount,
            "shares": tweet.retweetCount,
            "comments": tweet.replyCount,
            "platformname": PLATFORM_NAME,
            "postUrl": tweet.url
        }
        tweets.append(data)
        if i > 800:
            break

    tweet_df = pd.DataFrame(tweets, columns=["id_post", "Date", "Heure", "content", "username", "likecount", "shares",
                                             "comments", "platformname", "postUrl"])
    tweet_df.to_csv('tweeter.csv', sep=";", encoding='utf-8', index=False)

    tweet_json = tweet_df.to_json(orient='records', indent=4, force_ascii=False)

    # Strip control characters before parsing the JSON back
    clean_insta_json = re.sub(r"[\x00-\x1F\x7F-\x9F]", "", tweet_json)
    response = jsonify(json.loads(clean_insta_json))
    response.headers['Content-Type'] = 'application/json'
    return response


if __name__ == '__main__':
    app.run(debug=True)
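For anyone trying to reproduce this outside Flask, the query-building and result-capping parts of the route above can be pulled into small standalone helpers. This is only a sketch: `build_query` and `take` are hypothetical names that do not appear in the original code, and the check that `since` is not later than `until` is an added assumption. It does not touch Twitter at all, so it cannot trigger or fix the 404 block; it just makes the search string easy to inspect.

```python
from datetime import datetime
from itertools import islice
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def build_query(username: str, since: Optional[str] = None, until: Optional[str] = None) -> str:
    """Build a "(from:<user>)" search string, optionally bounded by YYYY-MM-DD dates.

    Hypothetical helper mirroring the f-string logic in the route.
    """
    query = f"(from:{username})"
    if since and until:
        # strptime raises ValueError on malformed input
        since_date = datetime.strptime(since, "%Y-%m-%d")
        until_date = datetime.strptime(until, "%Y-%m-%d")
        if since_date > until_date:
            # Added assumption: reject reversed ranges early
            raise ValueError("since must not be later than until")
        query += f" since:{since_date:%Y-%m-%d} until:{until_date:%Y-%m-%d}"
    return query


def take(items: Iterable[T], limit: int) -> Iterator[T]:
    """Yield at most `limit` items from any iterable (hypothetical helper)."""
    return islice(items, limit)
```

With these, the loop could read `for tweet in take(scraper.get_items(), 800):`. Note that the original `if i > 800: break` check runs after the append, so it collects 802 items rather than 800.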

Screenshots and recordings

No response

Operating system

Windows 11

Python version: output of `python3 --version`

3.9.13

snscrape version: output of `snscrape --version`

snscrape-0.6.2.20230321.dev39+gc3b216c

Scraper

TwitterSearchScraper

How are you using snscrape?

Module (import snscrape.modules.something in Python code)

Backtrace

No response

Log output

No response
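When using snscrape as a module, log output for a report like this can be captured with Python's standard `logging` module. The sketch below assumes snscrape emits its diagnostics through `logging` under logger names starting with `snscrape`; the helper name `enable_debug_logging` and the default file name are made up for illustration.

```python
import logging


def enable_debug_logging(logfile: str = "snscrape-debug.log") -> logging.Logger:
    """Send all DEBUG-and-above log records to `logfile` (hypothetical helper)."""
    logging.basicConfig(
        filename=logfile,
        level=logging.DEBUG,
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
        force=True,  # replace any handlers configured earlier (Python 3.8+)
    )
    # Assumption: snscrape's loggers live under the "snscrape" namespace
    return logging.getLogger("snscrape")
```

Calling `enable_debug_logging()` once before creating the scraper should leave a file that can be pasted into the "Log output" field.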

Dump of locals

No response

Additional context

No response

Elsayed91 commented 1 year ago

Same here, my code was working fine for the last 2 weeks. I have been using the development version:

pip3 install --upgrade git+https://github.com/JustAnotherArchivist/snscrape.git

but as of today, no change I make gets me past the 404 block.

@op take a look at https://github.com/JustAnotherArchivist/snscrape/issues/996

germain-cyber commented 1 year ago

Same here. One day I was running simple queries and the next I was getting blocked. You can use the Twitter API to work around this, but you will hit a limit on the number of tweets you can retrieve.

My guess is that the owner of Twitter cut off unlimited queries to keep AI from being trained on his social network. (Many articles I have read say the same.)

mrzeynalli commented 1 year ago

Same here. Has anybody managed to come up with a solution?

sfkaplan commented 1 year ago

Had the same issue. It seems snscrape is no longer working on Twitter.

AgungPambudi commented 1 year ago

> Had the same issue. It seems snscrape is no longer working on Twitter.

same here

JustAnotherArchivist / snscrape

i am getting 404 block #1003
