DIYgod / RSSHub

🧡 Everything is RSSible
https://docs.rsshub.app
MIT License
33.33k stars 7.44k forks

Twitter routes no longer work #13049

Closed dzx-dzx closed 11 months ago

dzx-dzx commented 1 year ago

Routes

/twitter/user/:id/:routeParams?

Full routes

/twitter/user/:id/:routeParams?

Related documentation

https://docs.rsshub.app/en/routes/social-media#user-timeline-3

What is expected?

The user timeline is returned.

What is actually happening?

Demo instance:

Route requested: /user/DIYgod

Error message: Response code 429 (Too Many Requests): target website might be blocking our access, you can host your own RSSHub instance for a better usability.

Helpful Information to provide when opening issue: Path: /user/DIYgod Node version: v18.17.1 Git Hash: 0d0ec74

Self-hosted instance: Server Timeout

Local:

Route requested: /user/DIYgod

Error message: Response code 404 (Not Found): target website might be blocking our access, you can host your own RSSHub instance for a better usability.

Helpful Information to provide when opening issue: Path: /user/DIYgod Node version: v18.17.0 Git Hash: 115a334

Deployment

RSSHub demo (https://rsshub.app)

Deployment information

No response

Additional info

.

This is not a duplicated issue

github-actions[bot] commented 1 year ago
Searching for maintainers:

To maintainers: if you are not willing to be disturbed, list your username in scripts/workflow/test-issue/call-maintainer.js. In this way, your username will be wrapped in an inline code block when tagged so you will not be notified.

If no matching route can be found, the issue will be closed automatically. Please use the NOROUTE keyword for a route-irrelevant issue, or leave a comment if this is a mistake, and we will review it again.

Leo-czp commented 1 year ago

This has been happening to me since yesterday, and other routes are returning 404 as well.

Rongronggg9 commented 1 year ago

https://github.com/zedeus/nitter/issues/983

DIYgod commented 1 year ago

If the situation continues, we may consider adding account credentials, just like https://github.com/zedeus/nitter/pull/830

xianyu1124 commented 1 year ago

As of now I still cannot use the Twitter RSS routes; the error is 429.

yzkoori commented 1 year ago

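Same problem. My self-hosted instance has been returning 404 since last week; I thought I was just making requests too frequently.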

DIYgod commented 1 year ago

All known unofficial solutions to Twitter routes are currently not sustainable in the long term. https://t.me/rsshub/1/282607

The known official, stable solution for long-term access is https://developer.twitter.com/en/portal/products/pro

If anyone is willing to share an API token for public instance usage, we will arrange the fix as soon as possible. https://t.me/rsshub/282791/282826

2-3-5-7 commented 1 year ago

If the situation continues, we may consider adding account credentials, just like zedeus/nitter#830

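Instead of polling, could we add account credentials, turn on notifications for the followed Twitter accounts, and have RSSHub fetch only after a new-tweet notification arrives from that account? Would that avoid the account being banned for frequent scraping?

After I turned on notifications, I receive them promptly in Firefox. I wonder whether RSSHub could emulate a browser to receive the notifications.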

2-3-5-7 commented 1 year ago

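I used Selenium. In Python it looks roughly like the code below, where turn_on_notification uses Selenium to click the "Turn on" notifications button (reference).

The notification already contains the full tweet content, but replies and likes do not trigger notifications. After I receive a notification, I fetch the tweet again with twitter-api-client.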

import asyncio
import json

from selenium import webdriver


async def get_tweets():
    # JavaScript run in the page: read the pending notifications from the
    # service worker, close them, and pass them back to Selenium.
    get_notifications = r"""
                    const callback = arguments[arguments.length - 1];
                    window.myServiceWorkerRegistration = await window.navigator.serviceWorker.getRegistration();
                    window.myNotifications = await window.myServiceWorkerRegistration.getNotifications();
                    window.myNotifications.forEach(noti => noti.close());
                    callback(window.myNotifications);
                    """
    chrome_options = webdriver.ChromeOptions()
    # Allow notifications by default in this browser profile.
    chrome_options.add_experimental_option("prefs", {"profile.default_content_setting_values.notifications": 1})
    chrome_options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=chrome_options)
    # Cookies can only be added for the domain that is currently loaded.
    driver.get("https://twitter.com")
    with open('cookies.json', 'r') as f:
        cookies = json.load(f)
        for cookie in cookies:
            # set the sameSite attribute to 'Strict' to avoid the error
            if 'sameSite' in cookie:
                cookie['sameSite'] = 'Strict'
            driver.add_cookie(cookie)
    # turn_on_notification is defined separately (not shown in this snippet).
    turn_on_notification(driver)
    driver.set_script_timeout(5)
    while True:
        notis = driver.execute_async_script(get_notifications)
        # Usually only one notification arrives per poll
        for n in notis:
            # Promoted notifications have a scribe_target other than 'tweet', and their tag is usually empty ""
            if n['data']['scribe_target'] == 'tweet':
                yield {'poster': n['title'], 'guid': n['tag'], 'link': n['data']['uri'], 'full_text': n['body'],
                       'created_at': n['timestamp']}
                break

        await asyncio.sleep(2)

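
The dictionaries yielded by the generator map fairly directly onto feed items. A minimal sketch of that conversion, assuming `created_at` carries the Web Notifications `timestamp` value (milliseconds since the Unix epoch); the function name `to_feed_item` and the output field names are hypothetical and not part of RSSHub:

```python
from datetime import datetime, timezone


def to_feed_item(tweet: dict) -> dict:
    """Convert one dict yielded by get_tweets() into a feed-style item."""
    # The notification timestamp is in milliseconds, so divide by 1000.
    published = datetime.fromtimestamp(tweet["created_at"] / 1000, tz=timezone.utc)
    return {
        "title": tweet["poster"],
        "guid": tweet["guid"],
        "link": tweet["link"],
        "description": tweet["full_text"],
        "pubDate": published.isoformat(),
    }


item = to_feed_item({
    "poster": "DIYgod",
    "guid": "tweet-123",  # made-up value for illustration
    "link": "https://twitter.com/DIYgod/status/123",  # made-up URL
    "full_text": "hello",
    "created_at": 1696291200000,
})
print(item["pubDate"])
```
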
chrisyy2003 commented 1 year ago

still 404

untitaker commented 1 year ago

use one of the nitter instances at https://status.d420.de/ -- some of them have RSS

guest account branch in nitter is still WIP, but it works with some extra coding and infra

ghost commented 1 year ago

Or use #13381

munierujp commented 1 year ago

I use IFTTT and Slack as an alternative.

koszzz commented 1 year ago

This looks like a great approach! I'm stuck on turn_on_notification(). If possible, could you post the complete code? Thank you.

2-3-5-7 commented 1 year ago
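
import sys

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def turn_on_notification(driver):
    driver.get("https://twitter.com/settings/push_notifications")
    element = None
    for i in range(1, 1000):
        try:
            # Wait up to 10 seconds for the "Turn on" button to appear
            element = WebDriverWait(driver, 10).until(
                EC.presence_of_element_located((By.XPATH, r"//span[text()='Turn on']"))
            )
        except TimeoutException:
            # Button not there yet: reload the page and try again
            driver.refresh()
            print(f'[{i}] Refreshing, waiting for push_notifications')
        else:
            print('push_notifications turned on successfully')
            break
    if element is None:
        print('Error: timed out turning on push_notifications, please check the cookies')
        sys.exit(1)
    element.click()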

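@koszzz In my environment, turning it on is slow on Windows and may take up to two minutes of refreshing before the button shows up, whereas on Linux it is very fast with almost no waiting.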

koszzz commented 1 year ago

Hello!

import asyncio
import json
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException

def turn_on_notification(driver):
    driver.get("https://twitter.com/settings/push_notifications")
    element = None
    for i in range(1, 1000):
        try:
            # The "Turn on" button
            element = WebDriverWait(driver, 10).until(
                EC.presence_of_element_located((By.XPATH, r"//span[text()='Turn on']"))
            )
        except TimeoutException:
            driver.refresh()
            print(f'[{i}] Refreshing, waiting for push_notifications')
        else:
            print('push_notifications turned on successfully')
            break
    if element is None:
        print('Error: timed out turning on push_notifications, please check the cookies')
        exit(1)
    element.click()

async def get_tweets():
    print('running')
    get_notifications = r"""
                    const callback = arguments[arguments.length - 1];
                    window.myServiceWorkerRegistration = await window.navigator.serviceWorker.getRegistration();
                    window.myNotifications = await window.myServiceWorkerRegistration.getNotifications();
                    window.myNotifications.forEach(noti => noti.close());
                    callback(window.myNotifications);
                    """
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_experimental_option("prefs", {"profile.default_content_setting_values.notifications": 1})
    chrome_options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=chrome_options)
    driver.get("https://twitter.com")
    with open('cookies.json', 'r') as f:
        cookies = json.load(f)
        for cookie in cookies:
            # set the sameSite attribute to 'Strict' to avoid the error
            if 'sameSite' in cookie:
                cookie['sameSite'] = 'Strict'
            driver.add_cookie(cookie)
    turn_on_notification(driver)
    driver.set_script_timeout(5)
    while True:
        notis = driver.execute_async_script(get_notifications)
        # Usually only one notification arrives per poll
        for n in notis:
            # Promoted notifications have a scribe_target other than 'tweet', and their tag is usually empty ""
            if n['data']['scribe_target'] == 'tweet':
                print({'poster': n['title'], 'guid': n['tag'], 'link': n['data']['uri'], 'full_text': n['body'],
                       'created_at': n['timestamp']})
                yield {'poster': n['title'], 'guid': n['tag'], 'link': n['data']['uri'], 'full_text': n['body'],
                       'created_at': n['timestamp']}
                break

        await asyncio.sleep(2)

async def main():
    async for tweet in get_tweets():
        print(tweet)

# Run the main coroutine with asyncio.run()
asyncio.run(main())

This is my code, but it has problems. On Windows it stays stuck refreshing and waiting for push_notifications, while on Linux I get:

root@iZj6cdwmz8vygnraxyrk3mZ:/www/wwwroot/xwebsocket# python3 app.py
running
Traceback (most recent call last):
  File "app.py", line 72, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.8/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "app.py", line 68, in main
    async for tweet in get_tweets():
  File "app.py", line 42, in get_tweets
    driver = webdriver.Chrome(options=chrome_options)
  File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/chrome/webdriver.py", line 45, in __init__
    super().__init__(
  File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/chromium/webdriver.py", line 56, in __init__
    super().__init__(
  File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/webdriver.py", line 205, in __init__
    self.start_session(capabilities)
  File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/webdriver.py", line 289, in start_session
    response = self.execute(Command.NEW_SESSION, caps)["value"]
  File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/webdriver.py", line 344, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/errorhandler.py", line 229, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: session not created: Chrome failed to start: exited normally.
  (session not created: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Stacktrace:
#0 0x555c77fe4fb3 <unknown>
#1 0x555c77cb84a7 <unknown>
#2 0x555c77cebc93 <unknown>
#3 0x555c77ce810c <unknown>
#4 0x555c77d2aac6 <unknown>
#5 0x555c77d21713 <unknown>
#6 0x555c77cf418b <unknown>
#7 0x555c77cf4f7e <unknown>
#8 0x555c77faa8d8 <unknown>
#9 0x555c77fae800 <unknown>
#10 0x555c77fb8cfc <unknown>
#11 0x555c77faf418 <unknown>
#12 0x555c77f7c42f <unknown>
#13 0x555c77fd34e8 <unknown>
#14 0x555c77fd36b4 <unknown>
#15 0x555c77fe4143 <unknown>
#16 0x7f635180b609 start_thread

root@iZj6cdwmz8vygnraxyrk3mZ:/www/wwwroot/xwebsocket#

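I made sure that Chrome and chromedriver are installed and that both are the latest versions. I don't know what went wrong, so if possible, please send me the full contents of your .py file. I would also really appreciate any suggestions for improvement. @2-3-5-7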
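
The `DevToolsActivePort file doesn't exist` message in the traceback above is commonly seen when Chrome is launched as root on a server, since Chrome will not start as root with its sandbox enabled. A sketch of the flags often added in that situation follows. This is an assumption about the environment, not a confirmed diagnosis, and the helper names `server_chrome_args` and `build_driver` are hypothetical:

```python
import os
from typing import List


def server_chrome_args(running_as_root: bool) -> List[str]:
    """Return Chrome command-line flags for a headless Linux server."""
    args = [
        "--headless=new",
        # Use /tmp instead of /dev/shm, which is small in many containers.
        "--disable-dev-shm-usage",
    ]
    if running_as_root:
        # Chrome will not start as root with its sandbox enabled.
        args.append("--no-sandbox")
    return args


def build_driver():
    """Create a Chrome driver using the flags above (requires selenium)."""
    # Imported lazily so server_chrome_args() has no third-party dependency.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    is_root = hasattr(os, "geteuid") and os.geteuid() == 0
    for arg in server_chrome_args(is_root):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)


print(server_chrome_args(running_as_root=True))
```

Disabling the sandbox weakens Chrome's isolation, so running the script as a non-root user is the safer choice where that is possible.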

2-3-5-7 commented 1 year ago

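You will have to do the debugging yourself. Honestly, if you need even the code for the turn_on_notification function, I would not recommend writing this yourself; just give up. My code contains a lot of my own stuff, so I can't send it to you.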

TonyRL commented 11 months ago

Closed in 9032153c7de4c9ca189482d495696986e9795106

CXwudi commented 11 months ago

Wowowow, that's big news. I wasn't expecting this issue to be solved within this year. I haven't tried it out yet, but again, much appreciation to the maintainers of this wonderful project.

kpg-anon commented 11 months ago

Closed in 9032153

For anyone else wondering how to implement this, first run this script:

#!/bin/bash

guest_token=$(curl -s -XPOST https://api.twitter.com/1.1/guest/activate.json -H 'Authorization: Bearer AAAAAAAAAAAAAAAAAAAAAFXzAwAAAAAAMHCxpeSDG1gLNLghVe8d74hl6k4%3DRUMF4xAQLsbeBhTSRrCiQpJtxoGWeyHrDb5te2jpGskWDFW82F' | jq -r '.guest_token')

flow_token=$(curl -s -XPOST 'https://api.twitter.com/1.1/onboarding/task.json?flow_name=welcome' \
          -H 'Authorization: Bearer AAAAAAAAAAAAAAAAAAAAAFXzAwAAAAAAMHCxpeSDG1gLNLghVe8d74hl6k4%3DRUMF4xAQLsbeBhTSRrCiQpJtxoGWeyHrDb5te2jpGskWDFW82F' \
          -H 'Content-Type: application/json' \
          -H "User-Agent: TwitterAndroid/10.10.0" \
          -H "X-Guest-Token: ${guest_token}" \
          -d '{"flow_token":null,"input_flow_data":{"flow_context":{"start_location":{"location":"splash_screen"}}}}' | jq -r .flow_token)

curl -s -XPOST 'https://api.twitter.com/1.1/onboarding/task.json' \
          -H 'Authorization: Bearer AAAAAAAAAAAAAAAAAAAAAFXzAwAAAAAAMHCxpeSDG1gLNLghVe8d74hl6k4%3DRUMF4xAQLsbeBhTSRrCiQpJtxoGWeyHrDb5te2jpGskWDFW82F' \
          -H 'Content-Type: application/json' \
          -H "User-Agent: TwitterAndroid/10.10.0" \
          -H "X-Guest-Token: ${guest_token}" \
          -d "{\"flow_token\":\"${flow_token}\",\"subtask_inputs\":[{\"open_link\":{\"link\":\"next_link\"},\"subtask_id\":\"NextTaskOpenLink\"}]}" | jq -c -r '.subtasks[0]|if(.open_account) then {oauth_token: .open_account.oauth_token, oauth_token_secret: .open_account.oauth_token_secret} else empty end'

It will output a guest account token and token secret like this:

{"oauth_token":"1719213587296620928-BsXY2RIJEw7fjxoNwbBemgjJhueK0m","oauth_token_secret":"N0WB0xhL4ng6WTN44aZO82SUJjz7ssI3hHez2CUhTiYqy"}

Then add this to your .env file:

TWITTER_OAUTH_TOKEN=1719213587296620928-BsXY2RIJEw7fjxoNwbBemgjJhueK0m
TWITTER_OAUTH_TOKEN_SECRET=N0WB0xhL4ng6WTN44aZO82SUJjz7ssI3hHez2CUhTiYqy

and restart RSSHub 👍
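
The copy step between the script output and the .env file can be scripted. A minimal sketch, assuming the script printed a single JSON object with the two keys shown above; the function name `env_lines` and the placeholder values are hypothetical:

```python
import json


def env_lines(script_output: str) -> str:
    """Turn the JSON printed by the script into .env assignments."""
    if not script_output.strip():
        # The jq filter prints nothing when no account was opened.
        raise ValueError("script produced no token; the request was probably rejected")
    tokens = json.loads(script_output)
    return (
        f"TWITTER_OAUTH_TOKEN={tokens['oauth_token']}\n"
        f"TWITTER_OAUTH_TOKEN_SECRET={tokens['oauth_token_secret']}\n"
    )


# Placeholder values, not real credentials.
sample = '{"oauth_token":"EXAMPLE-TOKEN","oauth_token_secret":"EXAMPLE-SECRET"}'
print(env_lines(sample), end="")
```
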

CXwudi commented 11 months ago

Ahh, I see, so it's a similar idea to nitter's.

For everyone: it is better to automate the script with cron to refresh the guest token regularly, as the guest token does not last long (approximately 1 month?).
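
A scheduled job would need some rule for deciding when the token is due for renewal. A minimal sketch of such an age check, assuming the roughly one-month lifetime guessed above; the 25-day threshold and the names `REFRESH_AFTER` and `needs_refresh` are hypothetical:

```python
from datetime import datetime, timedelta, timezone

# Assumption: renew a few days before the guessed one-month lifetime ends.
REFRESH_AFTER = timedelta(days=25)


def needs_refresh(obtained_at: datetime, now: datetime) -> bool:
    """Return True once the token is at least REFRESH_AFTER old."""
    return now - obtained_at >= REFRESH_AFTER


obtained = datetime(2023, 11, 1, tzinfo=timezone.utc)
print(needs_refresh(obtained, datetime(2023, 11, 10, tzinfo=timezone.utc)))
print(needs_refresh(obtained, datetime(2023, 11, 27, tzinfo=timezone.utc)))
```
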

ghost commented 11 months ago

Can the script automatically run itself every time the TWITTER_OAUTH_TOKEN fails?

DIYgod commented 11 months ago

No, it cannot be obtained automatically. Twitter has very strict restrictions on obtaining this token. You can refer to this wiki and this issue for details.

CXwudi commented 11 months ago

Sorry for the confusion. Yes, bulk creation of guest accounts (using one IP) is also not possible.

dtlnor commented 11 months ago

I have set these parameters in .env:

CACHE_EXPIRE=300
CACHE_CONTENT_EXPIRE=3600

I've subscribed to only 4 Twitter user timelines, but I'm still hitting the rate limit (HTTPError: Response code 429 (Too Many Requests)). I guess it's because of the "rapid requests"? Is there any way to solve it?
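
As a rough sanity check on those settings, the average upstream load can be bounded with simple arithmetic. The sketch below assumes each subscribed route is refreshed upstream at most once per `CACHE_EXPIRE` seconds and that one refresh costs a fixed number of API calls; both are assumptions, and the per-refresh call count is not stated in this thread:

```python
def max_upstream_calls_per_hour(feeds: int, cache_expire_s: int, calls_per_refresh: int = 1) -> int:
    """Upper bound on upstream calls per hour when every cache expiry triggers a refresh."""
    refreshes_per_feed = 3600 // cache_expire_s
    return feeds * refreshes_per_feed * calls_per_refresh


# 4 timelines with CACHE_EXPIRE=300: 4 * 12 = 48 refreshes per hour at most.
print(max_upstream_calls_per_hour(feeds=4, cache_expire_s=300))
```

A bound this low suggests that the average request rate alone may not explain the 429, which would be consistent with the guess about bursts of rapid requests.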