derat / nitter-rss-proxy

Moved to codeberg.org/derat/nitter-rss-proxy
https://codeberg.org/derat/nitter-rss-proxy
BSD 3-Clause "New" or "Revised" License

Broken /i/web/ URLs #5

Closed: derat closed this issue 1 year ago

derat commented 1 year ago

It looks like Twitter sometimes uses /i/web/status/<id> URLs, and the proxy rewrites them incorrectly.

When I look at a retweet of https://twitter.com/_x_takes_/status/1612900253677621251 in Feedly, the item contains the broken link https://nitter.net/twitter.com/web/status/1612900253677621251 along with the first image from the original tweet.

The original tweet looks like this in https://nitter.mask.sh/_x_takes_/rss:

    <item>
      <title>twitter.com/i/web/status/161…</title>
      <dc:creator>@_x_takes_</dc:creator>
      <description><![CDATA[<p><a href="https://nitter.net/i/web/status/1612900253677621251">nitter.net/i/web/status/161…</a></p>
<img src="https://nitter.mask.sh/pic/media%2FFmIsDwJWQAAJWl8.jpg" style="max-width:250px;" />]]></description>
      <pubDate>Tue, 10 Jan 2023 19:52:41 GMT</pubDate>
      <guid>https://nitter.mask.sh/_x_takes_/status/1612900253677621251#m</guid>
      <link>https://nitter.mask.sh/_x_takes_/status/1612900253677621251#m</link>
    </item>

It seems like some Nitter instances use their own hostnames in these URLs rather than nitter.net, but the URL structure is otherwise the same.

Here's what it looks like after the proxy converts it to a JSON feed (this is from a different Nitter instance that uses its own hostname):

    {
      "id": "https://twitter.com/_x_takes_/status/1612900253677621251",
      "url": "https://twitter.com/_x_takes_/status/1612900253677621251",
      "title": "twitter.com/i/web/status/161…",
      "content_html": "\u003cp\u003e\u003ca href=\"https://nitter.bird.froth.zone/twitter.com/web/status/1612900253677621251\"\u003enitter.bird.froth.zone/twitter.com/web/status/161…\u003c/a\u003e\u003c/p\u003e\u003cbr\u003e\u003cimg src=\"https://pbs.twimg.com/media/FmIsDwJWQAAJWl8?format=jpg\" style=\"max-width:250px;\" /\u003e",
      "summary": "twitter.com/i/web/status/161…",
      "date_published": "2023-01-10T19:52:41Z",
      "author": {
        "name": "@_x_takes_"
      }
    },

Nitter itself may not handle Twitter's odd /i/web/ URLs properly: https://nitter.net/i/web/status/1612900253677621251 returns an error, while https://twitter.com/i/web/status/1612900253677621251 seems to produce the same content as https://twitter.com/_x_takes_/status/1612900253677621251.
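For illustration, here is a rough sketch of how links of this shape could be special-cased before any generic rewriting runs. It is not the proxy's actual code: `iwebRegexp` and `rewriteIWebURL` are hypothetical names, and the sketch assumes that mapping any host's /i/web/status/<id> link straight to twitter.com (keeping the /i/web/ path, since the username isn't present in the URL) is the desired result.

```go
package main

import (
	"fmt"
	"regexp"
)

// iwebRegexp matches /i/web/status/<id> links on any host (nitter.net or an
// instance's own hostname), optionally followed by a query string or fragment.
// Hypothetical pattern for illustration only.
var iwebRegexp = regexp.MustCompile(`^https?://[^/]+/i/web/status/(\d+)(?:[?#].*)?$`)

// rewriteIWebURL is a hypothetical helper. If u is an /i/web/status/<id> link,
// it returns the equivalent twitter.com URL and true. Otherwise it returns u
// unchanged and false so the caller can fall back to its usual rewriting.
func rewriteIWebURL(u string) (string, bool) {
	m := iwebRegexp.FindStringSubmatch(u)
	if m == nil {
		return u, false
	}
	return "https://twitter.com/i/web/status/" + m[1], true
}

func main() {
	for _, u := range []string{
		"https://nitter.net/i/web/status/1612900253677621251",
		"https://nitter.mask.sh/_x_takes_/status/1612900253677621251#m",
	} {
		out, ok := rewriteIWebURL(u)
		fmt.Printf("matched=%v url=%s\n", ok, out)
	}
}
```

Checking for this pattern first would keep the "i" path segment from being treated like a username by whatever rule handles ordinary /<user>/status/<id> links.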

derat commented 1 year ago

I take that back; https://nitter.net/i/web/status/1612900253677621251 is functional, but it returns a "Tweet not found" error about half the time that I try to load it. :-/