RSS-Bridge / rss-bridge

The RSS feed for websites missing it
https://rss-bridge.org/bridge01/
The Unlicense

[InstagramEmbedBridge] Add a new Instagram Bridge based on the embed url #4060

Open sysadminstory opened 5 months ago

sysadminstory commented 5 months ago

This bridge is a simplified version of InstagramBridge that is based on the embed page of a profile page.

It supports only the Username mode and is limited to the last 6 media items.

At least for now, it does not need any cookies and works from a server IP without limitation.

github-actions[bot] commented 5 months ago

Pull request artifacts

Bridge Context Status
InstagramEmbed 1 Username (pr) ✔️

last change: Thursday 2024-04-04 21:40:30

Mynacol commented 5 months ago

Nice job. The feed works fine, but I have some issues with the media links.

The embedded images/videos aren't displayed because Instagram sends the cross-origin-resource-policy: same-origin header, which leads all browsers to block the request. Clicking the non-direct media link (ending with /media?size=l) doesn't work either and shows a "page not found" error, whereas opening the same URL in a new window shows the desired image. The reason clicking the link directly doesn't work seems to be the Sec-Fetch-Site: cross-site header, which tells Instagram that the request originates from another origin. Instagram then returns an HTTP 404 error.

Finally, the direct media links work, but produce a "URL signature mismatch" sometimes/after some time, which is unfortunately expected.
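
The cross-origin-resource-policy behaviour described above can be sketched roughly as follows. This is a simplified Python model of the check a browser performs, not code from the bridge (which is PHP); the function names and both example URLs are hypothetical, and the same-site case is deliberately left unmodelled because it needs a public suffix list.

```python
from urllib.parse import urlsplit

# Default ports, so that https://host and https://host:443 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def corp_allows(policy, page_url, resource_url):
    """Simplified model: may a page embed a resource sent with this policy?"""
    policy = policy.strip().lower()
    if policy == "same-origin":
        # Only documents from exactly the same origin may load the resource.
        return origin(page_url) == origin(resource_url)
    if policy == "same-site":
        # Needs registrable-domain logic, which this sketch does not model.
        raise NotImplementedError("same-site is not modelled in this sketch")
    # "cross-origin" and unrecognised values do not block the load.
    return True
```

With a feed reader served from a hypothetical https://rss.example.org/ and a media URL on https://www.instagram.com/, `corp_allows("same-origin", ...)` returns False, which matches the blocked images reported here.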

dvikan commented 5 months ago

I would prefer merging this change into the existing instagram bridge so that current feeds start working again.

If it's too much work, I'm OK with merging this.

I am guessing that the embed also has some form of rate limits though...

Bockiii commented 5 months ago

Fully agree with you, dvikan. Instagram has always been one of the main problem bridges. If this can at least restore parts of the functionality, it will probably help 99% of the users of the old bridge, and all of the Instagram issues can be closed.

sysadminstory commented 5 months ago

I can try to merge this change into the existing Instagram bridge, but I have some questions about the merge:

  1. Should I add a new specific context to create a "use embed instagram" feed?
  2. Should I use the "embed instagram" method as a fallback when direct access to Instagram is not possible?
  3. Should I add a parameter to the existing context?

In the case of question 2, how should the feed reader be informed that the method used cannot display some media types (stories / reels / whatever they call them)? In the case of question 3, how should I inform the user that choosing "use embed instagram as source" can limit the media they will get?

sysadminstory commented 4 months ago

@Bockiii @dvikan What's your opinion? :)

Mynacol commented 4 months ago

I'd opt for option 2. Use it as a fallback/alternative for the existing methods. That makes it usable for existing feeds and avoids exposing new options users might not be able to grasp. Regarding informing users about reduced media formats: I'd just add some text in the bridge description or info boxes, but never into the feed. The exact design is still up for debate.

Bockiii commented 4 months ago

With regard to technical debt, I always prefer to just do one thing one time. The fallback option has the benefit that users won't have to change anything, but it adds a new layer to an already complex bridge, which makes it harder to maintain or grasp for possible maintainers.

If you can manage to make it a sensible failover, go for option 2 I guess. I don't think an extra bridge or marker would benefit anyone.
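
For illustration only, the failover discussed as option 2 might look something like the sketch below. It is written in Python rather than the bridge's PHP, and every name in it (`FetchError`, `collect_items`, the two fetcher callables) is hypothetical; the only detail taken from this thread is that the embed page yields at most the last 6 media items.

```python
from typing import Callable, List, Tuple

EMBED_LIMIT = 6  # the embed page only exposes the last 6 media items


class FetchError(Exception):
    """Raised by a fetcher when its source cannot be used (hypothetical)."""


def collect_items(
    username: str,
    primary: Callable[[str], List[dict]],
    fallback: Callable[[str], List[dict]],
) -> Tuple[List[dict], str]:
    """Try the primary method first and use the embed method only on failure.

    Returns the items together with the name of the method that produced
    them, so the caller can decide how to tell the user about the reduced
    media coverage (for example in the bridge description).
    """
    try:
        return primary(username), "primary"
    except FetchError:
        # Existing feeds keep working without any change on the user's side.
        return fallback(username)[:EMBED_LIMIT], "embed"
```

Keeping the fallback behind a single function like this is one way to limit the extra layer of complexity mentioned above, since the rest of the bridge would not need to know which method was used.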

TReKiE commented 2 weeks ago

I've had this on my instance for a while working without issue, but it broke today. It's easily fixed by editing the regex on line 53: https://github.com/RSS-Bridge/rss-bridge/blob/d4618e8a991945b6c50f4f2b46f5551d8451129c/bridges/InstagramEmbedBridge.php#L53

For some reason, some weird Unicode is being added after NavigationMetrics and before the closing ", which breaks the match. The simple fix seems to be to remove the trailing " from the pattern, as it will still match and parse correctly, i.e. $regex = '#"contextJSON":"(.*)"}\]\],\["NavigationMetrics#m';
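
The effect of dropping the trailing quote can be reproduced with a small Python sketch. It assumes the pattern on line 53 previously ended with NavigationMetrics", as the fix above implies. The sample string is invented, and the zero-width space (U+200B) is only a stand-in for the unexpected characters, whose exact identity is not given here.

```python
import re

# Assumed shape of the earlier pattern: it requires a quote right after
# NavigationMetrics.
strict = re.compile(r'"contextJSON":"(.*)"}\]\],\["NavigationMetrics"', re.MULTILINE)

# Relaxed pattern from the fix above: the trailing quote is dropped.
relaxed = re.compile(r'"contextJSON":"(.*)"}\]\],\["NavigationMetrics', re.MULTILINE)

# Invented sample: a hypothetical stray character (U+200B) sits between
# NavigationMetrics and the closing quote.
sample = '{"contextJSON":"{\\"gql_data\\":null}"}]],["NavigationMetrics\u200b",[]]'

strict_match = strict.search(sample)
relaxed_match = relaxed.search(sample)

print(strict_match)            # None: the stray character breaks the match
print(relaxed_match.group(1))  # the still-escaped JSON payload
```

Because the capture group ends before NavigationMetrics, removing the quote does not change what is captured; it only stops the pattern from depending on what follows that name.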