RipMeApp / ripme

Downloads albums in bulk
MIT License

Redgifs error #1994

Open Ihatetotank opened 2 years ago

Ihatetotank commented 2 years ago

Expected Behavior

Expected it to download the MP4

Actual Behavior

It seems the query string may have changed; all Redgifs links error in the same way.

CypherpunkSamurai commented 2 years ago

Hello @Ihatetotank, Can you please provide the post link instead of the redgifs link?

The redgifs link is broken

Ihatetotank commented 2 years ago

Hello @Ihatetotank, Can you please provide the post link instead of the redgifs link?

The redgifs link is broken

Try this: https://www.reddit.com/r/stripgirls/comments/x3jl6p/dripping_wet_for_you/

Links to https://redgifs.com/watch/meekidiotickatydid

RipMe logged the link as: https://thumbs4.redgifs.com/MeekIdioticKatydid-mobile.mp4?expires=1662076800&signature=6b0bd38c8f9433b4ea376ac90ed3a5de9c455a0052db33ed59da6aafcd1a1eca&for=xx.xx.xx.xx

Errors with: Query string parameter signature is missing.

Ihatetotank commented 2 years ago

Update: Viewing the source, it looks like the actual MP4 is this URL: https://thumbs4.redgifs.com/MeekIdioticKatydid-mobile.mp4?expires=1662075000&signature=18523514ffeddddbd9eef32c033ae9c0400fa795af0e20118afa22314a510ff1&for=xx.xx.xx.xx#t=0

Compared to the RipME URL: https://thumbs4.redgifs.com/MeekIdioticKatydid-mobile.mp4?expires=1662076800&signature=6b0bd38c8f9433b4ea376ac90ed3a5de9c455a0052db33ed59da6aafcd1a1eca&for=xx.xx.xx.xx

Obviously different signature and missing the #t=0
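A small sketch of how the two URLs above could be compared programmatically. The class and method names (`SignedUrlDiff`, `queryParams`) are made up for illustration and are not part of RipMe; the URLs are the ones quoted in this comment.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not RipMe code: compares the query parameters of two signed URLs.
final class SignedUrlDiff {
    /** Splits the raw query string into name/value pairs (no percent-decoding). */
    static Map<String, String> queryParams(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(url).getRawQuery();
        if (query == null) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                params.put(pair, "");
            } else {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    public static void main(String[] args) {
        String fromPage = "https://thumbs4.redgifs.com/MeekIdioticKatydid-mobile.mp4"
                + "?expires=1662075000"
                + "&signature=18523514ffeddddbd9eef32c033ae9c0400fa795af0e20118afa22314a510ff1"
                + "&for=xx.xx.xx.xx#t=0";
        String fromRipMe = "https://thumbs4.redgifs.com/MeekIdioticKatydid-mobile.mp4"
                + "?expires=1662076800"
                + "&signature=6b0bd38c8f9433b4ea376ac90ed3a5de9c455a0052db33ed59da6aafcd1a1eca"
                + "&for=xx.xx.xx.xx";
        Map<String, String> a = queryParams(fromPage);
        Map<String, String> b = queryParams(fromRipMe);
        // Report, per parameter, whether the two URLs agree.
        for (String key : a.keySet()) {
            boolean same = a.get(key).equals(b.get(key));
            System.out.println(key + ": " + (same ? "same" : "differs"));
        }
        System.out.println("fragment on page URL: " + URI.create(fromPage).getFragment());
    }
}
```

Note that `expires` differs as well as `signature`, which would be consistent with the signature being computed over the expiry time, though the thread does not confirm how Redgifs derives it.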

CypherpunkSamurai commented 2 years ago

Obviously different signature and missing the #t=0

I think it's mostly the &amp;. The Redgifs API requests the JSON using the gif id (meekidiotickatydid), with a valid bearer cookie.

https://api.redgifs.com/v2/gifs/meekidiotickatydid?views=yes&users=yes

The URLs it returns don't have #t=0 or &amp;.
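If the HTML-escaped ampersand is indeed the culprit, the cleanup before requesting the media could be as small as the sketch below. `ContentUrlCleaner` is a hypothetical name, and the short signature value is a placeholder rather than a real one.

```java
// Hypothetical helper, not RipMe code: undoes HTML escaping of '&' in a scraped URL.
final class ContentUrlCleaner {
    /** Replaces every HTML-escaped ampersand with a plain '&'. */
    static String unescapeAmpersands(String url) {
        return url.replace("&amp;", "&");
    }

    public static void main(String[] args) {
        // Placeholder signature; a real one is a 64-character hex string.
        String scraped = "https://thumbs4.redgifs.com/MeekIdioticKatydid-mobile.mp4"
                + "?expires=1662076800&amp;signature=PLACEHOLDER&amp;for=xx.xx.xx.xx";
        System.out.println(unescapeAmpersands(scraped));
    }
}
```

With the escaped form, the server would see a parameter literally named `amp;signature`, which fits the "Query string parameter signature is missing" error reported earlier.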

CypherpunkSamurai commented 2 years ago

The problem is that the code reads the contentUrl JSON value, which has a wrong signature. Changing the CDN to thumbs3 instead of thumbs4 also works for now. We can also use v1 of the API to get valid signatures: https://api.redgifs.com/v1/gifs/meekidiotickatydid This v1 API is being deprecated, as mentioned at https://github.com/Redgifs/api/wiki/Requesting-media-links

I found out that they need you to register for Bearer tokens. The bearer token for their web app is just:

eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiIxODIzYzMxZjdkMy03NDVhLTY1ODktMDAwNS1kOGU4ZmUwYTQ0YzIiLCJleHAiOjE2NjM2NzczMjEsInN1YiI6ImNsaWVudFwvMTgyM2MzMWY3ZDMtNzQ1YS02NTg5LTAwMDUtZDhlOGZlMGE0NGMyIiwic2NvcGVzIjoicmVhZCIsInJhdGUiOi0xfQ.uod5xbnhPbdlOb5UTdHPFls3ZtwxKOjkHtN2rpU8h6Y

which is the base64 encoding of the following, with .uod5xbnhPbdlOb5UTdHPFls3ZtwxKOjkHtN2rpU8h6Y appended to the end:

{"typ":"JWT","alg":"HS256"}{"iss":"1823c31f7d3-745a-6589-0005-d8e8fe0a44c2","exp":1663677321,"sub":"client\/1823c31f7d3-745a-6589-0005-d8e8fe0a44c2","scopes":"read","rate":-1}
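The decoding above can be reproduced with the JDK's base64url decoder. This is a minimal sketch: `JwtPeek` and `decodePart` are names invented here, and the token is the (long expired) one quoted in this comment. It only reads the token and does not verify the signature.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not RipMe code: prints the readable parts of a JWT.
final class JwtPeek {
    // Token copied from this thread; its exp claim is in September 2022.
    static final String TOKEN =
            "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9"
            + ".eyJpc3MiOiIxODIzYzMxZjdkMy03NDVhLTY1ODktMDAwNS1kOGU4ZmUwYTQ0YzIiLCJleHAiOjE2NjM2NzczMjEsInN1YiI6ImNsaWVudFwvMTgyM2MzMWY3ZDMtNzQ1YS02NTg5LTAwMDUtZDhlOGZlMGE0NGMyIiwic2NvcGVzIjoicmVhZCIsInJhdGUiOi0xfQ"
            + ".uod5xbnhPbdlOb5UTdHPFls3ZtwxKOjkHtN2rpU8h6Y";

    /** Decodes part 0 (header) or 1 (payload) of a header.payload.signature token. */
    static String decodePart(String token, int index) {
        String[] parts = token.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected header.payload.signature");
        }
        // JWT parts are base64url without padding; the JDK decoder accepts that.
        byte[] raw = Base64.getUrlDecoder().decode(parts[index]);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decodePart(TOKEN, 0));
        System.out.println(decodePart(TOKEN, 1));
    }
}
```

The third part is the signature and is binary once decoded, so it is not printed as text.
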
Ihatetotank commented 2 years ago

So what does that mean for the future of ripping RedGifs with RipMe? Is there a way to edit it so it works with the new API?

CypherpunkSamurai commented 2 years ago

I'm new to this project, I'll try opening a PR to help fix this.

It will work, yes

GarethFreeman commented 2 years ago

I'm having the same problem. What's the solution?

CypherpunkSamurai commented 2 years ago

The current CI runs are failing; I don't think the source code is in a working condition yet.

https://github.com/RipMeApp/ripme/actions

I'll open a PR once the source code is fixed

Edit: Or I could make a new project

GarethFreeman commented 2 years ago

Well, thanks. Keep me updated.

Zemur11 commented 2 years ago

There is some discussion on a solution here https://www.reddit.com/r/DataHoarder/comments/x67zfo/redgifs_api_now_blocked_from_hotlinking_anyone/

Ihatetotank commented 2 years ago

This is great info!

On Mon, Sep 12, 2022, 7:28 PM Zemur11 @.***> wrote:

There is some discussion on a solution here https://www.reddit.com/r/DataHoarder/comments/x67zfo/redgifs_api_now_blocked_from_hotlinking_anyone/


scryptio commented 1 year ago

@CypherpunkSamurai Try it at the fork instead, main is dead -> https://github.com/ripmeapp2/ripme

CypherpunkSamurai commented 1 year ago

I just checked and contrary to the reddit post, the token I used is still working. Try here

Also even if it won't work we know how to create tokens (as mentioned above), we can just create one on the fly without any account.

I think this would work. Will draft a PR soon cool 👍🏼

Ihatetotank commented 1 year ago

Thank you @CypherpunkSamurai !

CypherpunkSamurai commented 1 year ago

Correction: the token seems to be a JWT, so we need the signing key to create one. The web UI uses a client ID, but I can't find the secret, so we are left with finding the token.

The token can be located in https://www.redgifs.com/assets/js/index.0a3f050b.js if you search for eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9

AAndyProgram commented 1 year ago

@CypherpunkSamurai the token cannot be created! It is updated on every request!

Cookies are valid for a very short term. The bearer token refreshes every request.

which if the base64 of with .uod5xbnhPbdlOb5UTdHPFls3ZtwxKOjkHtN2rpU8h6Y appended to the end:

{"typ":"JWT","alg":"HS256"}{"iss":"1823c31f7d3-745a-6589-0005-d8e8fe0a44c2","exp":1663677321,"sub":"client\/1823c31f7d3-745a-6589-0005-d8e8fe0a44c2","scopes":"read","rate":-1}

Okay, but where do you propose to find these credentials? That's the first question. Secondly, the middle part of the token is also updated with each request!

In

eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiIxODIzYzMxZjdkMy03NDVhLTY1ODktMDAwNS1kOGU4ZmUwYTQ0YzIiLCJleHAiOjE2NjM2NzczMjEsInN1YiI6ImNsaWVudFwvMTgyM2MzMWY3ZDMtNzQ1YS02NTg5LTAwMDUtZDhlOGZlMGE0NGMyIiwic2NvcGVzIjoicmVhZCIsInJhdGUiOi0xfQ.uod5xbnhPbdlOb5UTdHPFls3ZtwxKOjkHtN2rpU8h6Y

the middle part is eyJpc3MiOiIxODIzYzMxZjdkMy03NDVhLTY1ODktMDAwNS1kOGU4ZmUwYTQ0YzIiLCJleHAiOjE2NjM2NzczMjEsInN1YiI6ImNsaWVudFwvMTgyM2MzMWY3ZDMtNzQ1YS02NTg5LTAwMDUtZDhlOGZlMGE0NGMyIiwic2NvcGVzIjoicmVhZCIsInJhdGUiOi0xfQ

CypherpunkSamurai commented 1 year ago

the token cannot be created! It is updated on every request!

Yeah figured that out in the last comment, lol.

Okay but where do you propose to find these credentials?

When you load the redgifs webpage, right click and view-source. You will find an assets/js/index_____.js. (Reply with the file name, just to check.)

the middle part is

The middle part of a JWT changes because its values contain timestamps (epoch time). We can create the middle portion, but the last portion is the hash (an HMAC-SHA256 signature, per the HS256 header), which verifies the whole token. Unless we can crack the token key using john or hashcat, we won't be able to generate tokens.
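To make the point about the last portion concrete, here is a sketch of how an HS256 signature is computed in general (this is the standard JWT scheme, not anything specific to Redgifs). `Hs256Sketch`, `sign`, and both secrets are invented for illustration; the real key is unknown.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch, not RipMe code: the generic HS256 signing step of a JWT.
final class Hs256Sketch {
    /** Returns base64url(HMAC-SHA256(secret, "header.payload")) without padding. */
    static String sign(String headerDotPayload, String secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] sig = mac.doFinal(headerDotPayload.getBytes(StandardCharsets.US_ASCII));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(sig);
        } catch (GeneralSecurityException e) {
            // HmacSHA256 is a mandatory JDK algorithm, so this is not expected in practice.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Header taken from the thread's token; the payload here is a shortened stand-in.
        String signingInput = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzY29wZXMiOiJyZWFkIn0";
        // Two made-up secrets: the same input yields unrelated signatures.
        System.out.println(sign(signingInput, "guess-one"));
        System.out.println(sign(signingInput, "guess-two"));
    }
}
```

Anyone can build the header and payload, but without the server's secret the resulting signature will not match, which is why the thread falls back to scraping an existing token.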

What is the solution, you ask? We use a regex to parse the token from the JS file every time. I already wrote the token fetcher, I just don't have time to rewrite the whole RedGifs module. None of the code works after redgifs made their changes; the whole module needs a rewrite.

AAndyProgram commented 1 year ago

you will find a assets/js/index_____.js. (reply with the file name, just for check)

I can't find the token in this file.

We use regex to parse the token from the js file Everytime.

Can you write this regex here?

I already wrote the token fetcher

Excuse me, but where? Maybe I missed something?

None of the code works after redgifs added changes, the whole module needs a rewrite.

I'm confused... So is there a solution or not?

CypherpunkSamurai commented 1 year ago

I was kinda busy, so I couldn't finish rewriting the whole module. Here's the code; it needs a refactor, by the way:

import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.jsoup.nodes.Document;

// Http, LOGGER, HOST and AUTH_TOKEN come from the surrounding RipMe ripper class.
public static void getToken() throws IOException {
        LOGGER.info("Fetching AUTH_TOKEN for " + HOST);

        // Reuse the cached token. Strings must be compared by content, not with !=.
        if (AUTH_TOKEN != null && !AUTH_TOKEN.isEmpty()) {
            LOGGER.info("AUTH_TOKEN already exists. AUTH_TOKEN is " + AUTH_TOKEN);
            return;
        }

        // Find the name of the bundled script (e.g. index.0a3f050b.js) in the home page.
        Document h = Http.url("https://" + HOST).ignoreContentType().get();
        Matcher hm = Pattern.compile("index\\.[0-9A-Za-z]+\\.js").matcher(h.toString());
        if (!hm.find()) {
            return;
        }
        String i = hm.group();

        // Download the script and take the first JWT-shaped string (three dot-separated
        // base64url parts starting with eyJ0) as the token.
        Document j = Http.url("https://www." + HOST + "/assets/js/" + i)
                .header("User-Agent", "Mozilla (Webkit)")
                .header("Referer", "https://" + HOST)
                .ignoreContentType()
                .get();
        Matcher jm = Pattern.compile("eyJ0[\\w-]+\\.[\\w-]+\\.[\\w-]+").matcher(j.toString());
        if (jm.find()) {
            AUTH_TOKEN = jm.group();
            LOGGER.info("Authtoken: " + AUTH_TOKEN);
        }
    }
AAndyProgram commented 1 year ago

Thanks for your time. I also need time to test your code. I don't know java, so sorry for stupid questions.

Are Document h and Document j HtmlDocument objects, or what?

If the token is refreshed on every request, doesn't the request to get the token also cause the token to be refreshed?

AAndyProgram commented 1 year ago

Actually, I really doubt that this bearer token is a required credential (at least not a basic one). It is not possible to work with a token that is updated on every request. It doesn't make sense.

If the token is refreshed on every request, the request to get token will cause the token to be refreshed. So in this case you will need to ask the site for a new token, which will cause the token to be refreshed again.

If the token is valid for a particular file signature, it's impossible to support that many "token-signature" connections.

I think there is some "key" thing that allows you to bypass authorization. But I don't know what exactly.

AAndyProgram commented 1 year ago

Hm... RedGifs seems to still work. RedGifs media posted on Reddit download without problems. The authorization error only occurs when loading a RedGifs profile. It can't get data for request https://api.redgifs.com/v2/users/UserName/search?order=recent

CypherpunkSamurai commented 1 year ago

If the token is refreshed on every request

If you look, the AUTH_TOKEN variable is accessed from a static context, which means it's kept in a static field. It's not updated on every request.

That's because it will be fetched only once. I have yet to rewrite the whole module, as the regexes don't work, the jsoup extractors don't work, and it's confusing.

CypherpunkSamurai commented 1 year ago

RedGifs media posted on Reddit download without problems.

Reddit has first-class support; the Reddit JSON contains direct links to the MP4 and fallback URLs most of the time.

CypherpunkSamurai commented 1 year ago

Document h and Document j is HtmlDocument or what?

Document h is the redgifs website HTML page; j is the index_.js file referenced from that HTML. It contains the token.

CypherpunkSamurai commented 1 year ago

I think there is something "key" that allows you bypass authorization. But I don't know what exactly.

Using the older API bypasses authentication. I think Reddit is still using the old API. The new v2 API requires the client to be valid.

AAndyProgram commented 1 year ago

It's not updated on every request.

No, it is. It's updated on every profile and/or post request.

Reddit has 1st class support, reddit json contains direct links to mp4 and fallback urls most of the times.

I didn't see RedGifs mp4 in Reddit responses.


the regex don't work the jsoup extractors don't work, and it's confusing.

and

It contains token.

What's that supposed to mean? If you don't have a solution, why did you post here as if you fixed something?!

Using older API bypasses authentication. Reddit I think is still using old API. The new V2 api requires client to be valid.

Again. I didn't see RedGifs mp4 in Reddit responses. My program parses RedGifs URLs posted on Reddit using the RedGifs v2 API! I posted a particular request that doesn't work. Requests to get posts are working fine.

CypherpunkSamurai commented 1 year ago

Hi Andy, please note that the API I mentioned and reverse engineered is at least a few months old; their API has changed drastically. Even your program got broken because of it.

I shared what i found here for future reference. If you can rewrite the module yourself in java be my guest :D

AAndyProgram commented 1 year ago

Even your program has got broken because of it.

All RedGifs parsers broke when they did that.

Before shouting and appealing know it all, please check that the api i mentioned

Did you even see what you wrote? You posted the code and a few messages later said that your regex doesn't work anymore, your code doesn't work...

What should I test if you post broken things?! Either your code works or it doesn't! It's simple. Don't mislead users and don't waste our time.

I'm having trouble getting an array of user data. A specific post can still be parsed using cookies and a token (https://api.redgifs.com/v2/gifs/PostID?views=yes&users=yes)

CypherpunkSamurai commented 1 year ago

All RedGifs parsers broke when they did that.

By "all" you mean your parser broke, because I didn't see anyone using the v2 API with a token recently, until the Reddit post and until someone mentioned this issue in your repo.

Did you even see what you wrote? You posted the code and a few messages later said that your regex doesn't work anymore, your code doesn't work...

The existing repo code (latest commit) doesn't work anymore, not my code. You're missing the context.

Either your code works or it doesn't! It's simple. Don't mislead users and don't waste our time.

Instead of shouting in someone else's repo issue, you should try to learn Java and write this module yourself, I guess.

AAndyProgram commented 1 year ago

cause i didn't see anyone using the v2 api with token recently

My program works with a token and the v2 API. It just can't get a list of a user's posts. But with a specific post it works!

CypherpunkSamurai commented 1 year ago

This isn't a matter of "my program works, yours doesn't". That's just being a toxic teenager online.

If you've figured it all out, great. Congratulations. Now feel free to learn Java and contribute instead of showing off.

Thank you, Regards

AAndyProgram commented 1 year ago

The conversation is off. You don't see what you wrote. You don't see what I wrote. Where is the toxicity? I said where the problem is (in getting a list of user posts). You are a spammer or a troll.

P.S. I won't learn Java to write code for you. Feel free to learn Visual Basic or C#!

CypherpunkSamurai commented 1 year ago

Coming over to another repo to show off that your program is superior is what a toxic teenager would do. As I said, this is an open-source project; instead of shilling "I fixed it before you guys", you can contribute.

If you can't contribute maybe you should not speak on another repo's issue where no one tagged you and asked your opinion.

Also, P.S. I don't need to learn C#, VB 2006, VB.Net and WPF again to prove my point, as i've learnt it in 2015. Thank you for suggestion, I don't consider your opinions valuable in this thread.

altbdoor commented 1 year ago

FWIW, there seems to be a notion of "temporary tokens" now, as documented in https://github.com/Redgifs/api/wiki/Temporary-tokens

The token provided in the response, is then used as the Authorization: Bearer {{ token }} header, for subsequent API calls to RedGifs.
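As a rough sketch of the header described above, using the JDK's `java.net.http` client types (Java 11+). `BearerRequest` and `gifRequest` are hypothetical names; the endpoint path is the one quoted earlier in this thread, and the token value is a placeholder. The request is only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not RipMe code: builds an API request carrying a bearer token.
final class BearerRequest {
    /** Builds a GET for one gif, authenticated with the given temporary token. */
    static HttpRequest gifRequest(String gifId, String token) {
        URI uri = URI.create("https://api.redgifs.com/v2/gifs/" + gifId + "?views=yes&users=yes");
        return HttpRequest.newBuilder(uri)
                .header("Authorization", "Bearer " + token)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // "TEMPORARY_TOKEN" stands in for whatever the temporary-token call returned.
        HttpRequest request = gifRequest("meekidiotickatydid", "TEMPORARY_TOKEN");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map());
    }
}
```

Sending it would be a matter of passing the request to an `HttpClient`; whether extra headers such as a matching User-Agent are needed is not established in this thread.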

At least in old Reddit UI, that's what it is doing too.

CypherpunkSamurai commented 1 year ago

Yes, I just checked yesterday; the token embedded in the JavaScript is no longer there.

It seems they updated their backend and have a new API endpoint for temporary tokens. The redgifs v1 API URL now returns "route not found". It's completely deprecated now. Their Swagger UI has been updated too.

This temporary endpoint provides a different kind of JWT; here's the JWT decoded. Not only did they add the issued-at time (iat) and the token expiry time (exp), but also the client IP.

Solutions

~1. Weirdly enough, the contentUrl in the JSON embedded in each page's HTML is now working, unlike before, where it would fail to verify the signature (working, provided we remove the amp; strings). We can just revert back to HTML and JSON parsing? (more testing needed)~ [Not working]

  1. We can fetch temporary tokens each time.
  2. We can use login cookies of a throwaway account, and refresh it before session.
  3. Use a GPU to crack the JWT HMAC key? (not a great solution)
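
Option 1 does not have to mean one token request per download. A sketch of a cache that refetches only when the token's `exp` claim is close is shown below. Everything here is hypothetical: `TemporaryTokenCache` is not RipMe code, the `Supplier` stands in for whatever HTTP call obtains a temporary token, and the 60-second margin is an arbitrary choice.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.function.Supplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not RipMe code: caches a JWT until shortly before it expires.
final class TemporaryTokenCache {
    private static final Pattern EXP = Pattern.compile("\"exp\"\\s*:\\s*(\\d+)");
    private static final long MARGIN_SECONDS = 60; // arbitrary safety margin

    private final Supplier<String> fetcher; // assumed to perform the token request
    private String token;
    private long expiresAt; // epoch seconds, read from the token's exp claim

    TemporaryTokenCache(Supplier<String> fetcher) {
        this.fetcher = fetcher;
    }

    /** Reads the exp claim (epoch seconds) from a JWT payload without verifying it. */
    static long expiryOf(String jwt) {
        String[] parts = jwt.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("not a JWT");
        }
        String payload = new String(Base64.getUrlDecoder().decode(parts[1]), StandardCharsets.UTF_8);
        Matcher m = EXP.matcher(payload);
        if (!m.find()) {
            throw new IllegalArgumentException("no exp claim");
        }
        return Long.parseLong(m.group(1));
    }

    /** Returns the cached token, fetching a new one if none is cached or it is about to expire. */
    synchronized String get(long nowEpochSeconds) {
        if (token == null || nowEpochSeconds >= expiresAt - MARGIN_SECONDS) {
            token = fetcher.get();
            expiresAt = expiryOf(token);
        }
        return token;
    }
}
```

A caller would construct it once with its own fetch function and call `get(System.currentTimeMillis() / 1000)` before each API request. Since the tokens reportedly embed the client IP, a cached token would presumably also need refetching after a network change.
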
altbdoor commented 1 year ago

but also the client ip

Yea, at one point they were already using the IP address and user agent to generate the token. The only reference I found about this is in https://old.reddit.com/r/StellarOSX/comments/x3iddd/regarding_redgifs_and_your_privacy_next_update/

This is more or less extra info, since I made a personal proxy to RedGifs.

All the media URLs returned now have a signature in the URL. It appears that this signature is somehow cross-checked with the token, since the proxy's details (e.g., user agent curl, and IP 1.2.3.4) would be different from the browser's, and the media load would fail.

ghost commented 1 year ago

Looks like @SpartanJ was able to fix it in a different project

https://github.com/SpartanJ/ImgurViewer/commit/a8b1c9cdf1f67dce6cb8fb2e5fddb2724713599a

CypherpunkSamurai commented 1 year ago

I hope they have finished making changes to their APIs and that it's stable now. We need to confirm that beforehand, or we'll run into problems like before.