ripmeapp2 / ripme

Downloads albums in bulk
MIT License

RedGIFs downloader broken - new video URLs require fixes\changes to parser #108

Closed: TarybleTexan closed this 2 months ago

TarybleTexan commented 2 years ago

Expected Behavior

Should download the video.

Actual Behavior

Download fails. RedGIFs changed their download system to append several tokens (expires, signature, and for, as seen in the logs below) to the end of the URL. However, stripping off everything in the video URL after "mp4" and then replacing "thumbs4" with "thumbs3" should work.
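The workaround proposed above can be sketched in a few lines of Python. This is only an illustration of the described string surgery, not code from ripme; the function name and the placeholder token values are made up, and later comments in this thread report that the trick stopped working.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_signed_tokens(video_url: str) -> str:
    """Drop everything after ".mp4" and swap the thumbs4 host for thumbs3."""
    parts = urlsplit(video_url)
    host = parts.netloc.replace("thumbs4", "thumbs3", 1)
    # Empty query and fragment: the expires/signature/for tokens are discarded.
    return urlunsplit((parts.scheme, host, parts.path, "", ""))


# Placeholder token values, not a real signed URL.
signed = (
    "https://thumbs4.redgifs.com/GloomyCorruptLabradorretriever.mp4"
    "?expires=1&signature=abc&for=192.0.2.1"
)
print(strip_signed_tokens(signed))
# prints https://thumbs3.redgifs.com/GloomyCorruptLabradorretriever.mp4
```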

Logs from the app window log are below.

Downloading https://www.redgifs.com/watch/gloomycorruptlabradorretriever
Downloading next page
Downloading https://thumbs4.redgifs.com/GloomyCorruptLabradorretriever.mp4?expires=1662087600&signature=bc89104fee6890402b34758c264b997124c04c2af575f1407c7e7850335fdfa1&for=66.68.96.142
https://thumbs4.redgifs.com/GloomyCorruptLabradorretriever.mp4?expires=1662087600&signature=bc89104fee6890402b34758c264b997124c04c2af575f1407c7e7850335fdfa1&for=66.68.96.142 : Non-retriable status code 400 while downloading https://thumbs4.redgifs.com/GloomyCorruptLabradorretriever.mp4?expires=1662087600&signature=bc89104fee6890402b34758c264b997124c04c2af575f1407c7e7850335fdfa1&for=66.68.96.142
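To make the failing URL easier to read, here is a small Python sketch that pulls the three query tokens apart. It assumes that expires is a Unix timestamp and that for is the client IP address the signature is tied to; neither is confirmed by RedGIFs in this thread, and the function name is hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def describe_signed_url(url: str) -> dict:
    """Split a signed video URL into its expiry, signature and IP tokens."""
    query = parse_qs(urlsplit(url).query)
    expires = int(query["expires"][0])
    return {
        # Assumption: "expires" is seconds since the Unix epoch.
        "expires_utc": datetime.fromtimestamp(expires, tz=timezone.utc),
        "signature": query["signature"][0],
        # Assumption: "for" is the IP address the URL was issued to.
        "bound_ip": query["for"][0],
    }


info = describe_signed_url(
    "https://thumbs4.redgifs.com/GloomyCorruptLabradorretriever.mp4"
    "?expires=1662087600&signature=abc&for=66.68.96.142"
)
print(info["expires_utc"].isoformat(), info["bound_ip"])
```

Under that assumption, the expires value in the log decodes to 2022-09-02 03:00 UTC.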

Logs from the log file are below.

Getting key tray.hide in en_US value Hide
Getting key tray.show in en_US value Show
Getting key tray.hide in en_US value Hide
Found album ripper: com.rarchives.ripme.ripper.rippers.RedgifsRipper
Saved configuration to F:\Ripper\rip.properties
Getting key queue in en_US value Queue
Getting key queue in en_US value Queue
Found album ripper: com.rarchives.ripme.ripper.rippers.RedgifsRipper
Using album title 'redgifs_gloomycorruptlabradorretriever'
[+] Creating directory: ..\..\Ripper
Set working directory to: F:\Rips3\redgifs_gloomycorruptlabradorretriever
Initializing Main thread pool with 10 threads
Retrieving https://www.redgifs.com/watch/gloomycorruptlabradorretriever
Trying to load cookies from config for www.redgifs.com
Trying to load cookies from config for redgifs.com
Found image url #1: https://thumbs4.redgifs.com/GloomyCorruptLabradorretriever.mp4?expires=1662087600&signature=bc89104fee6890402b34758c264b997124c04c2af575f1407c7e7850335fdfa1&for=66.68.96.142
url: https://thumbs4.redgifs.com/GloomyCorruptLabradorretriever.mp4?expires=1662087600&signature=bc89104fee6890402b34758c264b997124c04c2af575f1407c7e7850335fdfa1&for=66.68.96.142, subdirectory, referrer: null, cookies: null, prefix: 001_, fileName: null
Downloading https://thumbs4.redgifs.com/GloomyCorruptLabradorretriever.mp4?expires=1662087600&signature=bc89104fee6890402b34758c264b997124c04c2af575f1407c7e7850335fdfa1&for=66.68.96.142 to F:\Rips3\redgifs_gloomycorruptlabradorretriever\001_GloomyCorruptLabradorretriever.mp4
Waiting for threads to finish
    Downloading file: https://thumbs4.redgifs.com/GloomyCorruptLabradorretriever.mp4?expires=1662087600&signature=bc89104fee6890402b34758c264b997124c04c2af575f1407c7e7850335fdfa1&for=66.68.96.142 Retry #1
Getting key request.properties in en_US value Request properties
Request properties: {Cookie=[], User-agent=[Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36], accept=[*/*]}
Status code: 400
Getting key nonretriable.status.code in en_US value Non-retriable status code
[!] Non-retriable status code 400 while downloading from https://thumbs4.redgifs.com/GloomyCorruptLabradorretriever.mp4?expires=1662087600&signature=bc89104fee6890402b34758c264b997124c04c2af575f1407c7e7850335fdfa1&for=66.68.96.142
Getting key nonretriable.status.code in en_US value Non-retriable status code
   Rip completed!
Deleting empty directory F:\Rips3\redgifs_gloomycorruptlabradorretriever
Getting key open in en_US value Open
Getting key tray.show in en_US value Show

keokitsune commented 2 years ago

I don't even know if this program is still being developed. It's been like 4 months with no activity.

terminalwhoami commented 2 years ago

I think @soloturn was the one who was developing it, but I haven't seen him comment on issues in a while.

soloturn commented 2 years ago

this is open source, so anybody can create a commit and it automatically builds a new version :)

terminalwhoami commented 2 years ago

> this is open source, so anybody can create a commit and it automatically builds a new version :)

I wish I had the programming skills to do something like that, but alas, I don't.

madhatr commented 2 years ago

I'll see if I can patch it, though I am junk at following through. I also haven't opened my GitHub Desktop or MS Code in over half a year, I think.

TarybleTexan commented 2 years ago

Don't bother - they've completely killed the ability to download videos via deep link. It takes a lot more correction now than when I created the ticket.

- T. J. Spackman

On Oct 20, 2022, at 5:16 PM, MegaManZero wrote:

> I'll see if I can patch it. Though I am junk at following through. I also haven't opened my github desktop or ms code in over half a year i think.

MikeRich88 commented 2 years ago

So it's not that hard to fix. Note that my particular use case is scraping Reddit, which does not make use of the Redgifs API.

  1. Instead of contentUrl, use the content value of the first tag that carries an .mp4 URL. (The page usually has 2 of them. The second one seems to always be for mobile. For robustness you could probably scrape both and then use the non-mobile one.) If only one such tag exists and it is a mobile URL, that means there is no HD version.

  2. Replace all &amp; in the URL with &

  3. Other than that, don't do anything else to the URL. Removing "-mobile" from it invalidates the signature.

  4. The requests to www.redgifs.com and thumbs4.redgifs.com MUST come from the same IP address (and therefore the same IP protocol version). Unless you force IPv4 or IPv6 for both requests, it is possible for one to use IPv4 and the other to use IPv6, which invalidates the URL signature.

  5. Both requests MUST use the exact same user agent. Currently it can be set to anything, just as long as it's the same for both requests.

  6. For some reason (at least in my testing), both requests MUST use HTTP/2. If either one uses HTTP/1.1 then the download will fail. See below for a workaround.

  7. Sometimes you still get a bad download URL anyway because CloudFlare is sending a cached result that is wrong. To work around this, add a random variable to the URL of the first request. This also fixes the HTTP protocol issue. (When it has worked, the "CF-Cache-Status" header will say "MISS" instead of "HIT".)

If all of these are enforced then you can successfully download the video.
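Steps 1 to 3 above could look roughly like this in Python. It is a sketch under assumptions: the class and function names are hypothetical, and because the tag name is not given in this thread, the parser simply accepts any tag whose content attribute contains ".mp4".

```python
from html.parser import HTMLParser
from typing import List, Optional


class Mp4ContentCollector(HTMLParser):
    """Collect `content` attribute values that point at an .mp4 file."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # HTMLParser decodes entities in attribute values, so an
            # "&amp;" in the page source arrives here as "&" (step 2).
            if name == "content" and value and ".mp4" in value:
                self.urls.append(value)


def pick_video_url(page_html: str) -> Optional[str]:
    """Return the first non-mobile .mp4 URL, else the mobile one, else None."""
    collector = Mp4ContentCollector()
    collector.feed(page_html)
    for url in collector.urls:
        if "-mobile" not in url:
            return url
    # Only a mobile URL exists: use it untouched, since editing the
    # URL would invalidate the signature (step 3).
    return collector.urls[0] if collector.urls else None
```

The returned URL is meant to be passed to the downloader exactly as is. Steps 4 to 6 (same IP version, same user agent, HTTP/2) depend on the HTTP client and are not covered here; Python's built-in urllib, for example, does not speak HTTP/2.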

Here is a really terrible bash one-liner that works for me on macOS (using bash) as a proof of concept.

curl -4 -v "https://www.redgifs.com/watch/INSERT_VALID_VIDEO_ID_SLUG_HERE?r=$RANDOM" | sed 's_<_|<_g' | tr '|' '\n' | fgrep ".mp4" | head -n 1 | cut -d '"' -f 4 | sed 's_\&amp;_\&_g' | xargs -n 1 -I {} curl -v -O -4 "{}"
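The `?r=$RANDOM` part of the one-liner is the cache-busting trick from step 7. A minimal Python equivalent, assuming the parameter name `r` is arbitrary (the thread only says "a random variable") and using a hypothetical helper name:

```python
import random
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(page_url: str, param: str = "r") -> str:
    """Append a random query parameter so a cached response is not reused."""
    parts = urlsplit(page_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # bash's $RANDOM yields 0..32767; the same range is used here.
    query.append((param, str(random.randint(0, 32767))))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_cache_buster("https://www.redgifs.com/watch/INSERT_VALID_VIDEO_ID_SLUG_HERE"))
```

Only the first request (the watch page) gets the extra parameter. The video URL taken from the page must stay untouched.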

madhatr commented 2 years ago

EDIT: I added the semicolon ";" to the sanitize function and I get a 403 instead of a 400 HTTP error code now, so that's progress, I think. No idea where those semicolons are coming from; I'll have to look around.

Does anyone know if these semicolons matter? I don't see them in the browser URL, but in the logger code I've added they are in the URL string. I added *** in case the sig is unique to me or something.

https://thumbs4.redgifs.com/GloomyCorruptLabradorretriever.mp4?expires=1666983600&;signature=**0076d98a22bfcec0322d68a6f0**********384bf060a4efaa152686e937**&;for=MYIPADDRESS : Non-retriable status code 400 while downloading

I am wondering whether they are messing up the RedGIFs download or whether they are normal in Java URL code. As you can see, I've gotten rid of the "amp" by adding to and using the sanitizeurl function.
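One likely explanation for the stray semicolons (an assumption, not something confirmed in this thread) is that the page source contains the HTML entity `&amp;` and only the letters "amp" were removed, leaving `&;` behind. The Python snippet below reproduces that pattern with placeholder token values and contrasts it with decoding the whole entity:

```python
import html

# Placeholder values; the "&amp;" separators mimic HTML-escaped page source.
escaped = (
    "https://thumbs4.redgifs.com/X.mp4"
    "?expires=1&amp;signature=abc&amp;for=192.0.2.1"
)

# Deleting only "amp" leaves the entity's trailing ";" in the URL.
broken = escaped.replace("amp", "")
# Decoding the whole entity yields plain "&" separators.
fixed = html.unescape(escaped)

print(broken)
print(fixed)
```

If that is what is happening, the semicolons are not part of the real URL and decoding `&amp;` as a unit should avoid them.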

madhatr commented 2 years ago

Oh, I got the URL in this topic to work. I had to get rid of the code that deleted the "-mobile" tag.

Though I guess this means no HD unless I can figure that out.

vscum commented 1 year ago

Is there still no fix for this? :(

TarybleTexan commented 7 months ago

It was working for a while after an update, but now it's back to giving 403 errors.