patrickkfkan / patreon-dl

Patreon Downloader

Cannot download initial page : Cloudflare captcha #25

Open lautriva opened 4 weeks ago

lautriva commented 4 weeks ago

Hi, patreon-dl was working fine until this week. Now I get an error (see the complete log below).

After dumping the resulting HTML, I got something like this: <!DOCTYPE html><html lang="en-US"><head><title>Just a moment...</title></body></html>. That makes me think the request is being blocked by the Cloudflare captcha page, so the script never receives the actual creator page.

Maybe it should use the provided cookie value to bypass the captcha?
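
To illustrate the diagnosis above, here is a minimal sketch of how a response could be classified before parsing. It is not patreon-dl code: looksLikeCloudflareChallenge, classifyPage, and PageKind are hypothetical names. The title check relies only on the "Just a moment..." title seen in the dumped HTML and is assumed to be a rough heuristic. The two patterns are copied from the debug log in this report.

```typescript
// Hypothetical helper (not part of patreon-dl): guesses whether an HTML
// response is a Cloudflare interstitial instead of a real Patreon page.
// Assumption: the interstitial title starts with "Just a moment", as in
// the dumped HTML from this report.
function looksLikeCloudflareChallenge(html: string): boolean {
  const match = /<title>([^<]*)<\/title>/i.exec(html);
  const title = (match?.[1] ?? "").trim().toLowerCase();
  return title.indexOf("just a moment") === 0;
}

// Patterns copied verbatim from the PageParser lines in the debug log.
const initialDataPatterns: RegExp[] = [
  /window\.patreon\s*?=\s*?({.+?});/gm,
  /<script id="__NEXT_DATA__" type="application\/json">(.+)<\/script>/gm,
];

type PageKind = "challenge" | "has-initial-data" | "unknown";

// Hypothetical classifier: reports a challenge page explicitly, so the
// failure is not reduced to "no regex matches".
function classifyPage(html: string): PageKind {
  if (looksLikeCloudflareChallenge(html)) {
    return "challenge";
  }
  for (const pattern of initialDataPatterns) {
    pattern.lastIndex = 0; // global regexes keep state between test() calls
    if (pattern.test(html)) {
      return "has-initial-data";
    }
  }
  return "unknown";
}
```

With a check like this, the "Initial data not found" error could be replaced by a message that names the challenge page as the likely cause.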

Complete log

Jun 18 19:40:56: info: PostDownloader: Targeting posts by 'CREATOR_NAME'
Jun 18 19:40:56: debug: PostsFetcher: Fetch initial data from "https://www.patreon.com/CREATOR_NAME"
Jun 18 19:40:56: debug: PostsFetcher: next() requested (0)
Jun 18 19:40:56: debug: PageParser: Parse initial data from https://www.patreon.com/CREATOR_NAME
Jun 18 19:40:56: debug: PageParser: Trying pattern: /window\.patreon\s*?=\s*?({.+?});/gm
Jun 18 19:40:56: debug: PageParser: No match for pattern: /window\.patreon\s*?=\s*?({.+?});/gm
Jun 18 19:40:56: debug: PageParser: Trying pattern: /<script id="__NEXT_DATA__" type="application\/json">(.+)<\/script>/gm
Jun 18 19:40:56: debug: PageParser: No match for pattern: /<script id="__NEXT_DATA__" type="application\/json">(.+)<\/script>/gm
Jun 18 19:40:56: error: PostsFetcher: Error parsing initial data from "https://www.patreon.com/CREATOR_NAME": Initial data not found - no regex matches
Jun 18 19:40:56: debug: PostsFetcher: next() handled (0)
Jun 18 19:40:56: info: PostDownloader: Done downloading posts by 'CREATOR_NAME'
Jun 18 19:40:56: info: PostDownloader: Total 0 / undefined posts processed
Jun 18 19:40:56: info: PostDownloader end

jmurchie88 commented 3 weeks ago

I ran into the same problem today. I initially tried running patreon-dl from a remote server, not the machine where the cookie was generated, and hit the issue @lautriva describes. When I went back to running the project from my local machine (same IP/geography the cookie was generated from), everything worked as expected. It seems they may be locking cookies to an IP or geography in some way. I'm not sure this is a problem the project can solve, but I wanted to share my anecdote.