Open meliamne opened 3 years ago
Please run app with "--json" command line option and send generated .json files (you can find them in the download folder) to alexcsdev@protonmail.com
I wanted to add that I am having this same issue.
Please send the json dump of the affected creator to the email above. I will check it when I have free time.
I am also getting this error. It seems to occur when the creator adds a link to the content via the Patreon uploader rather than as a proper attachment. I'm not sure whether this is some extended functionality Patreon added in the past year or just some creators being weird with how they do it.
In particular, this is for zip files, usually containing a bunch of images, psd files, etc.
I'll email the json files to the address indicated above, but it seems to boil down to this:
"content": "<p><strong>42 region icons in elven style </strong>for marking the important locations on your Elven Kingdom!</p><ul> <li>2 Color variations</li> <li>Towns, villages, castles, keeps and more</li> <li>Standard markers for important quest locations and heroes' party position.</li></ul><p><a href=\"https://www.patreon.com/file?h=35395828&i=9247121\">Elven Region Icons [ALL PATRONS]</a>\u00a0</p>",
...
"attachments": {
"data": []
},
@begna112 Interesting... Yep, send me the files, I will definitely look into that once I have some free time.
Sent them along.
For what it's worth, I'm 90% certain the original report here is the same. A bunch of in-line links to zip files for battlemaps, rather than proper attachments. https://www.patreon.com/posts/elven-ruins-45879210
The crawler might need to specifically detect in-post links of this Patreon URL format, treat them as something other than "ExternalUrl" types, and decide how to handle them.
Any updates on this?
Just adding to this, having the same problem. I can open the link myself fine and download the file through the browser by just visiting it, but PatreonDownloader doesn't seem to follow through.
I've been quite busy lately. I'm afraid I can't provide any ETA for this right now.
Hi, any updates on this? I've been trying to download battlemaps as mentioned above, and none of the Patreon file links are working.
I realize there is no estimation, but just wanted to check in to see if this was still on your radar.
I tried taking a look myself but was having trouble figuring out the project architecture. If I'm understanding correctly though, it may be just as simple as parsing the in-line links of this format through the same process as the attached files, since they seem to use the same url format.
Unfortunately the harsh truth is that if the issue does not affect me or does not completely break the application the answer is "this will be done when I feel like doing it unless someone is willing to pay for it".
I have spent quite a lot of my free time working on #125, so for now I can't dedicate any more of it to this project unless something breaks completely.
Unfortunately the harsh truth is that if the issue does not affect me or does not completely break the application the answer is "this will be done when I feel like doing it unless someone is willing to pay for it".
Pay for it? How and how much? @AlexCSDev
Edit: Actually, trying to determine if this issue still exists.
Edit 2: yes, it still exists. example - https://www.patreon.com/posts/83764955 https://www.patreon.com/posts/souls-plane-83764955
2023-06-08 02:47:09.4627 ERROR Failed to download https://www.patreon.com/file?h=35017899&i=14336710: Error while downloading https://www.patreon.com/file?h=35017899&i=14336710: [83764955] Unable to retrieve name for external entry of type ExternalUrl: https://www.patreon.com/file?h=35017899&i=14336710
So here is where the error is being thrown: https://github.com/AlexCSDev/PatreonDownloader/blob/aaaaf9291c513912eb46aba9a8b4c6646972401f/PatreonDownloader.Implementation/PatreonCrawledUrlProcessor.cs#L113-L120
Here is where any url inside the content of the post is being set as an "externalurl":
https://github.com/AlexCSDev/PatreonDownloader/blob/aaaaf9291c513912eb46aba9a8b4c6646972401f/PatreonDownloader.Implementation/PatreonPageCrawler.cs#L216-L224
This probably should be altered to detect Patreon links following the https://www.patreon.com/file pattern and set them to PostAttachment. But I'm not certain that this would solve the issue.
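To make the detection idea concrete, here is a minimal sketch in Python rather than the project's C#. The function names `is_patreon_file_link` and `classify_link` are hypothetical, and the assumption that these links always carry `h` and `i` query parameters comes only from the example URLs in this thread. The returned strings merely stand in for whatever types the crawler actually uses.

```python
from urllib.parse import urlparse, parse_qs

# Hosts assumed to serve the "/file?h=...&i=..." links seen in this thread.
PATREON_HOSTS = {"www.patreon.com", "patreon.com"}


def is_patreon_file_link(url: str) -> bool:
    """Return True for links shaped like https://www.patreon.com/file?h=...&i=..."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.hostname not in PATREON_HOSTS:
        return False
    if parsed.path != "/file":
        return False
    query = parse_qs(parsed.query)
    # Assumption: both parameters are always present on these links.
    return "h" in query and "i" in query


def classify_link(url: str) -> str:
    """Label in-post links; the labels are placeholders for the crawler's types."""
    return "PostAttachment" if is_patreon_file_link(url) else "ExternalUrl"
```

With this check, the failing link from the log above would be labeled "PostAttachment", while an ordinary post URL would stay "ExternalUrl". Whether reclassifying is enough on its own is still an open question, as noted above.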
This isn't set to allow redirects:
https://github.com/AlexCSDev/PatreonDownloader/blob/aaaaf9291c513912eb46aba9a8b4c6646972401f/PatreonDownloader.Implementation/PatreonRemoteFilenameRetriever.cs#L26
https://github.com/AlexCSDev/PatreonDownloader/blob/aaaaf9291c513912eb46aba9a8b4c6646972401f/PatreonDownloader.Implementation/PatreonRemoteFilenameRetriever.cs#L49-L68
I think this is most likely where the bug lies. The https://www.patreon.com/file links are a 302 redirect to a URL like this:
https://c10.patreonusercontent.com/4/patreon-media/p/post/27184633/3d5db0f1843844a883ad68643ed924b2/eyJhIjoxLCJwIjoxfQ%3D%3D/1?token-time=1686528000&token-hash=
The HttpClient receives a proper response, so it isn't throwing an HttpRequestException, but that response also doesn't have a Content-Disposition header, so the filename variable is never populated with anything other than null.
If the HttpClient were to follow the redirect, the final response would have a Content-Disposition header and the filename could be retrieved.
I think this could be potentially solved by allowing the redirect like this: https://briancaos.wordpress.com/2021/09/06/httpclient-follow-302-redirects-with-net-core/
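As a rough illustration of that flow, here is a minimal sketch in Python rather than the project's C#. It follows redirects by hand and reads the filename from the final response. `fetch` is a hypothetical callable returning `(status_code, headers_dict)`, `resolve_filename` and `filename_from_content_disposition` are made-up names, and the idea that the redirected response carries a Content-Disposition header is the assumption stated above, not something verified here. Header lookups use a plain dict, so they are case-sensitive in this sketch.

```python
from email.message import Message
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def filename_from_content_disposition(value):
    """Parse the filename parameter out of a Content-Disposition header value."""
    msg = Message()
    msg["Content-Disposition"] = value
    return msg.get_filename()  # None when no filename parameter is present


def resolve_filename(fetch, url, max_redirects=5):
    """Follow redirects manually, then read the filename from the final response.

    `fetch(url)` is a hypothetical callable returning (status_code, headers_dict).
    """
    for _ in range(max_redirects + 1):
        status, headers = fetch(url)
        if status in REDIRECT_CODES and "Location" in headers:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, headers["Location"])
            continue
        disposition = headers.get("Content-Disposition")
        if disposition is None:
            return None
        return filename_from_content_disposition(disposition)
    return None  # gave up after too many redirects
```

Passing `fetch` in keeps the redirect logic separate from the HTTP client, so the same loop can be exercised with canned responses before wiring it to a real, authenticated client.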
The only thing I'm not really sure of is why this regex isn't matching on the url: https://github.com/AlexCSDev/PatreonDownloader/blob/aaaaf9291c513912eb46aba9a8b4c6646972401f/PatreonDownloader.Implementation/PatreonRemoteFilenameRetriever.cs#L24
If it did match, it would at least get assigned a filename based on the url here: https://github.com/AlexCSDev/PatreonDownloader/blob/aaaaf9291c513912eb46aba9a8b4c6646972401f/PatreonDownloader.Implementation/PatreonRemoteFilenameRetriever.cs#L75
Edit: figured out that the regex is trying to detect a filename pattern in the url, not a valid url. So, that makes sense, though it's misleading with the comment saying it's an "invalid url"
I tried to implement this myself but am getting a 403 error from Cloudflare. Maybe that was the root problem all along? I added the redirect and a debug line to output the request and the response as strings. I think the request is missing cookies and needs to use the IWebDownloader cookies that are used elsewhere. I don't understand .NET well enough to know how to get the cookies you're saving elsewhere into this client.
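For illustration only, here is a minimal sketch, in Python rather than the project's C#, of attaching already-saved cookies to the request. `build_cookie_header` and `build_authenticated_request` are hypothetical names, and which cookies Patreon or Cloudflare actually require is not established in this thread; the sketch simply sends every cookie it is given. The request is only constructed here, not sent.

```python
from typing import Mapping
from urllib.request import Request


def build_cookie_header(cookies: Mapping[str, str]) -> str:
    """Serialize name/value pairs into a single Cookie request header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def build_authenticated_request(url: str, cookies: Mapping[str, str], user_agent: str) -> Request:
    """Create (but do not send) a GET request carrying the saved cookies.

    Assumption: the cookies come from the same logged-in session the rest of
    the downloader uses; their names and values are not specified here.
    """
    headers = {
        "Cookie": build_cookie_header(cookies),
        "User-Agent": user_agent,
    }
    return Request(url, headers=headers, method="GET")
```

The debug output below shows the failing request going out with an empty header block, which is consistent with this guess but does not prove it.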
2023-06-08 06:01:13.9540 DEBUG [PatreonDownloader.Implementation.PatreonRemoteFilenameRetriever] Method: GET, RequestUri: 'https://www.patreon.com/file?h=35017899&i=14336710', Version: 1.1, Content: <null>, Headers:
{
}
2023-06-08 05:27:52.2296 DEBUG [PatreonDownloader.Implementation.PatreonRemoteFilenameRetriever] StatusCode: 403, ReasonPhrase: 'Forbidden', Version: 1.1, Content: System.Net.Http.HttpConnectionResponseContent, Headers:
{
Date: Thu, 08 Jun 2023 10:27:52 GMT
Transfer-Encoding: chunked
Connection: keep-alive
CF-Ray: 7d4071db98ec2cb0-DFW
CF-Cache-Status: DYNAMIC
Cache-Control: private
Set-Cookie: a_csrf=WjQaO_rQQbeZjmgcSlV4Eiez2A8uYxNKrcFBpeXr0mE; Domain=patreon.com; Expires=Thu, 08-Jun-2023 11:27:52 GMT; Max-Age=3600; Secure; HttpOnly; Path=/
Set-Cookie: patreon_locale_code=en-US; Domain=patreon.com; Expires=Wed, 03-Jun-2043 10:27:52 GMT; Max-Age=630720000; Secure; Path=/
Set-Cookie: patreon_location_country_code=US; Domain=patreon.com; Expires=Wed, 03-Jun-2043 10:27:52 GMT; Max-Age=630720000; Secure; Path=/
Set-Cookie: patreon_device_id=889737a2-f4c7-4cbc-979a-e044ae0d07e8; Domain=patreon.com; Expires=Thu, 01-Aug-2040 00:00:00 GMT; Max-Age=630720000; Path=/
Set-Cookie: patreon_location_country_code=US; Domain=patreon.com; Expires=Thu, 01-Aug-2040 00:00:00 GMT; Max-Age=630720000; Path=/
Set-Cookie: patreon_locale_code=undefined; Domain=patreon.com; Expires=Thu, 01-Aug-2040 00:00:00 GMT; Max-Age=630720000; Path=/
Set-Cookie: __cf_bm=CcfmoX8LOfw24DqxuaAH7bVvRy0NnkhnlUlYDexoK8Y-1686220072-0-AWPN37kHbX1/PyyMbLufWbudZ68cLXrttr3B5Abw/499Qw2C8wwEZ4DvggLeSuQbAMtdR2dLLw8uyb2rMix4xXwdDqIJlNrIPwc8z7trXwE7; path=/; expires=Thu, 08-Jun-23 10:57:52 GMT; domain=.patreon.com; HttpOnly; Secure
Strict-Transport-Security: max-age=2592000
Referrer-Policy: origin,strict-origin-when-cross-origin
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
x-patreon-sha: ab2f439ffbd53097445c4ae0fd8f2ac3e4ccaee6
x-patreon-uuid: 72c3cb25-18b5-5248-8a77-3da75c089e1a
X-XSS-Protection: 1; mode=block
Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=ReQXP6fwUVt6q6Ap82mKAp0zEn0KG3MMoVAvnJYwigPr2s9QjaLkU0R4rIc30Po1M0BsIzxe3Kwcdv666me8H%2FjtHmooA25rin0HLAs%2BT2MKKNYrzBt6J8x0zoatiZRZVg%3D%3D"}],"group":"cf-nel","max_age":604800}
NEL: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
Server: cloudflare
Content-Type: text/html; charset=utf-8
Content-Language: en-US
}
2023-06-08 05:27:52.2296 ERROR [UniversalDownloaderPlatform.Engine.DownloadManager] Error while downloading https://www.patreon.com/file?h=35017899&i=14336710: [83764955] Unable to retrieve name for external entry of type ExternalUrl: https://www.patreon.com/file?h=35017899&i=14336710
2023-06-08 05:27:52.2296 ERROR [PatreonDownloader.App.Program] Failed to download https://www.patreon.com/file?h=35017899&i=14336710: Error while downloading https://www.patreon.com/file?h=35017899&i=14336710: [83764955] Unable to retrieve name for external entry of type ExternalUrl: https://www.patreon.com/file?h=35017899&i=14336710
An expected response should look something like this:
Request URL: https://www.patreon.com/file?h=35017899&i=14336710
Request Method: GET
Status Code: 302
Remote Address: 104.16.7.49:443
Referrer Policy: strict-origin-when-cross-origin
Cache-Control: private
Cf-Cache-Status: DYNAMIC
Cf-Ray: 7d405ca40d2ee7cf-DFW
Content-Language: en-US
Content-Type: text/html; charset=utf-8
Date: Thu, 08 Jun 2023 10:13:23 GMT
Location: https://c10.patreonusercontent.com/4/patreon-media/p/post/35017899/ccf1e1c2b1164c07b3ab347dbcc6596c/eyJhIjoxLCJwIjoxfQ%3D%3D/1?token-time=1686528000&token-hash=16wLo1nHMI5dkNuXI6SrylYXSiumTq6lt10XuJBfZ_I%3D
Nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
Referrer-Policy: origin,strict-origin-when-cross-origin
Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=azzB4wZoQTCtmVdgukCWEX0T9vWxyOicBRQBValw8h%2FMXeJ%2Bbu3hiE%2F0e044S%2FwjXpbWDfj3loTczb2lh853aXbkeT9MR1edmsRfRRBGMvTLLrNaaKaU%2BBaY60Dku27oyw%3D%3D"}],"group":"cf-nel","max_age":604800}
Server: cloudflare
Set-Cookie: AWSALBTG=bzdf7KmRnw6p+uOnZEsPjuj4WChaERMK6eidGYPxCRQ4PwA5rEzKsOJCEJ9VT96zJEUqYT0XjsZl7SlxkledGguHEzbCNXd9G+2V5bECfuYD9mHOFa5F9SZFxHhJYRJZYGLG1jRfeZpYV9iE0kt0V4psnxltaHoKSqGiI/CIUEN8soJn2RZXA5EOVUnVc74Z8uO8IPdrjLEm6H1tQtoiUYSsxGK7HrPcWAZhKT795RgrAu26+qVtFsvU5DP70rQI5BAqAeg=; Expires=Thu, 15 Jun 2023 10:13:23 GMT; Path=/
Set-Cookie: AWSALBTGCORS=bzdf7KmRnw6p+uOnZEsPjuj4WChaERMK6eidGYPxCRQ4PwA5rEzKsOJCEJ9VT96zJEUqYT0XjsZl7SlxkledGguHEzbCNXd9G+2V5bECfuYD9mHOFa5F9SZFxHhJYRJZYGLG1jRfeZpYV9iE0kt0V4psnxltaHoKSqGiI/CIUEN8soJn2RZXA5EOVUnVc74Z8uO8IPdrjLEm6H1tQtoiUYSsxGK7HrPcWAZhKT795RgrAu26+qVtFsvU5DP70rQI5BAqAeg=; Expires=Thu, 15 Jun 2023 10:13:23 GMT; Path=/; SameSite=None; Secure
Set-Cookie: patreon_locale_code=en-US; Domain=patreon.com; Expires=Wed, 03-Jun-2043 10:13:23 GMT; Max-Age=630720000; Secure; Path=/
Set-Cookie: patreon_location_country_code=US; Domain=patreon.com; Expires=Wed, 03-Jun-2043 10:13:23 GMT; Max-Age=630720000; Secure; Path=/
Strict-Transport-Security: max-age=2592000
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
X-Patreon-Sha: ab2f439ffbd53097445c4ae0fd8f2ac3e4ccaee6
X-Patreon-Uuid: 7adf1842-eb32-527b-a645-50c9678a3895
X-Xss-Protection: 1; mode=block
I noticed that the request headers (from my browser) include cookies:
:Authority: www.patreon.com
:Method: GET
:Path: /file?h=35017899&i=14336710
:Scheme: https
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Cookie: __cf_bm=wPLhS6I9EmMSDDTYVSqRdXYpz.T3.C9HzVF1zYwP1Ik-1686218911-0-ATMa3ycREa6K9rsFAmWDRDkIXcIqF6U5VDqUuNuNqz6Jb/4PEGQ9cJanuHwVgbXsykRE7W+L; patreon_device_id=cd9eb84d-8c95-47ab-8e79-71d; patreon_location_country_code=US; patreon_locale_code=en-US; _ALGOLIA=anonymous-85a84d3e-ca44-458e-980a-d3a; a_csrf=dsKcV3FQJLkuW0ZLXnD20FmiEdh5-alUE; session_id=wb6w3_4fmVwDsETyDWFSGtmbyqX0; _swb_consent_=eyJlbnZpcm9ubWVudENvZGUiOiJwcm9kdWN0aW9uIiwiaWRlbnRpdGllcyI6eyJwYXRyZW9uYWNjdGlkIjoiMzkzMDE4NSIsInBhdHJlb25kZXZpY2VpZCI6ImNkOWViODRkLThjOTUtNDdhYi04ZTc5LTcxMTM2NmE0YjY2ZCJ9LCJqdXJpc2RpY3Rpb25Db2RlIjoidXNnZW5lcmFsIiwicHJvcGVydHlDb2RlIjoicGF0cmVvbiIsInB1cnBvc2VzIjp7ImFuYWx5dGljc2JpemVuaGFuY2UiOnsiYWxsb3dlZCI6InRydWUiLCJsZWdhbEJhc2lzQ29kZSI6ImRpc2Nsb3N1cxhd3MiOnsiYWxsb3dlZCI6InRydWUiLCJsZWdhbEJhc2lzQ29kZSI6ImRpc2Nsb3N1cmUifSwic3Vic2NyaWJlZHN2Y3MiOnsiYWxsb3dlZCI6InRydWUiLCJsZWdhbEJhc2lzQ29kZSI6ImRpc2Nsb3N1cmUifSwic3VydmV5b3V0cmVhY2giOnsiYWxsb3dlZCI6InRydWUiLCJsZWdhbEJhc2lzQ29kZSI6ImNvbnNlbnRfb3B0b3V0In0sInRhcmdldGVkYWR2ZXJ0aXNpbmciOnsiYWxsb3dlZCI6InRydWUiLCJsZWdhbEJhc2lzQ29kZSI6ImRpc2Nsb3N1cmUifX0sImNvbGxlY3RlZEF0IjoxNjg2MjE4OTIzfQ%3D%3D; AWSALBTG=8Z7SPHtypB1WzoroJ8+kkQdOPRT1YjwfE3E5GEXU+i9jeeV0gFX+cn5B/2nlWybN9jfgHH6YBXNUIJh2QRoz55UkNLZQbheu4O6hyBSjprx5yJva02kYFml3KJT4TvFsb+GAyFSUzQdCiK76mT3pWV1ziqcqT6fT0xgYC7ZjnVjl5HBWTFhgob8GXXdjkMEEM67OcpLOPhG0GvRDr2huFmPjp5w0tp9IUSmXCDX3E5GEXU+i9jeeV0gFX+cn5B/2nlWybN9jfgHH6YBXNUIJh2QRoz55UkNLZQbheu4O6hyBSjprx5yJva02kYFml3KJT4TvFsb+GAyFSUzQdCiK76mT3pWV1ziqcqT6fT0xgYC7ZjnVjl5HBWTFhgob8GXXdjkMEEM67OcpLOPhG0GvRDr2huFmPjp5w0tp9IUSmXCDXD3e1Xvmj2QwraaQqhbUeVVkRAMC4YUNE=
Referer: https://www.patreon.com/posts/souls-plane-83764955
Sec-Ch-Device-Memory: 8
Sec-Ch-Ua: "Not.A/Brand";v="8", "Chromium";v="114", "Google Chrome";v="114"
Sec-Ch-Ua-Arch: "x86"
Sec-Ch-Ua-Full-Version-List: "Not.A/Brand";v="8.0.0.0", "Chromium";v="114.0.5735.118", "Google Chrome";v="114.0.5735.118"
Sec-Ch-Ua-Mobile: ?0
Sec-Ch-Ua-Model: ""
Sec-Ch-Ua-Platform: "Windows"
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: same-origin
Sec-Fetch-User: ?1
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36
I'm getting a bunch of these errors instead of the zip files I hoped for. A normal download via the browser works.
Error while downloading https://www.patreon.com/file?h=37188912&i=7097946: [45879210] Unable to retrieve name for external entry of type ExternalUrl: https://www.patreon.com/file?h=37188912&i=7097946