Open Kribli-krabli opened 3 weeks ago
Thank you. Will try to look at it eventually.
I can't access this content right now (I get a 403), since my subscription has expired. However, I've added some code to facilitate debugging.
You can try to run the app with --title "Урок 124" --debug
The error message will contain the full data for the problematic file. You can then try following the links you find in that data.
Please check whether any of those links work (that is, don't return 404). If any do, please report it here along with the debug data.
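If checking the links by hand is tedious, something like the sketch below could probe them all at once. It is not part of sponsrdump: `http_status` and `check_links` are hypothetical helper names, it uses only the standard library, and it assumes the debug record is the dict printed with the error. Note that it sends HEAD requests, which a server may answer differently from GET, so a browser check is still worthwhile.

```python
from typing import Callable, Dict
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a HEAD request to `url`."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as HTTPError; the code is what we want.
        return exc.code


def check_links(
    file_info: dict,
    get_status: Callable[[str], int] = http_status,
) -> Dict[str, int]:
    """Probe every field of the debug record that looks like an HTTPS link."""
    results = {}
    for key, value in file_info.items():
        if isinstance(value, str) and value.startswith("https://"):
            results[key] = get_status(value)
    return results
```

Paste the dict from the debug output into `check_links(...)` and report the resulting field-to-status mapping. Network failures other than HTTP errors (DNS, timeouts) are not handled here and will raise.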
INFO : Searching data for https://sponsr.ru/uzhukoffa_lessons/ ...
DEBUG : Starting new HTTPS connection (1): sponsr.ru:443
DEBUG : https://sponsr.ru:443 "GET /uzhukoffa_lessons/ HTTP/1.1" 200 None
DEBUG : Project ID: 248
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=0 HTTP/1.1" 200 None
DEBUG : Searched 20/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=20 HTTP/1.1" 200 None
DEBUG : Searched 40/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=40 HTTP/1.1" 200 None
DEBUG : Searched 60/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=60 HTTP/1.1" 200 None
DEBUG : Searched 80/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=80 HTTP/1.1" 200 None
DEBUG : Searched 100/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=100 HTTP/1.1" 200 None
DEBUG : Searched 120/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=120 HTTP/1.1" 200 None
DEBUG : Searched 140/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=140 HTTP/1.1" 200 None
DEBUG : Searched 160/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=160 HTTP/1.1" 200 None
DEBUG : Searched 180/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=180 HTTP/1.1" 200 None
DEBUG : Searched 200/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=200 HTTP/1.1" 200 None
DEBUG : Searched 220/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=220 HTTP/1.1" 200 None
DEBUG : Searched 240/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=240 HTTP/1.1" 200 None
DEBUG : Searched 260/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=260 HTTP/1.1" 200 None
DEBUG : Searched 280/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=280 HTTP/1.1" 200 None
DEBUG : Searched 300/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=300 HTTP/1.1" 200 None
DEBUG : Searched 320/335 ...
DEBUG : https://sponsr.ru:443 "GET /project/248/more-posts/?offset=320 HTTP/1.1" 200 None
DEBUG : Searched 335/335 ...
INFO : Found articles: 1
INFO : Start dump using preference: VideoPreference(frame='640x360', sound='best') ...
INFO : Configuration is loaded from sponsrdump.json ...
INFO : [1/1 100.0%] Downloading File 5572 [lessons124.mp3]: ...
DEBUG : Starting new HTTPS connection (1): media.sponsr.ru:443
DEBUG : https://media.sponsr.ru:443 "GET /project/248/post/17249/file/5572/lessons124.mp3?token=*** HTTP/1.1" 404 None
DEBUG : { '__idx': 1,
'file_category': 'podcast',
'file_created_at': '2022-07-09 00:55:53.000000',
'file_duration': None,
'file_exist': True,
'file_id': 5572,
'file_link': 'https://media.sponsr.ru/project/248/post/17249/file/5572/lessons124.mp3',
'file_mime': 'audio/mpeg',
'file_order': None,
'file_path': 'https://media.sponsr.ru/project/248/post/17249/file/5572/lessons124.mp3?token=***',
'file_preview_link': 'https://media.sponsr.ru/project/248/post/17249/file/5572/lessons124.mp3?preview=1',
'file_preview_path': None,
'file_size': 71111708,
'file_title': 'lessons124.mp3'}
Traceback (most recent call last):
File "/home/papa/Sponsor/./sponsrdump.py", line 678, in <module>
dumper.dump(
File "/home/papa/Sponsor/./sponsrdump.py", line 621, in dump
self._download_file(filepath, dest=dest_filename, prefer_video=prefer_video)
File "/home/papa/Sponsor/./sponsrdump.py", line 313, in _download_file
response.raise_for_status()
File "/usr/lib/python3/dist-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://media.sponsr.ru/project/248/post/17249/file/5572/lessons124.mp3?token=***
I can buy a subscription 4 u.
Thank you, but don't bother.
Please try to open the link from file_path in a browser. Also, please try the same with the link from file_link.
Hm. So this doesn't seem to be related to the dumper itself. It needs further investigation to find a workaround, if there is one at all.
We could add an option to skip such errors, if that is considered acceptable.
For now, you can already skip audio dumping with the --no-audio option.
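To make the "skip such errors" idea concrete, here is a minimal sketch of what it could look like. It is an assumption about a possible design, not the dumper's actual code: `download_all`, `download_one`, `skip_errors` and `error_type` are hypothetical names. In sponsrdump the caught type would presumably be `requests.exceptions.HTTPError`, which is what the traceback above shows.

```python
import logging
from typing import Callable, Iterable, List, Tuple, Type

LOG = logging.getLogger("skip_errors_sketch")


def download_all(
    files: Iterable[dict],
    download_one: Callable[[dict], None],
    skip_errors: bool = False,
    error_type: Type[Exception] = Exception,
) -> List[Tuple[dict, Exception]]:
    """Download every file; optionally log and skip the ones that fail."""
    failed = []
    for file_info in files:
        try:
            download_one(file_info)
        except error_type as exc:
            if not skip_errors:
                # Default behaviour stays as it is today: abort on first error.
                raise
            LOG.warning("Skipping %s: %s", file_info.get("file_title"), exc)
            failed.append((file_info, exc))
    return failed
```

Returning the list of failures lets the caller print a summary at the end, so skipped files don't go unnoticed.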
I think it's a "re-upload" problem on the part of the site. The last file from the list was successfully downloaded, but it returns 404 for the previous 4 entries.
The behavior on the site is similar:
It seems we shouldn't be seeing these 4 previous entries at all; this may be a bug on sponsr's side.
You can use this workaround to ignore such errors: https://github.com/SnipGhost/sponsrdump/commit/cd79c8eeb428f578dcd23e5a59b8ac741a31c241
Tested on my NAS with these changes:
I guess that if the video has a few mp3 recordings, an error is returned: Client error 404: URL not found: https://media.sponsr.ru/project/248/post/17249/file/5572/lessons124.mp3.....