Brisppy / twitch-archiver

A simple, fast, platform-independent tool for downloading Twitch streams, videos, and chat logs.
GNU Affero General Public License v3.0

Streams with corruption not saved #20

Closed koroban closed 1 year ago

koroban commented 1 year ago

Describe the bug The tool checks whether the TS files were downloaded completely and can be merged without corruption. Unfortunately, if the streamer had connection problems during the stream, these "corruptions" may already be part of what was streamed to Twitch, so the VoD can never be downloaded completely.

Supplied arguments twitch-archiver.py -d /data/twitch -c kimtnt -N -D

Error log

2023-01-30 16:19:45 [    INFO] Merging VOD parts. This may take a while.
2023-01-30 16:21:39 [    INFO] Converting VOD to mp4. This may take a while.
2023-01-30 16:24:07 [   DEBUG] Verifying length of VOD file. ETA: 00:00:00
2023-01-30 16:24:07 [   DEBUG] Downloaded VOD length is 14956. Expected length is 14956.
2023-01-30 16:24:07 [   DEBUG] VOD passed length verification.
2023-01-30 16:24:07 [   DEBUG] Generating readable chat log and saving to disk...
2023-01-30 16:24:07 [   DEBUG] Downloading VOD thumbnail.
2023-01-30 16:24:09 [   DEBUG] Cleaning up temporary files...
2023-01-30 16:24:12 [   DEBUG] Adding VOD info to database.
2023-01-30 16:24:12 [   DEBUG] Database path: /home/downloader/.config/twitch-archiver/vods.db
2023-01-30 16:24:12 [   DEBUG] Connection to SQLite DB successful.
2023-01-30 16:24:12 [   DEBUG] Executing SQL statement: SELECT stream_id, video_archived, chat_archived FROM vods WHERE stream_id IS ?
2023-01-30 16:24:12 [   DEBUG] Values: {'stream_id': '40411186056'}
2023-01-30 16:24:12 [   DEBUG] Executing SQL statement:
INSERT INTO
vods (stream_id, user_id, user_login, user_name, title, description, created_at, published_at, url, thumbnail_url,
      viewable, view_count, language, type, duration, muted_segments, vod_id, store_directory, video_archived,
      chat_archived)
VALUES
(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?);

2023-01-30 16:24:12 [   DEBUG] Values: {'stream_id': '40411186056', 'user_id': '81127376', 'user_login': 'kimtnt', 'user_name': 'KimTNT', 'title': 'Immer der Straße nach!', 'description': None, 'created_at': '2023-01-30T11:00:30Z', 'published_at': '2023-01-30T11:00:30Z', 'url': 'https://www.twitch.tv/videos/1722915620', 'thumbnail_url': 'https://static-cdn.jtvnw.net/cf_vods/dgeft87wbj63p/ecc1bb460b9554f8b9e8_kimtnt_40411186056_1675076425//thumb/thumb0-%{width}x%{height}.jpg', 'viewable': 'public', 'view_count': 0, 'language': 'de', 'type': 'archive', 'duration': 14956, 'muted_segments': None, 'vod_id': '1722915620', 'store_directory': '/data/twitch/KimTNT/2023-01-30_11-00-30 - Immer der Stra_e nach_ - 1722915620', 'video_archived': True, 'chat_archived': True}
2023-01-30 16:24:12 [   DEBUG] Removing lock file.
2023-01-30 16:24:12 [   DEBUG] Processing VOD 1712901026 by KimTNT
2023-01-30 16:24:12 [   DEBUG] Creating lock file for VOD.
2023-01-30 16:24:12 [   DEBUG] Database path: /home/downloader/.config/twitch-archiver/vods.db
2023-01-30 16:24:12 [   DEBUG] Connection to SQLite DB successful.
2023-01-30 16:24:12 [   DEBUG] Executing SQL statement: SELECT vod_id, video_archived, chat_archived FROM vods WHERE stream_id IS ?
2023-01-30 16:24:12 [   DEBUG] Values: {'stream_id': 40360528968}
2023-01-30 16:24:12 [    INFO] Now processing VOD: 1712901026
2023-01-30 16:24:13 [    INFO] VOD offline.
2023-01-30 16:24:13 [    INFO] Grabbing video...
2023-01-30 16:24:15 [    INFO] Grabbing chat logs...
2023-01-30 16:24:15 [   DEBUG] Grabbing chat logs from offset: 33538
2023-01-30 16:24:16 [    INFO] Found 15 messages.
2023-01-30 16:24:16 [    INFO] Merging VOD parts. This may take a while.
2023-01-30 16:28:35 [    INFO] Converting VOD to mp4. This may take a while.
...
2023-01-30 16:33:53 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3225
2023-01-30 16:33:53 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3226
2023-01-30 16:33:53 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3227
2023-01-30 16:33:54 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3228
2023-01-30 16:33:54 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3229
2023-01-30 16:33:54 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3230
2023-01-30 16:33:54 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3231
2023-01-30 16:33:54 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3232
2023-01-30 16:33:54 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3233
2023-01-30 16:33:54 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3234
2023-01-30 16:33:54 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3235
2023-01-30 16:33:54 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3236
2023-01-30 16:33:54 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3237
2023-01-30 16:33:54 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3238
2023-01-30 16:33:54 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3239
2023-01-30 16:33:55 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3240
2023-01-30 16:33:55 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3241
2023-01-30 16:33:55 [   DEBUG] Ignoring corrupt packet as part in whitelist. Part: 3242
2023-01-30 16:33:55 [   ERROR] Corrupt packet encountered. Part: 3243
2023-01-30 16:33:55 [   ERROR] Corrupt packet encountered. Part: 3244
2023-01-30 16:33:55 [   ERROR] Corrupt packet encountered. Part: 3245
2023-01-30 16:33:55 [   ERROR] Corrupt packet encountered. Part: 3246
2023-01-30 16:33:55 [   ERROR] Corrupt packet encountered. Part: 3247
2023-01-30 16:33:55 [   ERROR] Corrupt packet encountered. Part: 3248
2023-01-30 16:33:56 [   ERROR] Corrupt packet encountered. Part: 3249
2023-01-30 16:33:56 [   ERROR] Corrupt packet encountered. Part: 3250
2023-01-30 16:33:56 [   ERROR] Corrupt packet encountered. Part: 3251
2023-01-30 16:33:56 [   ERROR] Corrupt packet encountered. Part: 3252
2023-01-30 16:33:56 [   ERROR] Corrupt packet encountered. Part: 3253
2023-01-30 16:33:56 [   ERROR] Corrupt packet encountered. Part: 3254
2023-01-30 16:33:56 [   ERROR] Corrupt packet encountered. Part: 3255
2023-01-30 16:33:56 [   ERROR] Corrupt packet encountered. Part: 3256
2023-01-30 16:33:56 [   ERROR] Corrupt packet encountered. Part: 3257
2023-01-30 16:33:56 [   ERROR] Corrupt packet encountered. Part: 3258
2023-01-30 16:33:56 [   ERROR] Corrupt packet encountered. Part: 3259
2023-01-30 16:33:57 [   ERROR] Corrupt packet encountered. Part: 3260
2023-01-30 16:34:08 [   ERROR] Corrupt segments found while converting VOD. Attempting to retry parts:
3243, 3244, 3245, 3246, 3247, 3248, 3249, 3250, 3251, 3252, 3253, 3254, 3255, 3256, 3257, 3258, 3259, 3260
...

2023-01-30 16:44:59 [   ERROR] Error downloading VOD 1712901026.
Traceback (most recent call last):
  File "/home/downloader/twitch-archiver/src/processing.py", line 428, in get_vod_connector
    Utils.convert_vod(vod_json, muted_segments, print_progress=False if self.quiet else True)
  File "/home/downloader/twitch-archiver/src/utils.py", line 274, in convert_vod
    raise CorruptPartError(corrupt_parts, formatted_ranges)
src.exceptions.CorruptPartError: Corrupt parts found when converting VOD file. Parts: ['3243-3260.ts']

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/downloader/twitch-archiver/src/processing.py", line 454, in get_vod_connector
    Utils.convert_vod(vod_json, muted_segments, print_progress=False if self.quiet else True)
  File "/home/downloader/twitch-archiver/src/utils.py", line 274, in convert_vod
    raise CorruptPartError(corrupt_parts, formatted_ranges)
src.exceptions.CorruptPartError: Corrupt parts found when converting VOD file. Parts: ['3243-3260.ts']

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/downloader/twitch-archiver/src/processing.py", line 457, in get_vod_connector
    raise VodDownloadError(
src.exceptions.VodDownloadError: Video download failed. Error: Corrupt part(s) still present after retrying VOD download. Ensure VOD is still available and either delete the listed #####.ts part(s) from 'parts' folder or entire 'parts' folder if issue persists.

Operating system Linux

Additional context This seems to be a problem with this particular VoD. I have already tried to download it multiple times, and it is the only one with so many errors. Maybe add a switch to ignore corrupted files?
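
For readers following the log: the `Parts: ['3243-3260.ts']` value in the traceback is a compact form of the 18 part numbers listed in the retry message. Below is a minimal sketch of how consecutive part numbers could be collapsed into such range strings. The function name `format_ranges` is hypothetical and is not taken from twitch-archiver; the actual implementation in `utils.py` may differ.

```python
def format_ranges(parts):
    """Collapse part numbers into 'start-end.ts' or 'n.ts' strings.

    Hypothetical helper for illustration only; not twitch-archiver code.
    """
    ordered = sorted(set(parts))
    ranges = []
    start = prev = None
    for number in ordered:
        if start is None:
            # First part number opens the first run.
            start = prev = number
        elif number == prev + 1:
            # Still consecutive: extend the current run.
            prev = number
        else:
            # Gap found: close the current run and open a new one.
            ranges.append((start, prev))
            start = prev = number
    if start is not None:
        ranges.append((start, prev))
    return [f"{a}.ts" if a == b else f"{a}-{b}.ts" for a, b in ranges]


if __name__ == "__main__":
    print(format_ranges(range(3243, 3261)))
```

Isolated parts are kept as single entries, so a mixed list such as `[1, 2, 4]` would produce two entries rather than one range.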

Brisppy commented 1 year ago

Part corruptions have been an issue I've been tackling for a while. Dozens of edge cases pop up, and many of them cannot be resolved automatically.

I already have a patch that fixes this issue (6ceea4b7), but it can cause problems in rare cases: if corrupt parts intersect with muted segments that were downloaded with the stream downloader, the whole stream may have to be downloaded again.

I'll push this to the develop branch for now while I do some more testing.
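
The log above already mentions a whitelist of corrupt parts ("Ignoring corrupt packet as part in whitelist"). One possible way to decide which parts belong on such a whitelist is sketched below: parts that are still corrupt after being downloaded again are assumed to be corrupt at the source. This is only an illustration under that assumption. The function `update_whitelist` and its parameters are hypothetical, it is not the code from 6ceea4b7, and it does not handle the muted-segment case described above.

```python
def update_whitelist(whitelist, corrupt_first_pass, corrupt_after_retry):
    """Return (new_whitelist, unresolved) after one retry of the corrupt parts.

    Hypothetical sketch, not twitch-archiver code. Parts that are corrupt
    both before and after the retry are assumed to be corrupt at the source
    and are whitelisted. Parts that only show up as corrupt after the retry
    stay unresolved so the caller can retry them again.
    """
    known = set(whitelist)
    first = set(corrupt_first_pass) - known
    retry = set(corrupt_after_retry) - known
    persistent = first & retry          # corrupt in both passes
    unresolved = retry - persistent     # newly corrupt, not yet confirmed
    return known | persistent, sorted(unresolved)


if __name__ == "__main__":
    new_whitelist, unresolved = update_whitelist(
        {3225, 3226}, range(3243, 3261), range(3243, 3261)
    )
    print(sorted(new_whitelist), unresolved)
```

Requiring a part to fail twice before it is whitelisted keeps a one-off network error during download from being mistaken for corruption in the stream itself.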

koroban commented 1 year ago

Thank you for your help. You're right, it fixed the problem for this video, but I now get the following error about muted audio for another video:

Feb 04 15:15:04 twitchdl python3[369]: 2023-02-04 15:15:04 [   ERROR] Error downloading VOD 1724162425.
Feb 04 15:15:04 twitchdl python3[369]: Traceback (most recent call last):
Feb 04 15:15:04 twitchdl python3[369]:   File "/home/downloader/twitch-archiver/src/processing.py", line 657, in get_vod
Feb 04 15:15:04 twitchdl python3[369]:     self.download.get_m3u8_video(m3u8.loads(vod_playlist), vod_base_url, vod_json['store_directory'])
Feb 04 15:15:04 twitchdl python3[369]:   File "/home/downloader/twitch-archiver/src/downloader.py", line 64, in get_m3u8_video
Feb 04 15:15:04 twitchdl python3[369]:     str('{:05d}'.format(int(ts_id.split('.')[0].replace('-muted', ''))) + '.ts')))
Feb 04 15:15:04 twitchdl python3[369]: ValueError: invalid literal for int() with base 10: '18-unmuted'
Feb 04 15:15:04 twitchdl python3[369]: During handling of the above exception, another exception occurred:
Feb 04 15:15:04 twitchdl python3[369]: Traceback (most recent call last):
Feb 04 15:15:04 twitchdl python3[369]:   File "/home/downloader/twitch-archiver/src/processing.py", line 393, in get_vod_connector
Feb 04 15:15:04 twitchdl python3[369]:     self.get_vod(vod_json, get_video, get_chat, vod_live)
Feb 04 15:15:04 twitchdl python3[369]:   File "/home/downloader/twitch-archiver/src/processing.py", line 667, in get_vod
Feb 04 15:15:04 twitchdl python3[369]:     raise VodDownloadError(e)
Feb 04 15:15:04 twitchdl python3[369]: src.exceptions.VodDownloadError: Video download failed. Error: invalid literal for int() with base 10: '18-unmuted'
Feb 04 15:15:04 twitchdl python3[369]: 2023-02-04 15:15:04 [    INFO] Now processing VOD: 1720086285
Feb 04 15:15:05 twitchdl python3[369]: 2023-02-04 15:15:05 [    INFO] VOD offline.
Feb 04 15:15:07 twitchdl python3[369]: 2023-02-04 15:15:07 [    INFO] Grabbing video...
Feb 04 15:15:08 twitchdl python3[369]: 2023-02-04 15:15:08 [   ERROR] Error downloading VOD 1720086285.
Feb 04 15:15:08 twitchdl python3[369]: Traceback (most recent call last):
Feb 04 15:15:08 twitchdl python3[369]:   File "/home/downloader/twitch-archiver/src/processing.py", line 657, in get_vod
Feb 04 15:15:08 twitchdl python3[369]:     self.download.get_m3u8_video(m3u8.loads(vod_playlist), vod_base_url, vod_json['store_directory'])
Feb 04 15:15:08 twitchdl python3[369]:   File "/home/downloader/twitch-archiver/src/downloader.py", line 64, in get_m3u8_video
Feb 04 15:15:08 twitchdl python3[369]:     str('{:05d}'.format(int(ts_id.split('.')[0].replace('-muted', ''))) + '.ts')))
Feb 04 15:15:08 twitchdl python3[369]: ValueError: invalid literal for int() with base 10: '3096-unmuted'
Feb 04 15:15:08 twitchdl python3[369]: During handling of the above exception, another exception occurred:
Feb 04 15:15:08 twitchdl python3[369]: Traceback (most recent call last):
Feb 04 15:15:08 twitchdl python3[369]:   File "/home/downloader/twitch-archiver/src/processing.py", line 393, in get_vod_connector
Feb 04 15:15:08 twitchdl python3[369]:     self.get_vod(vod_json, get_video, get_chat, vod_live)
Feb 04 15:15:08 twitchdl python3[369]:   File "/home/downloader/twitch-archiver/src/processing.py", line 667, in get_vod
Feb 04 15:15:08 twitchdl python3[369]:     raise VodDownloadError(e)
Feb 04 15:15:08 twitchdl python3[369]: src.exceptions.VodDownloadError: Video download failed. Error: invalid literal for int() with base 10: '3096-unmuted'

I am sorry to cause you so much trouble.

Brisppy commented 1 year ago

The issue seems to be caused by segments that were muted by automated copyright detection and later unmuted. A fix is now available on the develop branch in 26145e5564be94cee7086700b9592f9e4c64e833.
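
The traceback shows why this fails: the expression in `downloader.py` strips `-muted` from the segment name before calling `int()`, so `18-unmuted` becomes `18-un` minus nothing useful and is passed on as `'18-unmuted'`. Below is a minimal sketch of a parser that accepts both suffixes. It assumes segment names have the form `N.ts`, `N-muted.ts` or `N-unmuted.ts`; the function name `local_part_name` is hypothetical, and this is not necessarily how the referenced commit fixes it.

```python
import re

# Assumed segment name forms: "18.ts", "18-muted.ts", "18-unmuted.ts".
_SEGMENT_RE = re.compile(r"^(\d+)(?:-(?:muted|unmuted))?\.ts$")


def local_part_name(ts_id):
    """Map a playlist segment name to a zero-padded local file name.

    Hypothetical helper for illustration only; not twitch-archiver code.
    """
    match = _SEGMENT_RE.match(ts_id)
    if match is None:
        # Fail loudly on names that do not fit the assumed pattern.
        raise ValueError(f"Unrecognised segment name: {ts_id!r}")
    return f"{int(match.group(1)):05d}.ts"


if __name__ == "__main__":
    for name in ("18-unmuted.ts", "3096-muted.ts", "42.ts"):
        print(name, "->", local_part_name(name))
```

Matching the whole name against a pattern, rather than removing one known suffix, means an unexpected variant raises a clear error instead of an `int()` failure deep in the download loop.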

> I am sorry to cause you so much trouble.

No worries, the more issues people find and report, the better I can make this tool for everyone :D