sylviiu / ytdlp-pybridge

A simple python bridge that ezytdl depends on.
5 stars 0 forks source link

Avoid monkey-patching `YoutubeDL` #1

Open pukkandan opened 9 months ago

pukkandan commented 9 months ago

I noticed you are monkey patching the Youtube-DL instance in order to capture output. This method may break with future versions of yt-dlp. The correct way to do this is to pass your own logger. See https://github.com/yt-dlp/yt-dlp#adding-logger-and-progress-hook for example

https://github.com/sylviiu/ytdlp-pybridge/blob/622482c6a100cb4e1a6b51fb47a87f6a690dbc77/actions.py#L83-L85

Feel free to close the issue if you don't intend to change it. I am just letting you know since I noticed this quirk.

sylviiu commented 9 months ago

I do intend on changing this! I hate monkeypatching long-term (especially since this is built on nightly), I just never took too much time looking into how to create my own logger instance that works the way it does now.

Thank you for the reminder though 😅 I completely spaced it

edit: okay i REALLY didn't take much time; i swear i looked for any relevant docs before monkey-patching but it's all right there

sylviiu commented 9 months ago

I think I remember why I didn't create a logger instance now. The only issue I had was that I was aiming for this script to provide info & err logs the same way as you would by running the yt-dlp binary / script in the terminal (e.g., all info / debugs are sent to stderr, and stuff like --dump-single-json would be sent to stdout), which the YoutubeDL API does not do the same with.

An alternative idea I had would be to match debug logs by a regex, and redirect those to error if they match (like [info] and [error] universally) https://github.com/sylviiu/ytdlp-pybridge/blob/2e7abdc2921be285023b16e486fb1a4b6788515e/c/logger.py#L9-L16 ...would this be a good alternative?

edit: looking through yt-dlp readme, I now realize that this behavior only happens if -q is passed, and code's been updated accordingly:

returnOptions['options']['logger'] = Logger(hook, ((opt.__contains__('--quiet') or opt.__contains__('-q')) and (opt.__contains__('--verbose') or opt.__contains__('-v'))))

...where the second option defines if debugs should be redirected to error

sylviiu commented 9 months ago

On another note, it doesn't look like the output dump_single_json gets sent to the logger instance, but rather is sent to stdout itself. Is there another way of capturing this (and maybe others that could be similar)?

pukkandan commented 9 months ago

If you want 100% the same behavior, overriding ydl.to_stdout seems to be the only easy way 😞

Printing self.sanitize_info(ydl.extract_info(url)) gets you 99% there, but there are some differences in error handling. An exact implementation would be:

        for url in urls:
            try:
                res = ydl.extract_info(
                    url, force_generic_extractor=ydl.params.get('force_generic_extractor'))
            except yt_dlp.utils.UnavailableVideoError as e:
                ydl.report_error(e)
            except yt_dlp.utils.DownloadCancelled as e:
                ydl.to_screen(f'[info] {e}')
                if not ydl.params.get('break_per_url'):
                    raise
                ydl._num_downloads = 0
            else:
                if ydl.params.get('dump_single_json'):
                    ydl.post_extract(res)
                    ydl.to_stdout(json.dumps(self.sanitize_info(res)))

I'll look into how to improve this API, but there's nothing I can do immediately about this

sylviiu commented 9 months ago

Because of this repo building on main branch, I didn't want to re-implement certain functions myself (like the example provided),; I wouldn't be able to focus on this at all times, which is why I was looking for the same behavior.

Looking further into it, I noticed there's to_stdout and to_stderr, both of which call _write_string using _out_files.out, _out_files.screen or _out_files.error -- would overriding these be a little easier (or at least a little better than overriding the _write_string methods)? From what I can tell, this will still go through write_string's formatting.