ripose-jp / Memento

An mpv-based video player for studying Japanese
https://ripose-jp.github.io/Memento/
GNU General Public License v2.0

[ytdl_hook] ERROR: Unable to extract uploader id; please report this issue on https://yt-dl.org/bug #159

Closed teto closed 1 year ago

teto commented 1 year ago

I am trying to package Memento for the Nix package manager (https://github.com/NixOS/nixpkgs/pull/230786).

I am trying out youtube-dl:

$ result/bin/memento
warning: queue 0x25c67d0 destroyed while proxies still attached:
  wl_registry@18 still attached
Error: GDBus.Error:org.freedesktop.DBus.Error.UnknownMethod: No such interface “org.freedesktop.portal.OpenURI” on object at path /org/freedesktop/portal/desktop
Error from org.freedesktop.ScreenSaver "The name org.freedesktop.ScreenSaver was not provided by any .service files"
[ytdl_hook] ERROR: Unable to extract uploader id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output. 
[ytdl_hook] youtube-dl failed: unexpected error occurred 
[ytdl_hook] It appears that your youtube-dl version is severely out of date. 
Failed to recognize file format.

The youtube-dl version is youtube-dl-2021.12.17, but checking the official page, it looks like that is the latest?

NB: on startup I see [screenshot: 20230511_00h00m10s_grim]. Where is Memento looking for those dictionaries? Can I declare the locations via an environment variable, a flag, or at compile time?

ripose-jp commented 1 year ago

The youtube-dl version is youtube-dl-2021.12.17, but checking the official page, it looks like that is the latest?

youtube-dl's development has unfortunately wound down and seems to have been entirely superseded by yt-dlp. Both mpv and Memento prefer yt-dlp if it is present.
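For packagers who want to see which downloader would be picked up, the preference described above can be sketched roughly as follows. This is only an illustration of the "yt-dlp first, youtube-dl as fallback" ordering, not mpv's or Memento's actual lookup code; `pick_downloader` and `PREFERRED_DOWNLOADERS` are hypothetical names.

```python
import shutil
from typing import Callable, Optional, Sequence

# Assumed preference order, taken from the comment above:
# yt-dlp first, youtube-dl only as a fallback.
PREFERRED_DOWNLOADERS: Sequence[str] = ("yt-dlp", "youtube-dl")


def pick_downloader(
    candidates: Sequence[str] = PREFERRED_DOWNLOADERS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    found = pick_downloader()
    print(found or "neither yt-dlp nor youtube-dl found on PATH")
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` searches the current PATH.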

I don't bundle any dictionaries with Memento due to licensing.

Where is Memento looking for those dictionaries?

In the configuration folder, in dictionaries.sqlite.

Can I declare the locations via an environment variable, a flag, or at compile time?

No.

teto commented 1 year ago

Indeed, using yt-dlp fixed the first issue. Since youtube-dl doesn't seem to work anymore, maybe the warning should be "just use yt-dlp", or don't even check for youtube-dl.

I don't bundle any dictionaries with Memento due to licensing.

Ha. So how does it find the dictionaries? Does it maintain a list of absolute paths in ~/.config/memento? I am trying to have a fully declarative install of Memento, i.e. not having to manually install those dictionaries.

ripose-jp commented 1 year ago

maybe the warning should be "just use yt-dlp", or don't even check for youtube-dl.

I don't disagree, but the error message is generated by mpv, so you're barking up the wrong tree.

So how does it find the dictionaries? Does it maintain a list of absolute paths in ~/.config/memento?

Here

I am trying to have a fully declarative install of Memento, i.e. not having to manually install those dictionaries.

I don't support this approach because by bundling a set of dictionaries you make a set of assumptions about users which may not be true. People use a wide variety of different dictionaries in different combinations because they may speak different languages fluently or be far enough advanced in their study to find Japanese-Japanese dictionaries useful. Ignorance may be a problem, but that's why the message exists, so people have somewhere to start when it comes to dictionaries.

teto commented 1 year ago

I don't disagree, but the error message is generated by mpv, so you're barking up the wrong tree.

Ha, sorry. Thank you for the clarification.

Thanks for the link to the code, that's helpful.

Just to clarify what I am trying to achieve: I don't want to bundle the Memento package with dictionaries. With the Nix ecosystem, users can go very far with declarative systems, so I just wanted to provide a way for users to place dictionaries where Memento finds them. I am a bit swamped at the moment, so it will have to wait; I have to test Memento further too.

Thanks for the help !

teto commented 1 year ago

First, a note to myself: I've added the dictionaries manually and they are unpacked in:

$ ls -l ~/.config/memento/res/
total 20
drwxr-xr-x 2 teto users 4096 05-23 03:32 'Innocent Corpus'
drwxr-xr-x 2 teto users 4096 05-23 03:33 'JMdict (English)'
drwxr-xr-x 2 teto users 4096 05-23 03:33 'KANJIDIC (English)'

So it's trivial to configure declaratively. It's such cool software, especially with the ability to load both subtitles from YouTube out of the box!

Then I have a follow-up question in terms of packaging, @ripose-jp: the README lists an OCR capability. That looks interesting, but manga-ocr is not packaged in nixpkgs. Before I decide whether to package it as well, I would like to know how useful it is. Does it scan the video in real time to look for text? Does it have to be paused? Would you qualify it as an experimental feature, or does it work well already?

ripose-jp commented 1 year ago

Does it scan the video in real time to look for text?

No. The user manually selects an area of the frame to scan.

Does it have to be paused?

Theoretically no, but triggering it unconditionally pauses the video because it's hard to select an area of text if the video is playing.

Would you qualify it as an experimental feature, or does it work well already?

It is not experimental, it works fine.

Here's the reason why OCR isn't shipped in binary releases:

  1. manga-ocr's dependencies are massive, several gigabytes if I recall correctly.
  2. Memory usage increases about 10x. [Image: playing 1080p video with OCR] [Image: playing 1080p video without OCR]
  3. It adds a hard-dependency on Python. This isn't the worst thing in the world since if mpv is compiled with Vapoursynth support, Python is already pulled in.
  4. Bundling pip libraries in binary releases is a pain in the ass.
  5. On launch, Memento has to connect to HuggingFace's servers to download the manga-ocr model. This can be a privacy concern, or plain annoying if disconnected from the internet.

OCR is useful, but it's not useful enough to justify saddling the average user with these downsides.

teto commented 1 year ago

Thanks a lot, you make the tradeoff crystal clear. I will package manga-ocr because I want to use Poricom (https://github.com/blueaxis/Poricom). If Memento's build-time (CMake) setting to enable OCR just pulls in Python, then I might enable it by default. manga-ocr is picked up at runtime (I think?), so users can decide whether to install manga-ocr or not.

ripose-jp commented 1 year ago

It is picked up at runtime, but right now Memento doesn't handle the case where manga-ocr isn't found at runtime. All the OCR features will still appear on the frontend, but they'll do nothing and not notify the user that they're doing nothing. Ideally I'd like to hide features if manga-ocr isn't found, but I never implemented that because I assumed anyone that compiled Memento with OCR support would be pretty deliberate in making sure manga-ocr is available.
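For illustration, a check of that kind could look roughly like the sketch below on the Python side. This is not Memento's code and nothing here confirms how Memento would implement it; `ocr_backend_available` is a hypothetical helper, and the only assumption about manga-ocr is that it is importable as the `manga_ocr` module.

```python
import importlib.util


def ocr_backend_available(module_name: str = "manga_ocr") -> bool:
    """Return True if the module can be located, without importing it.

    Locating the module (instead of importing it) avoids loading the
    heavy OCR dependencies just to decide whether to show the feature.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken or partially initialised modules;
        # treat those cases as "not available".
        return False


if __name__ == "__main__":
    if ocr_backend_available():
        print("manga-ocr found: OCR features could be shown")
    else:
        print("manga-ocr not found: OCR features could be hidden")
```

A packager could run the same check inside the Python environment Memento is built against to confirm that an optional manga-ocr install is actually visible.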