tubearchivist / tubearchivist

Your self hosted YouTube media server
https://www.tubearchivist.com
GNU General Public License v3.0

Question/Feature Request - Subtitle download & indexing #59

Closed TheOneValen closed 2 years ago

TheOneValen commented 3 years ago

Are subtitles downloaded and indexed? Then one could search by what was said in a video.

bbilly1 commented 3 years ago

Giving a whole new meaning to "full text search", hmm! :-)

But no, they aren't currently indexed. I guess usefulness varies, some of these auto generated subtitles are quite bad, but when the uploader adds them, that's usually the script they are using anyways, so that could be very nice for searching.

This will add quite some text, but Elasticsearch shines in this regard. Definitely putting a pin in that, thanks for the idea!

TheOneValen commented 3 years ago

Even when the subtitles are bad (one could use fuzzy search?), you could jump directly to the time mark and hear for yourself. I don't think even YouTube offers that feature (does it?).

Elasticsearch should be perfect for that, true :)
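The fuzzy-search idea above can be sketched roughly as follows. This is only an illustration, not how Tube Archivist implements search: the field name `subtitle_line` and both helper functions are hypothetical. The `match` query with a `fuzziness` setting is part of the standard Elasticsearch query DSL, and the timestamp helper shows how a matched WebVTT cue could be turned into a position to jump to.

```python
def build_subtitle_query(text, fuzziness="AUTO"):
    """Build an Elasticsearch request body for a typo-tolerant subtitle search."""
    return {
        "query": {
            "match": {
                # "subtitle_line" is a hypothetical field holding one cue's text
                "subtitle_line": {"query": text, "fuzziness": fuzziness}
            }
        }
    }


def vtt_timestamp_to_seconds(stamp):
    """Convert a WebVTT timestamp (hh:mm:ss.ttt or mm:ss.ttt) to seconds."""
    parts = stamp.split(":")
    seconds = float(parts[-1])
    minutes = int(parts[-2])
    # The hours field is optional in WebVTT
    hours = int(parts[-3]) if len(parts) == 3 else 0
    return hours * 3600 + minutes * 60 + seconds
```

With the start time of a matching cue converted to seconds, a player could seek straight to the spot where the phrase was spoken.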

gonjat commented 3 years ago

I second this feature. My wife likes Asian dramas and having ENG subs would be amazing. Thank you.

dfein38347g commented 2 years ago

Subtitles are the one killer feature I'm waiting for. I think the download client is capable. Is it possible to enable turning that "switch" on and integrate it into the search engine later?
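For context, a minimal sketch of what turning that "switch" on might look like, assuming the download client is yt-dlp used through its Python API. The option keys below are existing yt-dlp parameters, but the helper name and the chosen values are illustrative and not necessarily what Tube Archivist ends up using.

```python
def subtitle_options(languages=("en",), include_auto_generated=True):
    """Return yt-dlp options that request subtitle files alongside the video."""
    return {
        "writesubtitles": True,                       # uploader-provided subtitles
        "writeautomaticsub": include_auto_generated,  # auto-generated captions
        "subtitleslangs": list(languages),            # which languages to fetch
        "subtitlesformat": "vtt",                     # preferred subtitle format
    }


# Hypothetical usage, requires the yt-dlp package:
# from yt_dlp import YoutubeDL
# with YoutubeDL(subtitle_options()) as ydl:
#     ydl.download(["<video url>"])
```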

bbilly1 commented 2 years ago

OK, we have subtitles in v0.1.1 now. Please update. :-)

TheOneValen commented 2 years ago

Thank you for your great work! Choosing the incremental approach is a good idea; everything the feature request entailed is too big a chunk. Should I open a follow-up issue for the search?

Is it possible to download the subtitles for existing videos?

bbilly1 commented 2 years ago

Searching is planned, probably together with some other search improvements. It's happening at some point.

For existing videos they are downloaded with the refreshing task that's already set up. I was thinking of also putting something into the rescan filesystem functionality.

PAHXO commented 2 years ago

Loving this feature, especially the searching!

Although I'm not sure if there's a standard for the subtitle file name format, Jellyfin recommends this. Currently, Tube Archivist uses the "-lang.format" format (e.g. "-en.vtt") instead of ".lang.format" (".en.vtt").
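Until the naming changes, the difference between the two conventions can be bridged with a small rename helper. The following is a sketch only: the function name and the regular expression are assumptions, and it guesses that the language code is two or three lowercase letters, so a file name that happens to end in a dash plus a short lowercase word would be converted as well.

```python
import re

# Matches names like "video-en.vtt": stem, dash, language code, extension
_DASH_LANG = re.compile(r"^(?P<stem>.+)-(?P<lang>[a-z]{2,3})\.(?P<ext>vtt|srt)$")


def to_dot_style(name):
    """Rewrite 'stem-lang.ext' as 'stem.lang.ext'; leave other names untouched."""
    match = _DASH_LANG.match(name)
    if match is None:
        return name
    return f"{match['stem']}.{match['lang']}.{match['ext']}"
```

For example, `to_dot_style("abc123-en.vtt")` yields `"abc123.en.vtt"`, while a name such as `"abc123.mp4"` is returned unchanged.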

Extarys commented 2 years ago

Offtopic: @PAHXO Since you are mentioning Jellyfin, have you found a way to display the Tube Archivist videos in Jellyfin? I tried using a TV Series type library and it works-ish, but since all videos are in a single folder I need a lot of manual work to display them... better.

PAHXO commented 2 years ago

I just map the Tube Archivist media folder to Jellyfin as 'other' and name it "YouTube". Then each channel automatically gets its own sub-folder, and it works just fine, just without playlist support.

Tho, for some reason when a channel is first added to Jellyfin, it is recognized as a Series (sometimes with multiple seasons). Then, after the download, it becomes a folder (which is best, I think, for now).

Note: You have to embed the metadata and thumbnails so that everything becomes a bit cleaner.

Extarys commented 2 years ago

Yes! As soon as I saw the option I enabled it and configured Jellyfin to read the title from the metadata instead of the filename :wink:

Thanks! Gonna do it like this and give it a try! :heart:

dfein38347g commented 2 years ago

How are the subtitles stored? I don't see them when viewing the videos in Emby or Jellyfin.

bbilly1 commented 2 years ago

@dfein38347g There was an issue on the yt-dlp side with auto-generated subtitles; v0.1.3 has a workaround for that. Subtitle files get stored next to your video media files.
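To check whether subtitle files actually landed next to a given video, something like the following sketch could be used. The function name is hypothetical, and it assumes subtitle files share the video's file name stem followed by "-" or "." and a language code, as described earlier in this thread.

```python
from pathlib import Path


def find_sidecar_subtitles(video_path, extensions=(".vtt", ".srt")):
    """List subtitle files stored in the same folder as the given video file."""
    video = Path(video_path)
    # Accept both "stem-en.vtt" and "stem.en.vtt" naming styles
    prefixes = (video.stem + "-", video.stem + ".")
    found = []
    for candidate in sorted(video.parent.iterdir()):
        if candidate == video or candidate.suffix not in extensions:
            continue
        if candidate.name.startswith(prefixes):
            found.append(candidate)
    return found
```

An empty result for a video that should have subtitles would suggest that they were not downloaded, rather than that the media server is failing to pick them up.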