sheodox / context.reviews

Learn Japanese by studying the words you find reading native materials!
https://context.reviews

Get phrases from ASS/SRT subtitles #45

Closed: devdouglasonofre closed this issue 2 years ago

devdouglasonofre commented 2 years ago

Improvement suggestion: Extract text from ASS/SRT subtitles.

Why? For poems, novels, and song lyrics, the phrase extraction works like a charm. But in the case of subtitle files, copying text out of them can be cumbersome and time-consuming. Having an automated subtitle-to-CR phrase import would save a lot of time.

Use case: I want to mine the vocabulary of an anime/drama/movie straight from the subtitles.

devdouglasonofre commented 2 years ago

This library may be useful.

sheodox commented 2 years ago

I actually have another project that deals with subtitles, called Jimaku Player, and I've written my own parser and renderer for ASS and SRT: https://github.com/sheodox/jimaku-player

I'm not sure what you use to watch anime, but if you use VRV or Crunchyroll you could use that and watch anime there with your SRT/ASS file. If you click on a subtitle it will search it on Jisho so you could use that to get your subtitles into CR. I had planned on building in better integration which skips Jisho and just adds it right to CR.

I do like the idea of supporting importing phrases from subtitles or something directly though. I'll think about it.

devdouglasonofre commented 2 years ago

Just tested Jimaku Player and it works wonderfully. It's like Anacreon, but without the need to have 3 things open to start mining. Great tool.

Is there a way to get the entire subtitles for an episode in Jimaku Player without having to watch the entire episode? I mean, I want to mine it before watching, so with the entire subtitles parsed to plain text, I could just paste them into CR and start mining from there.

devdouglasonofre commented 2 years ago

By the way, another suggestion: a button to delete all phrases without having to export to Anki.

sheodox commented 2 years ago

Yeah the subtitle thing is a great idea. I'll put it on my todo list.

> By the way, another suggestion: a button to delete all phrases without having to export to Anki.

I actually have that already, but it's kind of hidden. If you click your profile picture in the top right corner and go to Settings, then the Stats tab, there's a delete all button! I'll probably put that in a more obvious place somewhere.


devdouglasonofre commented 2 years ago

Thanks for pointing that out!

sheodox commented 2 years ago

@devdouglasonofre I just deployed an update that adds this functionality, try it out!!


Now when you're viewing the phrase list there's a + Import Phrases button that will let you choose either plain text or subtitles. If you choose subtitles you can upload a .srt/.vtt/.ass/.ssa file and it will give you a list of all the subtitles in there and you can pick what you want to add.


I also moved both of those buttons from the Stats page to a '...' button on the phrase list's toolbar.
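
For readers curious what pulling phrases out of an .srt upload can look like, here is a minimal sketch in Python. It is not the parser context.reviews or Jimaku Player actually uses; the function name `srt_phrases`, the choice to join multi-line cues without a separator (reasonable for Japanese, which has no spaces between words), and the de-duplication step are all assumptions made for illustration.

```python
import re

# An SRT timing line, e.g. "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"^\d{1,2}:\d{2}:\d{2}[,.]\d{1,3}\s+-->\s+\d{1,2}:\d{2}:\d{2}[,.]\d{1,3}"
)
# Inline markup sometimes found in cues: <i>...</i> tags or {\an8}-style overrides.
MARKUP = re.compile(r"<[^>]*>|\{[^}]*\}")


def srt_phrases(srt_text: str) -> list[str]:
    """Return the unique cue texts of an SRT document, in order of appearance."""
    phrases: list[str] = []
    # Drop a leading byte-order mark and normalize Windows line endings.
    normalized = srt_text.lstrip("\ufeff").replace("\r\n", "\n")
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = [line.strip() for line in block.split("\n")]
        for index, line in enumerate(lines):
            if TIMING.match(line):
                # Everything after the timing line is the cue text.
                # Assumption: join wrapped lines with no separator (Japanese text).
                text = "".join(MARKUP.sub("", part) for part in lines[index + 1:])
                if text and text not in phrases:
                    phrases.append(text)
                break
    return phrases
```

Locating the timing line, rather than assuming it is always the second line of a block, lets the sketch cope with files that omit the numeric cue counter.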

devdouglasonofre commented 2 years ago

Amazing work! Tested with an SRT file and it worked like a charm!

However, when I tried with an .ass file: [image]

This is the .ass file that I was trying to import: grisaia kaijutsu 1.txt (converted to .txt because GitHub doesn't allow uploading .ass files, but the content is the same).

sheodox commented 2 years ago

Hey, good catch! That subtitle file has an extra blank line in amongst the subtitles and my parser got confused. I fixed it now though, so it should be able to handle that!
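
The thread does not show the actual fix, so the following is only a sketch of one way an ASS reader can tolerate stray blank lines: skip them outright instead of treating them as the end of the events list. The function name `ass_phrases` is hypothetical, and the code assumes the file declares a `Format:` line inside `[Events]` with `Text` as the last field, which is how ASS files are conventionally laid out.

```python
import re

# ASS override blocks such as {\i1} or {\pos(10,20)}.
OVERRIDE = re.compile(r"\{[^}]*\}")


def ass_phrases(ass_text: str) -> list[str]:
    """Return the dialogue texts from the [Events] section of an ASS/SSA document."""
    phrases: list[str] = []
    fields: list[str] = []
    in_events = False
    for raw in ass_text.lstrip("\ufeff").splitlines():
        line = raw.strip()
        # Blank lines and ';' comments carry no data, so skip them rather than
        # letting them end the section.
        if not line or line.startswith(";"):
            continue
        if line.startswith("["):
            in_events = line.lower() == "[events]"
            continue
        if not in_events:
            continue
        key, _, value = line.partition(":")
        if key == "Format":
            fields = [name.strip() for name in value.split(",")]
        elif key == "Dialogue" and "Text" in fields:
            # Limit the split so commas inside the trailing Text field survive.
            parts = value.split(",", len(fields) - 1)
            if len(parts) < len(fields):
                continue  # malformed line, ignore it
            text = OVERRIDE.sub("", parts[fields.index("Text")])
            # \N and \n are line breaks, \h is a hard space.
            text = text.replace("\\N", "").replace("\\n", "").replace("\\h", " ")
            if text.strip():
                phrases.append(text.strip())
    return phrases
```

Only a new `[Section]` header ends the events list here, so an empty line between two `Dialogue:` entries changes nothing.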

devdouglasonofre commented 2 years ago

Working perfectly now! Thanks a lot for your hard work!

sheodox commented 2 years ago

Nice!! You're welcome, and thanks to you too for the good ideas!