aklinker1 opened 2 years ago
`yt-dlp` cannot be executed in the browser, so if we wanted to use it directly, we would need a Node.js microservice dedicated to getting subtitles ⛔

So I looked into how `yt-dlp` works to see if we can do the same in JS, but with only what we need (subtitles from the sites Anime Skip supports). A few observations:
- `FFmpeg` can extract subtitles from video files, and there is also a really cool WASM port of FFmpeg. BUT, AFAIK we would need to download the entire video file in the browser in order to extract the subtitles, which I am not sure is possible, let alone practical...
- `opensubtitles` has a HUGE database of subtitles in varying languages, with a public API and a couple of wrappers on npm:

  > Your consumer can query the API on its own, and download 5 subtitles per requesting IP per 24 hours
It would be a bit surprising to me if we could not get subtitles directly from the platforms. Going direct has the obvious downside that we need to maintain and support each platform individually, but the upside that we do not rely on a 3rd party API that may (or may not) have the subtitle file we need. So I looked a bit further into how we might get access to the subtitles directly, starting with CR Beta.
The only other anime streaming service I pay for is Funimation, so I moved on to see if I could find subtitles in its network requests and...

This is really great news - I suspect that unless the anime is hardsubbed, we might be able to sniff the network requests and find what we are looking for! Regardless, I moved on to a fallback option, that being opensubtitles.
Search results can be returned as XML by appending `/xml` to the search URL, so we could parse them with something like fast-xml-parser. They request that you contact them before using this feature to write your own software; I am confident that at most they would just ask that we have a VIP membership, which is 10 euro / year. I have no problem paying that myself if their service fits our needs perfectly.
Interesting. I had a feeling getting subtitles wasn't gonna be easy. Microservices aren't out of the question, but it seems like they would be a pain to maintain, just like a JS implementation per site. As of now, the third party APIs seem like the way to go, is that right?

What are your thoughts at this point? Still think this is a path worth going down, or is it still too early? I'll defer judgment to you since you've done all the research.
> Microservices aren't out of the question

I'd say we can consider committing to that once we have confirmed that we can derive accurate timestamps from the subtitles...

> third party APIs seem like the way to go, is that right?

Yeah, I intend on investigating opensubtitles more while also trying to find other possibilities.

> What are your thoughts at this point? Still think this is a path worth going down, or is it still too early?

I am still hopeful that we can make a great feature out of this without too much effort, but I'll keep you updated!

Note: I just updated the research comment and am really excited to get your input on it!
> Research update 1
Hmm, good to know that they're available at some endpoint via a network request. In chrome extensions, there are 2 ways to read network requests:
The first is the web request APIs (https://developer.chrome.com/docs/extensions/reference/webRequest/). However, these have been severely limited by Manifest V3's restrictions; specifically, they can't read response bodies. That might not be a problem: if we see a request that ends in `.vtt`, we can make the request ourselves and get the captions.
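A minimal sketch of what that first approach might look like - `isSubtitleRequest` is a hypothetical helper, and its URL patterns are guesses based on the subtitle URLs we've seen so far, not a complete list:

```javascript
// Hypothetical matcher for subtitle requests. The extensions and the
// CR-style `/evs/...*.txt` pattern are assumptions.
function isSubtitleRequest(url) {
  return /\.(vtt|srt|ass)(\?|$)/.test(url) || /\/evs\/.+\.txt\?/.test(url);
}

// Background-script wiring (needs the "webRequest" permission plus host
// permissions for each streaming site). MV3 can't read the response body
// here, so we re-fetch the URL ourselves.
if (typeof chrome !== "undefined" && chrome.webRequest) {
  chrome.webRequest.onCompleted.addListener(
    (details) => {
      if (isSubtitleRequest(details.url)) {
        fetch(details.url)
          .then((res) => res.text())
          .then((vtt) => console.log("captions:", vtt.slice(0, 80)));
      }
    },
    { urls: ["<all_urls>"] }
  );
}
```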
The second is using a `script` block to inject JS into the page's context and overwrite the `fetch` and XHR calls. With that we can read the response for every request. That approach doesn't require additional permissions: if we're already injecting code into the pages that make the requests, we don't need to request more host permissions. The downside is that this is difficult to do and can be blocked by pages' CSP headers, not to mention the privacy issues with us accessing all the responses for a page's network requests.
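A rough sketch of what that page-context patch could look like - a guess at the shape, not a hardened implementation; `installFetchSniffer` and its callback are hypothetical names, and a real version would need to wrap XHR as well:

```javascript
// Patch `fetch` in the page's context so responses can be inspected.
// `installFetchSniffer` is a hypothetical name; the `.vtt/.srt/.ass`
// pattern is an assumption about what subtitle URLs look like.
function installFetchSniffer(onSubtitleResponse) {
  const originalFetch = globalThis.fetch;
  globalThis.fetch = async function (...args) {
    const response = await originalFetch.apply(this, args);
    const url = args[0] instanceof Request ? args[0].url : String(args[0]);
    if (/\.(vtt|srt|ass)(\?|$)/.test(url)) {
      // Clone so the page can still consume the original body.
      onSubtitleResponse(url, response.clone());
    }
    return response;
  };
}
```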
Since the first option requires additional permissions, the second approach would be preferred. That said, I'd like to avoid intercepting network requests at all - it will be difficult to do, hard to maintain, and it will not last long as sites adopt better CSP practices.
I'm not going to maintain that or accept PRs for that, so I'd recommend you stop going down that route. That doesn't mean this still won't work. Is the URL that loads the captions consistent? If so, can we just look that URL up and make the request ourselves?
> opensubtitles.org

I'm curious how many anime they have translations for. Would it be possible to take a sample of episodes that are available on Anime Skip and see if the data we store on episodes works with their API and returns actual results? That way we can get an idea of what percentage of episodes this approach would work for.

> They request that you contact them before using this feature to write your own software

If it's viable, I'll front the costs and reach out to them. They might be a useful API for my future plans for Anime Skip as well... 😏
> The web request APIs.

Is the biggest drawback to this the web request permissions? If so, could we use optional permissions and only ask for the additional permissions if the user enables `Experimental: Predict Intro Timestamps`?

I ask because this seems to be the easiest way to get the subtitles for every anime that has subtitles available. Tools like `yt-dlp` have to build the URLs themselves, while we, as the browser extension, can see them directly. If I am missing some complexity, please call me out on it, but it seems like we could use one regex per supported platform that looks for the subtitle request, then make the request ourselves. We won't have access to a lot of headers via the web request API, but we only need the URL; it looks like the authorization-related IDs are passed as query params, e.g. this Crunchyroll Beta URL:
https://v.vrv.co/evs/46e6d2eb853a0ce66af36372c5d1b1a5/assets/55eptj8r3io0rbh_114153.txt?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cCo6Ly92LnZydi5jby9ldnMvNDZlNmQyZWI4NTNhMGNlNjZhZjM2MzcyYzVkMWIxYTUvYXNzZXRzLzU1ZXB0ajhyM2lvMHJiaF8xMTQxNTMudHh0IiwiQ29uZGl0aW9uIjp7IkRhdGVMZXNzVGhhbiI6eyJBV1M6RXBvY2hUaW1lIjoxNjQxOTg1NDUwfX19XX0_&Signature=qY6m45FImtZBK8xqFJHPmP28CUEkkxtpXnS8g6SzDhkgIAnvAwy3kpaIyJ5NPNgOHomadisu2DBpqWV2AlcUqcf~VbjH8Z4kMLbmQ7nLKlCWHOVurZu5Wx8Mm9vTWLohfbC7JH-p0r-NBmp2VxC-erIWet4PDIbUZv2De9v~fLcJUhVUnhFomhxqVe20fPxZIsjeDpP69xFU9GRCeMJey2LhevdTHj7vB~l2fk9pcg9As17FMJExcbbFg6s5jcEnkBe082ZJKd~hIo2M4um8ShriHURTY1o20uD6fHhoezegks~7gZ7YtIJaefhS2tjGNbRhXbEmVRkAMuUHVQ6qfw__&Key-Pair-Id=APKAJMWSQ5S7ZB3MF5VA
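If the query params really do carry all the auth (`Policy`, `Signature`, and `Key-Pair-Id` in the URL above), the standard URL API is enough to check for them before replaying a request ourselves - `hasSignedQueryAuth` is just a hypothetical helper for that check:

```javascript
// Check whether a sniffed URL carries its auth material as query params
// (so replaying it needs no special headers). The param names come from
// the Crunchyroll Beta URL above; other platforms will differ.
function hasSignedQueryAuth(rawUrl) {
  const params = new URL(rawUrl).searchParams;
  return (
    params.has("Policy") && params.has("Signature") && params.has("Key-Pair-Id")
  );
}
```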
> Using a script block

I totally see your points as to why this is less than ideal, thanks for outlining all this!

> I'd recommend you stop going down that route

Just to clarify, is "that route" = any solution that involves listening to network requests? I still have some hope for option 1, but other than that I am perfectly happy to ditch the network request ideas entirely!

> Can we just look that URL up and make the request ourselves?

Yes, they are consistent, but building them is tricky... as you can see from the CR Beta example above. Funimation is a bit easier than CR, but there are still a couple of IDs that I don't know how to get. Even if we figure it all out and are able to build the URLs, they could change in the future, and that would be super annoying. In a nutshell, we probably could with a fair amount of effort, but with option 1 we also get the upside that we could grab subtitles for any episode of anime that has them.
> see if the data we store on episodes will work with their API and return actual results?

If by "data we store" you mean stuff like the anime name and season & episode numbers, I totally get what you mean, and my initial response is yes, we can get results! We can make a request like the one below:
https://www.opensubtitles.org/en/search/sublanguageid-eng/season-1/episode-2/moviename-attack+on+titan/xml
Then parse that response, pick a subtitle file, and download it.
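Building that search URL from the data we already store could be as simple as the sketch below - a guess at the mapping; `buildSearchUrl` and the input shape are made up for illustration, and the path segments just mirror the example URL above:

```javascript
// Hypothetical builder for the opensubtitles.org XML search URL from the
// data Anime Skip already stores (show name, season, episode). Not an
// official client; the URL structure is copied from the example above.
function buildSearchUrl({ name, season, episode, language = "eng" }) {
  const movieName = name.trim().toLowerCase().replace(/\s+/g, "+");
  return (
    "https://www.opensubtitles.org/en/search" +
    `/sublanguageid-${language}` +
    `/season-${season}` +
    `/episode-${episode}` +
    `/moviename-${movieName}` +
    "/xml"
  );
}
```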
> what percent of episodes this approach would work for?

Interesting, I see how this is useful information, and I will try to get some figures for you!
> 😏

😮
Reducing the number of clicks to contribute is important, and if you're not watching a show that has timestamps yet, it would be nice if the extension could just find and skip the intro for you.
One place we could potentially find the intro would be the text tracks/subtitles - when there's over 90s of silence, the intro could be hiding there. To increase accuracy, we'll have to use other context clues to narrow that prediction down (previous intro duration, 3rd party services, looking for the show title to mark the end, etc.).
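That silence heuristic could be sketched roughly like this - `findSilentGaps`, the cue shape, and the 90s threshold are all assumptions from the paragraph above, not a finished design:

```javascript
// Given parsed subtitle cues ({ start, end } in seconds), return the gaps
// between cues long enough to hide an intro. 90s default threshold is an
// assumption from the discussion above.
function findSilentGaps(cues, minGapSeconds = 90) {
  const sorted = [...cues].sort((a, b) => a.start - b.start);
  const gaps = [];
  for (let i = 1; i < sorted.length; i++) {
    const gap = sorted[i].start - sorted[i - 1].end;
    if (gap >= minGapSeconds) {
      gaps.push({ start: sorted[i - 1].end, end: sorted[i].start });
    }
  }
  return gaps;
}
```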
For now though, let's just add a setting titled `Experimental: Predict Intro Timestamps`. When it's toggled on, show the possible time ranges for the intro somehow, whatever is easiest (notification, text overlay, etc.).

Once we've confirmed the potential for finding intros, and we have an idea of its accuracy and how many false positives it produces, we can move on to improving the accuracy. If it doesn't seem promising, we'll scrap it.