can we add all of crunchyroll?

fe80Grau / ytdlp2STRM

A little script to serve YouTube / Twitch / Crunchyroll videos without storing them. Uses yt-dlp HTTP data through Flask and dynamic URLs. We can use these dynamic URLs to set STRM files.

https://github.com/fe80Grau/ytdlp2STRM

MIT License

209 stars 19 forks

can we add all of crunchyroll? #20

Closed pixelroll closed 10 months ago

pixelroll commented 10 months ago

Hello, I wanted to know whether we could add all of Crunchyroll at once. I'd like to add the Crunchyroll catalog to my Jellyfin. Thanks in advance.

pixelroll commented 10 months ago

I ended up making this script to list the crunchyroll catalog:

from seleniumbase import Driver
from selenium.webdriver.common.keys import Keys
import time

# Initializing the Chrome service
driver = Driver(uc=True)

# Opening the Crunchyroll page
driver.get('https://www.crunchyroll.com/fr/videos/alphabetical')

# Waiting for the page to load completely
time.sleep(3)

# Function to check if the element is visible
def is_footer_visible():
    return driver.execute_script('''
        let footer = document.querySelector('.erc-footer');
        let rect = footer.getBoundingClientRect();
        return (
            rect.bottom <= window.innerHeight &&
            rect.top >= 0
        );
    ''')

# Scroll down until the footer is reached
body = driver.find_element("tag name", "body")
links = set()

while not is_footer_visible():
    body.send_keys(Keys.PAGE_DOWN)
    time.sleep(1)  # Waiting time for loading, adjustable if needed
    new_links = driver.execute_script('''
    let links = document.querySelectorAll('a.horizontal-card-hover__link--A-RZX');
    return Array.from(links)
        .filter(link => link.href.startsWith('https://www.crunchyroll.com/fr/series/') && !link.title.includes('VOSTA'))
        .map(link => link.href);
    ''')
    links.update(new_links)

# Read existing links from the file
existing_links = []
try:
    with open('liens.txt', 'r') as file:
        existing_links = [line.strip() for line in file]
except FileNotFoundError:
    pass  # If the file doesn't exist yet, continue without existing links

# Add new retrieved links if they are not already present
for link in links:
    if link not in existing_links:
        existing_links.append(link)

# Rewrite all links in the file in the same order
with open('liens.txt', 'w') as file:
    for link in existing_links:
        file.write(link + '\n')

# Closing the browser
driver.quit()

pixelroll commented 10 months ago

now I have no idea how to make crunchyroll work... when I run python3 cli.py --media crunchyroll --params direct I get the following error: yt-dlp: error: You must provide at least one URL.

fe80Grau commented 10 months ago

Hi @pixelroll , Please check whether you have a channel_list.json file inside the Crunchyroll plugin folder. The file should look like this:

[
    "https://www.crunchyroll.com/es/series/G63K98PZ6/one-punch-man",
    "https://www.crunchyroll.com/es/series/GYEXQKJG6/dr-stone",
    "https://www.crunchyroll.com/es/series/GY5P48XEY/demon-slayer-kimetsu-no-yaiba",
    "https://www.crunchyroll.com/es/series/GXJHM3PK5/trigun-stampede",
    "https://www.crunchyroll.com/es/series/G6NQ5DWZ6/my-hero-academia",
    "https://www.crunchyroll.com/es/series/GY3VKX1MR/hunter-x-hunter",
    "https://www.crunchyroll.com/es/series/GRDV0019R/jujutsu-kaisen",
    "https://www.crunchyroll.com/es/series/GRMG8ZQZR/one-piece",
    "https://www.crunchyroll.com/es/series/GY9PJ5KWR/naruto",
    "https://www.crunchyroll.com/es/series/GYQ4MW246/naruto-shippuden",
    "https://www.crunchyroll.com/es/series/GR75Q020Y/boruto-naruto-next-generations",
    "https://www.crunchyroll.com/es/series/GRQ4QG4GY/gto---the-animation",
    "https://www.crunchyroll.com/es/series/GYX04955R/berserk",
    "https://www.crunchyroll.com/es/series/G6W4MEZ0R/radiant",
    "https://www.crunchyroll.com/es/series/GYQ4MKDZ6/gintama",
    "https://www.crunchyroll.com/es/series/GVDHX8JJE/black-summoner",
    "https://www.crunchyroll.com/es/series/G6JQVM3ER/case-closed-detective-conan",
    "https://www.crunchyroll.com/es/series/GRE50KV36/black-clover",
    "https://www.crunchyroll.com/es/series/G6W4QKX0R/the-rising-of-the-shield-hero",
    "https://www.crunchyroll.com/es/series/GEXH3W207/a-returners-magic-should-be-special",
    "https://www.crunchyroll.com/es/series/GYZJ43JMR/that-time-i-got-reincarnated-as-a-slime",
    "https://www.crunchyroll.com/es/series/G3KHEVMN1/tokyo-revengers"
]

On the other hand, your script is awesome. When I have a moment, I'll try it out and see if it's viable for integration.

pixelroll commented 10 months ago

here's my channel_list.json:

[
    "https://www.crunchyroll.com/fr/series/GRDV0019R/jujutsu-kaisen",
    "https://www.crunchyroll.com/fr/series/GRMG8ZQZR/one-piece"
]

here's my config.json:

{
    "strm_output_folder": "/FlixDji/Animes/",
    "channels_list_file": "./plugins/crunchyroll/channel_list.json",
    "crunchyroll_cookies_file": "",
    "crunchyroll_subtitle_language": "fr-FR",
    "crunchyroll_audio_language": "ja-JP",
    "proxy": "False",
    "proxy_url": "",
    "crunchyroll_auth": "browser",
    "crunchyroll_browser": "firefox",
    "crunchyroll_useragent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "crunchyroll_username": "",
    "crunchyroll_password": ""
}

I'm on Debian 11 and have installed firefox-esr. I tested with Docker and with cookies too, and got the same error. The series folders are created and last_episode.txt indicates 0. I logged on to Crunchyroll with my premium account and installed the Tab Reloader extension to reload the page every 15 min.

As for the script, all the credit goes to the seleniumbase contributors. There's also an avenue to explore with undetected chromedriver, which would surely avoid all these cookie problems (I also wonder if we can use flaresolverr to avoid Cloudflare blocking).

If you do something with the script, keep in mind that it targets the French catalog and that I've excluded everything other than series, as well as everything subtitled in English ("VOSTA").

pixelroll commented 10 months ago

I've just tested it on Windows and it works perfectly. So I thought the problem was debian, but when I tested on ubuntu I had the same problem... so linux is obviously the problem.

fe80Grau commented 10 months ago

See the latest commit. I fixed parameters in the Popen thread (in the worker class) that I forgot in previous experiments.

pixelroll commented 10 months ago

new error:

25/11/2023 09:43:27 Running crunchyroll with ['direct'] params
Preparing channel https://www.crunchyroll.com/fr/series/GRDV0019R/jujutsu-kaisen
Traceback (most recent call last):
  File "/home/paul/.local/bin/yt-dlp", line 8, in <module>
    sys.exit(main())
  File "/home/paul/.local/lib/python3.10/site-packages/yt_dlp/__init__.py", line 1008, in main
    _exit(*variadic(_real_main(argv)))
  File "/home/paul/.local/lib/python3.10/site-packages/yt_dlp/__init__.py", line 962, in _real_main
    with YoutubeDL(ydl_opts) as ydl:
  File "/home/paul/.local/lib/python3.10/site-packages/yt_dlp/YoutubeDL.py", line 678, in __init__
    self._request_director = self.build_request_director(_REQUEST_HANDLERS.values(), _RH_PREFERENCES)
  File "/home/paul/.local/lib/python3.10/site-packages/yt_dlp/YoutubeDL.py", line 4088, in build_request_director
    cookiejar=self.cookiejar,
  File "/usr/lib/python3.10/functools.py", line 981, in __get__
    val = self.func(instance)
  File "/home/paul/.local/lib/python3.10/site-packages/yt_dlp/YoutubeDL.py", line 4019, in cookiejar
    return load_cookies(
  File "/home/paul/.local/lib/python3.10/site-packages/yt_dlp/cookies.py", line 91, in load_cookies
    extract_cookies_from_browser(browser_name, profile, YDLLogger(ydl), keyring=keyring, container=container))
  File "/home/paul/.local/lib/python3.10/site-packages/yt_dlp/cookies.py", line 108, in extract_cookies_from_browser
    return _extract_firefox_cookies(profile, container, logger)
  File "/home/paul/.local/lib/python3.10/site-packages/yt_dlp/cookies.py", line 133, in _extract_firefox_cookies
    raise FileNotFoundError(f'could not find firefox cookies database in {search_root}')
FileNotFoundError: could not find firefox cookies database in /home/paul/.mozilla/firefox

pixelroll commented 10 months ago

I've managed to get it working on my debian machine but I can't access it via the other machines on the same network.

fe80Grau commented 10 months ago

For the FileNotFoundError stating 'could not find the Firefox cookies database in /home/paul/.mozilla/firefox', you can set the crunchyroll_browser variable as follows:

firefox:/path/to/your/firefox/profile
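To find a usable profile path, one option is to look for a folder that actually contains a cookies.sqlite file, since that is the database the error above complains about. The following is only a rough sketch: the helper names find_firefox_profiles and browser_setting are hypothetical, and the search root is an assumption that depends on how Firefox was installed and which user runs it.

```python
from pathlib import Path
from typing import List, Optional


def find_firefox_profiles(root: Path) -> List[Path]:
    """Return profile folders directly under `root` that hold a cookies.sqlite."""
    if not root.is_dir():
        return []
    return sorted(p.parent for p in root.glob("*/cookies.sqlite"))


def browser_setting(root: Path) -> Optional[str]:
    """Build a candidate value for crunchyroll_browser, or None if nothing is found."""
    profiles = find_firefox_profiles(root)
    return f"firefox:{profiles[0]}" if profiles else None


if __name__ == "__main__":
    # Assumed default location for a regular Linux install of Firefox.
    print(browser_setting(Path.home() / ".mozilla" / "firefox"))
```

If this prints None, the user running the script probably has no Firefox profile yet, for example because Firefox was never started under that account.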

To allow access over the local network: If the issue is not related to permissions or a firewall, you can access the web interface via http://your_local_ip_address:5000/. In the general settings, edit the ytdlp2strm_host variable to set the local IP (e.g., 192.168.1.5). This will ensure all .strm files are created using this base URL.

If you cannot access http://your_local_ip_address:5000/, it may be due to permissions or a firewall issue. Check the UFW package to manage the Debian firewall.

*A quick way to check if it's a firewall issue: View your iptables (using the command iptables -L). Alternatively, run the following command directly: sudo iptables -I INPUT 1 -p tcp --dport 5000 -j ACCEPT. This rule is temporary and will be removed upon restart. For a permanent solution, research 'persistent iptables' to learn how to fully configure this.
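Before touching the firewall, it can help to test from another machine whether the port answers at all. This is a minimal sketch using only the standard library; port_is_open is a hypothetical helper, and the host and port in the comment are just the example values from this reply.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call from another machine on the LAN (values are placeholders):
# port_is_open("192.168.1.5", 5000)
```

A False result does not prove the firewall is at fault: it looks the same when the server is not running or is only listening on localhost.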

pixelroll commented 10 months ago

after defining the path to the profile, a new error occurs:

Running crunchyroll with ['direct'] params
Preparing channel https://www.crunchyroll.com/fr/series/GRDV0019R/jujutsu-kaisen
Traceback (most recent call last):
  File "/opt/ytdlp2STRM/cli.py", line 64, in <module>
    main()
  File "/opt/ytdlp2STRM/cli.py", line 61, in main
    r = eval("{}.{}.{}".format("plugins",method,"to_strm"))(*params)
  File "/opt/ytdlp2STRM/plugins/crunchyroll/crunchyroll.py", line 227, in to_strm
    f.folders().make_clean_folder(
  File "/opt/ytdlp2STRM/clases/folders/folders.py", line 26, in make_clean_folder
    os.makedirs(folder_path, exist_ok=True)
  File "/usr/lib/python3.10/os.py", line 225, in makedirs
    mkdir(name, mode)
OSError: [Errno 5] Input/output error: '/FlixDji/Animes/jujutsu-kaisen/S01 - JUJUTSU KAISEN (Saison 1)'

with debian I was able to open the ports as you indicated. Now I have this error even though I specified in crunchyroll_browser firefox:/home/ytdlp2strm/.mozilla/firefox.

when I try to open http://ip:5005/crunchyroll/direct/fr_watch_G69XGG44R :

thank you very much for your patience...

pixelroll commented 10 months ago

it was finally solved by reinstalling firefox via root and rebooting. For debian the problem remains, but since it works on my ubuntu it's fine with me. Thank you very much for your work.

fe80Grau commented 10 months ago

It seems that your Debian direct() function for Crunchyroll is ignoring the profile path set in the crunchyroll_browser variable.

Could you try editing the file /opt/ytdlp2STRM/plugins/crunchyroll/crunchyroll.py and removing the True on line 282 of your Debian instance? I think in your case line 282 should read: Crunchyroll().set_auth(command)

This edit will eliminate the 'artificial' quotes that I had included to ensure "compatibility with Windows."

P.S. Check the last commit. I have added your code, with slight modifications, to the experiments folder to streamline it. I'm using Pandas to manage duplicates and sort the URL list, and I've added Math.floor to the JavaScript function because fractional pixel values kept the comparison from returning True in my case.

For now, the script only generates a CSV file. You can quickly add quotes and commas using a regular expression and paste the result into channel_list.json. However, I plan to further develop the script so it can read from crunchyroll/config.json and automatically regenerate channel_list.json.
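As an alternative to the regular-expression step, the URL list could be converted with Python's json module. This is a sketch under assumptions, not the project's code: it expects a plain file with one URL per line and no header row (like the liens.txt written by the original script), and build_channel_list and write_channel_list are hypothetical helper names. It uses a set and sorted() where the experiment uses Pandas.

```python
import json
from pathlib import Path
from typing import Iterable, List


def build_channel_list(lines: Iterable[str]) -> List[str]:
    """Strip whitespace, drop blank lines and duplicates, and sort the URLs."""
    urls = {line.strip() for line in lines}
    urls.discard("")
    return sorted(urls)


def write_channel_list(source: Path, target: Path) -> None:
    """Convert a one-URL-per-line file into a JSON array of strings."""
    with source.open(encoding="utf-8") as handle:
        urls = build_channel_list(handle)
    target.write_text(json.dumps(urls, indent=4) + "\n", encoding="utf-8")


# Example call (paths are placeholders):
# write_channel_list(Path("liens.txt"), Path("plugins/crunchyroll/channel_list.json"))
```

Letting json.dumps produce the quotes and commas avoids a trailing comma or a missing bracket, either of which would make the file invalid JSON.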

This code is currently under 'experiments' because I am developing this tool for headless servers, and Selenium is not the most suitable package for such environments. The ultimate goal is to capture the initial fetch requests from the Crunchyroll website (including access tokens and cookies) and retrieve the catalog using only the requests library.

How it works:

cd to the ytdlp2strm experiments folder and install the requirements if you haven't (seleniumbase and pandas)
edit experiments.py and remove the # from line 4: from experiments.pixelroll import crunchyroll_catalog
go back to the ytdlp2strm root folder and run the following command

python cli.py --media experiments.pixelroll.crunchyroll_catalog --params test

pixelroll commented 10 months ago

I removed True and now I have a new error: http://ip:5005/bin/sh:%201:%20Syntax%20error:%20%22(%22%20unexpected (URL-decoded: /bin/sh: 1: Syntax error: "(" unexpected)

As for the catalog script, it returns an empty file under Debian.

pixelroll commented 10 months ago

OK, the missing links were caused by the CSS class, which changes depending on the OS. So I changed the line "let links = document.querySelectorAll('a.horizontal-card-hover__link--A-RZX');" to "let links = document.querySelectorAll('a');"
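With the selector widened to every a element, the URL filter carries all the weight, so it may be worth doing that check in Python as well. This is a hedged sketch: is_series_link and filter_series_links are hypothetical names, and the host and the /fr/series/ prefix are taken from the script earlier in this thread.

```python
from typing import Iterable, List
from urllib.parse import urlparse


def is_series_link(href: str, locale: str = "fr") -> bool:
    """Return True for Crunchyroll series URLs of the given locale."""
    parsed = urlparse(href)
    return (
        parsed.netloc == "www.crunchyroll.com"
        and parsed.path.startswith(f"/{locale}/series/")
    )


def filter_series_links(hrefs: Iterable[str], locale: str = "fr") -> List[str]:
    """Keep only series URLs, without duplicates, in sorted order."""
    return sorted({h for h in hrefs if is_series_link(h, locale)})
```

This only checks the URL, so the title-based VOSTA exclusion still has to happen in the browser-side JavaScript, where the link titles are available.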

pixelroll commented 10 months ago

Finally, I wrote the options directly in direct, without going through Crunchyroll().set_auth(command): '--cookies-from-browser', '"firefox:/home/ytdlp2strm/.mozilla/firefox"', '--user-agent', '"Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"'
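The /bin/sh syntax error reported earlier is consistent with the parentheses of the user agent reaching a shell unquoted, which would also explain why manual quotes help on Linux. The sketch below shows a way to sidestep manual quoting; it is not how the plugin builds its command, and auth_args is a hypothetical helper. The two option names are real yt-dlp flags, and the values are the ones from this thread.

```python
import shlex
from typing import List

USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"
PROFILE = "firefox:/home/ytdlp2strm/.mozilla/firefox"


def auth_args(browser: str, user_agent: str) -> List[str]:
    """Return the yt-dlp authentication options as separate argv items."""
    return ["--cookies-from-browser", browser, "--user-agent", user_agent]


# Passed as a list to subprocess (no shell involved), the parentheses and
# spaces in the user agent need no quoting at all.
argv = ["yt-dlp", *auth_args(PROFILE, USER_AGENT)]

# If a single shell string is unavoidable, shlex.join quotes each item so
# that a POSIX shell splits it back into exactly the same arguments.
shell_command = shlex.join(argv)
```

shlex.join follows POSIX shell rules, so this variant would not apply to the Windows command interpreter.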

fe80Grau commented 10 months ago

You can commit the change and I'll approve it (if you want to appear on the contributors list).

pixelroll commented 10 months ago

I can't wait to see what you do with my script, good luck. I also saw that ADN (Animation Digital Network, listed as animationdigitalnetwork) is in the list of sites supported by yt-dlp. In France it is the second anime streaming platform after Crunchyroll, so if you have the time, it might be worth a look. I'll probably look into it myself, but I'm not very good at coding.