Closed: pixelroll closed this issue 10 months ago
I ended up making this script to list the crunchyroll catalog:
from seleniumbase import Driver
from selenium.webdriver.common.keys import Keys
import time

# Initializing the Chrome driver (undetected-chromedriver mode)
driver = Driver(uc=True)

# Opening the Crunchyroll page
driver.get('https://www.crunchyroll.com/fr/videos/alphabetical')

# Waiting for the page to load completely
time.sleep(3)

# Function to check if the footer is fully visible in the viewport
def is_footer_visible():
    return driver.execute_script('''
        let footer = document.querySelector('.erc-footer');
        let rect = footer.getBoundingClientRect();
        return (
            rect.bottom <= window.innerHeight &&
            rect.top >= 0
        );
    ''')

# Scroll down until the footer is reached
body = driver.find_element("tag name", "body")
links = set()
while not is_footer_visible():
    body.send_keys(Keys.PAGE_DOWN)
    time.sleep(1)  # Waiting time for loading, adjustable if needed
    new_links = driver.execute_script('''
        let links = document.querySelectorAll('a.horizontal-card-hover__link--A-RZX');
        return Array.from(links)
            .filter(link => link.href.startsWith('https://www.crunchyroll.com/fr/series/') && !link.title.includes('VOSTA'))
            .map(link => link.href);
    ''')
    links.update(new_links)

# Read existing links from the file
existing_links = []
try:
    with open('liens.txt', 'r') as file:
        existing_links = [line.strip() for line in file]
except FileNotFoundError:
    pass  # If the file doesn't exist yet, continue without existing links

# Add new retrieved links if they are not already present
for link in links:
    if link not in existing_links:
        existing_links.append(link)

# Rewrite all links in the file in the same order
with open('liens.txt', 'w') as file:
    for link in existing_links:
        file.write(link + '\n')

# Closing the browser
driver.quit()
Now I have no idea how to make Crunchyroll work... When I run python3 cli.py --media crunchyroll --params direct, I get the following error: yt-dlp: error: You must provide at least one URL.
Hi @pixelroll, please check whether you have a channel_list.json file inside the Crunchyroll plugin folder. The file should look like this:
[
"https://www.crunchyroll.com/es/series/G63K98PZ6/one-punch-man",
"https://www.crunchyroll.com/es/series/GYEXQKJG6/dr-stone",
"https://www.crunchyroll.com/es/series/GY5P48XEY/demon-slayer-kimetsu-no-yaiba",
"https://www.crunchyroll.com/es/series/GXJHM3PK5/trigun-stampede",
"https://www.crunchyroll.com/es/series/G6NQ5DWZ6/my-hero-academia",
"https://www.crunchyroll.com/es/series/GY3VKX1MR/hunter-x-hunter",
"https://www.crunchyroll.com/es/series/GRDV0019R/jujutsu-kaisen",
"https://www.crunchyroll.com/es/series/GRMG8ZQZR/one-piece",
"https://www.crunchyroll.com/es/series/GY9PJ5KWR/naruto",
"https://www.crunchyroll.com/es/series/GYQ4MW246/naruto-shippuden",
"https://www.crunchyroll.com/es/series/GR75Q020Y/boruto-naruto-next-generations",
"https://www.crunchyroll.com/es/series/GRQ4QG4GY/gto---the-animation",
"https://www.crunchyroll.com/es/series/GYX04955R/berserk",
"https://www.crunchyroll.com/es/series/G6W4MEZ0R/radiant",
"https://www.crunchyroll.com/es/series/GYQ4MKDZ6/gintama",
"https://www.crunchyroll.com/es/series/GVDHX8JJE/black-summoner",
"https://www.crunchyroll.com/es/series/G6JQVM3ER/case-closed-detective-conan",
"https://www.crunchyroll.com/es/series/GRE50KV36/black-clover",
"https://www.crunchyroll.com/es/series/G6W4QKX0R/the-rising-of-the-shield-hero",
"https://www.crunchyroll.com/es/series/GEXH3W207/a-returners-magic-should-be-special",
"https://www.crunchyroll.com/es/series/GYZJ43JMR/that-time-i-got-reincarnated-as-a-slime",
"https://www.crunchyroll.com/es/series/G3KHEVMN1/tokyo-revengers"
]
On the other hand, your script is awesome. When I have a moment, I'll try it out and see if it's viable for integration.
Here's my channel_list.json:
[
"https://www.crunchyroll.com/fr/series/GRDV0019R/jujutsu-kaisen",
"https://www.crunchyroll.com/fr/series/GRMG8ZQZR/one-piece"
]
Here's my config.json:
{
"strm_output_folder": "/FlixDji/Animes/",
"channels_list_file": "./plugins/crunchyroll/channel_list.json",
"crunchyroll_cookies_file": "",
"crunchyroll_subtitle_language": "fr-FR",
"crunchyroll_audio_language": "ja-JP",
"proxy": "False",
"proxy_url": "",
"crunchyroll_auth": "browser",
"crunchyroll_browser": "firefox",
"crunchyroll_useragent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
"crunchyroll_username": "",
"crunchyroll_password": ""
}
I'm on Debian 11 and have installed firefox-esr. I tested with Docker and with cookies too, and I get the same error. Series folders are created and last_episode.txt indicates 0. I logged on to Crunchyroll with my premium account and installed the Tab Reloader extension to reload the page every 15 min.
As for the script, all the credit goes to the seleniumbase contributors. There's an avenue to explore with undetected chromedriver, which would surely avoid all these cookie problems (I also wonder if we can use FlareSolverr to avoid Cloudflare blocking).
If you do something with the script, keep in mind that the catalog is the French one and that I've excluded everything other than series, as well as everything subtitled in English ("VOSTA").
I've just tested it on Windows and it works perfectly. So I thought the problem was Debian, but when I tested on Ubuntu I had the same problem... so Linux is obviously the problem.
See the latest commit. I fixed parameters in the Popen thread (in the worker class) that I forgot in previous experiments.
New error:
25/11/2023 09:43:27
Running crunchyroll with ['direct'] params
Preparing channel https://www.crunchyroll.com/fr/series/GRDV0019R/jujutsu-kaisen
Traceback (most recent call last):
File "/home/paul/.local/bin/yt-dlp", line 8, in
I've managed to get it working on my Debian machine, but I can't access it from the other machines on the same network.
For the FileNotFoundError stating 'could not find the Firefox cookies database in /home/paul/.mozilla/firefox', you can set the crunchyroll_browser variable as follows:
firefox:/path/to/your/firefox/profile
To allow access over the local network: If the issue is not related to permissions or a firewall, you can access the web interface via http://your_local_ip_address:5000/. In the general settings, edit the ytdlp2strm_host variable to set the local IP (e.g., 192.168.1.5). This will ensure all .strm files are created using this base URL.
If you cannot access http://your_local_ip_address:5000/, it may be due to permissions or a firewall issue. Check the UFW package to manage the Debian firewall.
A quick way to check if it's a firewall issue: view your iptables rules (using the command iptables -L). Alternatively, run the following command directly: sudo iptables -I INPUT 1 -p tcp --dport 5000 -j ACCEPT. This rule is temporary and will be removed upon restart. For a permanent solution, research 'persistent iptables' to learn how to fully configure this.
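Before touching the firewall, it can help to confirm from another machine whether the port is reachable at all. The snippet below is only a minimal sketch of such a check using the standard socket module; the function name is_port_open is hypothetical, and the host and port in the usage note are simply the example values mentioned above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection performs the TCP handshake; closing it right away is enough
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False
```

Run from a second machine on the network, is_port_open("192.168.1.5", 5000) returning False would point towards a firewall or binding problem rather than a problem in the .strm files.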
After defining the path to the profile, a new error occurs:
Running crunchyroll with ['direct'] params
Preparing channel https://www.crunchyroll.com/fr/series/GRDV0019R/jujutsu-kaisen
Traceback (most recent call last):
File "/opt/ytdlp2STRM/cli.py", line 64, in
On Debian I was able to open the ports as you indicated. Now I have this error even though I specified firefox:/home/ytdlp2strm/.mozilla/firefox in crunchyroll_browser.
When I try to open http://ip:5005/crunchyroll/direct/fr_watch_G69XGG44R :
Thank you very much for your patience...
It was finally solved by reinstalling Firefox as root and rebooting. On Debian the problem remains, but since it works on my Ubuntu machine, that's fine with me. Thank you very much for your work.
It seems that your Debian direct() function for Crunchyroll is ignoring the profile path set in the crunchyroll_browser variable.
Could you try editing the file /opt/ytdlp2strm/plugins/crunchyroll.py and remove the True on line 282 of your Debian instance? I think in your case the line 282 should read:
Crunchyroll().set_auth(command)
This edit will eliminate the 'artificial' quotes that I had included to ensure "compatibility with Windows."
P.S. Check the last commit. I have added your code, with slight modifications, to the experiments folder to streamline it. I'm using Pandas to manage duplicates and sort the URL list, and I've added Math.floor to the JavaScript function to correct an issue with tiny fractional values that kept it from returning True in my case.
For now, the script only generates a CSV file. You can quickly add quotes and commas using a regular expression and paste the result into channel_list.json. However, I plan to further develop the script so it can read from crunchyroll/config.json and automatically regenerate channel_list.json.
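As an alternative to the regular-expression step, the conversion could be done with Python's json module. This is only a rough sketch under assumptions: it supposes the generated file holds one URL per line (a header row or stray quotes and commas are skipped or stripped), and the function name links_to_channel_list and the file paths are hypothetical, not part of ytdlp2strm.

```python
import json
from pathlib import Path


def links_to_channel_list(links_path: str, json_path: str) -> list[str]:
    """Read one URL per line, drop blanks and duplicates (order kept), write a JSON array."""
    seen = {}
    for line in Path(links_path).read_text(encoding="utf-8").splitlines():
        # Strip whitespace plus any surrounding quotes or trailing commas
        url = line.strip().strip('",')
        # Anything that is not a URL (empty line, CSV header) is ignored
        if url.startswith("http"):
            seen.setdefault(url, None)
    urls = list(seen)
    # json.dumps takes care of the quotes and commas
    Path(json_path).write_text(json.dumps(urls, indent=2) + "\n", encoding="utf-8")
    return urls
```

The resulting file has the same shape as the channel_list.json example earlier in the thread: a JSON array of series URLs.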
This code is currently under 'experiments' because I am developing this tool for headless servers, and Selenium is not the most suitable package for such environments. The ultimate goal is to capture the initial fetch requests from the Crunchyroll website (including access tokens and cookies) and retrieve the catalog using only the requests library.
How it works:
- cd to the ytdlp2strm experiments folder and install the requirements if you haven't already (seleniumbase and pandas)
- edit experiments.py and remove the # from line 4: from experiments.pixelroll import crunchyroll_catalog
- go back to the ytdlp2strm root folder and run the following command
python cli.py --media experiments.pixelroll.crunchyroll_catalog --params test
I removed True and now I have a new error: http://ip:5005/bin/sh:%201:%20Syntax%20error:%20%22(%22%20unexpected (URL-decoded: /bin/sh: 1: Syntax error: "(" unexpected)
As for the catalog script, it returns an empty file under Debian.
OK, the links were not found because the class name changes depending on the OS, so I changed the line "let links = document.querySelectorAll('a.horizontal-card-hover__link--A-RZX');" to "let links = document.querySelectorAll('a');"
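Selecting every anchor works here because the href prefix filter already in the script does the real selection. The same filter can be mirrored on the Python side, which makes it easy to test without a browser. This is just an illustrative sketch: filter_series_links and its parameters are hypothetical names, and it assumes the anchors have already been collected as (href, title) pairs.

```python
SERIES_PREFIX = "https://www.crunchyroll.com/fr/series/"


def filter_series_links(anchors, prefix=SERIES_PREFIX, excluded="VOSTA"):
    """Keep hrefs that start with the series prefix and whose title lacks the excluded tag."""
    kept = []
    for href, title in anchors:
        # Same two conditions as the JavaScript filter, plus de-duplication
        if href.startswith(prefix) and excluded not in title and href not in kept:
            kept.append(href)
    return kept
```

Because the selection depends only on the URL shape and the title, it keeps working when the generated CSS class suffix differs from one environment to another.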
In the end, I wrote the arguments directly in direct, without going through Crunchyroll().set_auth(command): '--cookies-from-browser', '"firefox:/home/ytdlp2strm/.mozilla/firefox"', '--user-agent', '"Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"'
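The /bin/sh syntax error on "(" reported above is what a shell typically prints when the parentheses of the user agent reach it unquoted. The following is only a sketch of one way to avoid hand-written quotes, and it is an assumption about the cause, not a description of how ytdlp2strm actually builds its command. The helper name build_auth_args is hypothetical; --cookies-from-browser and --user-agent are the yt-dlp options already used in this thread.

```python
import shlex


def build_auth_args(browser_spec: str, user_agent: str) -> list[str]:
    """Return the yt-dlp auth options as separate argv items, with no embedded quotes."""
    return ["--cookies-from-browser", browser_spec, "--user-agent", user_agent]


args = [
    "yt-dlp",
    *build_auth_args(
        "firefox:/home/ytdlp2strm/.mozilla/firefox",
        "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    ),
    "https://www.crunchyroll.com/fr/series/GRDV0019R/jujutsu-kaisen",
]

# Passed as a list without shell=True, each item reaches yt-dlp verbatim.
# If a single shell string is needed instead, shlex.join adds the quoting.
shell_command = shlex.join(args)
```

With a list passed to subprocess (no shell), the same code behaves the same way on Windows and Linux, which may remove the need for platform-specific quoting.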
You can commit the change and I'll approve it (if you want to appear on the contributors list).
I can't wait to see what you do with my script, good luck. I also saw that ADN (Animation Digital Network, animationdigitalnetwork) is in the list of sites supported by yt-dlp. In France it is the second-largest anime streaming platform after Crunchyroll, so it might be worth a look if you have the time. I'll probably look into it myself, but I'm not very good at coding.
Hello, I wanted to know if we could add all of Crunchyroll at once? I'd like to add the Crunchyroll catalog to my Jellyfin. Thanks in advance.