Open keybounce opened 7 years ago
@keybounce cartoonnetwork falls under https://github.com/rg3/youtube-dl/blob/master/README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free. They use DMCA, so even though it could be supported, it may be copyright infringement with Turner Sports and Entertainment Digital Network, as mentioned here: http://www.cartoonnetworkindia.com/trademark (CN India); yours may be slightly different geographically.
CartoonNetwork is the official site for the network. It is not a license-breaking "for free" place.
I can download individual episodes no problem. This is a request for playlist support, so I can just have a script fetch the shows that I want, rather than having to copy/paste each episode URL from my browser into a file first.
If you can download the videos using youtube-dl then I can partially help you. I can provide you the embedding python script; it is just an extension of the normal embedding code in youtube-dl. The playlist request needs a lot of work.
Sadly, even after looking over that github, I cannot really figure out how to use it. Other than as an example of calling youtube-dl from another program.
ok, instead save all the videos you need in a text file, then run youtube-dl as "youtube-dl -a [text_file_name]"
Sadly, even after looking over that github, I cannot really figure out how to use it. Other than as an example of calling youtube-dl from another program.
okay, if you can't figure it out, that's fine. It's merely a python program to embed youtube-dl as provided here: https://github.com/rg3/youtube-dl#embedding-youtube-dl, just expanding its capabilities to meet my needs.
ok, instead save all the videos you need in a text file, then run youtube-dl as "youtube-dl -a [text_file_name]"
Yes, but that is what I have to do, and why I'm asking for playlist support.
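For concreteness, the -a batch-file workflow suggested above can be sketched like this (the two episode URLs below are made-up placeholders, not real pages):

```shell
# Build a batch file of episode URLs, one per line
# (these two URLs are invented placeholders).
cat > episodes.txt <<'EOF'
http://www.cartoonnetwork.com/video/justice-league-action/episodes/episode-1.html
http://www.cartoonnetwork.com/video/justice-league-action/episodes/episode-2.html
EOF

# Hand the whole file to youtube-dl in one run
# (commented out here because it needs network access):
# youtube-dl -a episodes.txt

wc -l < episodes.txt
```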
To help whoever does decide to do playlist support: The episode pages have usually one, sometimes two, giant json blocks. The key you are looking for is "seoFriendly".
What I used last night for partial automation is this:
cat showUrls | while read url ; do curl -s "$url" | grep seoFri >> episodes; done
(showUrls is a list of URLs like http://www.cartoonnetwork.com/video/justice-league-action/episodes/index.html)
split -l 1 episodes (Put each json block in a separate file).
vim x??
Then, the following commands. NB: I don't know how to use 'sed' to insert a newline, or I'd have this whole thing in a shell script.
:s/","/",^M"/g -- break the json into lines (the ^M is a literal carriage return, typed as Ctrl-V Enter)
:1,$!grep seo -- filter out the key/values that we want
:g,^.*/vid,s,,http://www.cartoonnetwork.com/vid -- remove the "key" and fix up the URL
:g/"},{.*/s/// -- remove the end of line junk
:$s,"}];,, -- special end of line junk for the last line.
:wn -- next file
This gives me one playlist file per show, which I then shove into a per-show directory, and run youtube-dl on.
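The split -l 1 step above can be checked in isolation; the two input lines here are just dummy stand-ins for the grepped json blocks:

```shell
# split -l 1 writes each input line to its own file,
# named xaa, xab, ... by default (hence "vim x??" above).
printf 'first json block\nsecond json block\n' > episodes
split -l 1 episodes
cat xaa
cat xab
```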
@keybounce playlist support is a lengthy process, as the extractor needs to be updated and a crawler needs to be added, which is itself an extensive process.
For now use the process you just mentioned
As far as I can see, a simple fix would be to parse the content of the url the way cat does (extractor), put all the urls into a list (extractor), and do the necessary regex substitutions as you mentioned.
What I provided was just a quick alternative solution to your problem. You can still repurpose the script to your liking. If the problem relates to my script, open an issue there; for everything else, post here.
I don't know how to use 'sed' to insert a newline
sed ':a;N;$!ba;s/\n/ /g' file
is your friend
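For reference, that one-liner is the well-known join-all-lines idiom (GNU sed); it replaces every newline with a space:

```shell
# :a labels the cycle, N appends the next input line to the
# pattern space, $!ba loops back until the last line, and
# s/\n/ /g then replaces the accumulated newlines with spaces.
printf 'one\ntwo\nthree\n' | sed ':a;N;$!ba;s/\n/ /g'
# → one two three
```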
@keybounce
Sadly, even after looking over that github, I cannot really figure out how to use it. Other than as an example of calling youtube-dl from another program.
can you point out what you were not able to figure out there, and how else I can reach you other than this issue (https://github.com/rg3/youtube-dl/issues/13578), so that I can prepare it accordingly.
How to reach me: keybounce@gmail.com
... I went back over this thread to find the link to your github, and it's now gone :-)
I have read the official "how to embed youtube-dl in another python program" documentation, and it makes enough sense. I just don't know python (I've never programmed in it, but general reading of code is general reading of code).
... and that sed statement ... if I'm reading it correctly (not sure that I am), it says:
Still, reading the sed page for the ... I've lost track of how many times, I see this now (in the s command):
A line can be split by substituting a newline character into it. To specify a newline character in the replacement string, precede it with a backslash.
... so a \
@keybounce https://github.com/siddht4/youtube_dl_embed/ is the link. Anyway, as you have read the official embedding documents too, you have got the idea.
\n
is actually meant to be a newline character, as in most programming languages too; sort of a replace function going on there. As per your previous query you had to insert a newline, and now you need to remove it. So I can safely say: sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g' file
Necessary documentation here: https://stackoverflow.com/questions/1251999/how-can-i-replace-a-newline-n-using-sed
Reading this topic again made me realize we have diverged too much, so if you succeed, do the necessary regex and open a pull request.
Ahh. No, I need to insert (many) newlines -- I need to take one super long line and break it up into multiple short lines. That was what I did not know how to do in sed before, but it looks like the answer is to substitute a \n in the middle.
Of course, the docs explaining that are split into four different locations in my sed man page:
- sed functions: "To embed a newline in the text, precede it with a backslash."
- sed regular expressions: "You cannot, however, use a literal newline character in an address or in the substitute command."
- the "s" command: "A line can be split by substituting a newline character into it. To specify a newline character in the replacement string, precede it with a backslash."
- the "y" command: "a backslash followed by an ``n'' is replaced by a newline character."
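A minimal demonstration of that rule — splitting one line into several by substituting a backslash-escaped literal newline in the replacement (this form is POSIX and works in both BSD and GNU sed):

```shell
# Replace each comma with a newline; the replacement text is a
# backslash followed by a literal newline, per the s-command docs.
printf 'a,b,c\n' | sed 's/,/\
/g'
```

The output is the three fields on three separate lines.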
So here is my working cartoon network playlist extraction.
```shell
#!/bin/bash
# Take a (single) cartoon network url as argument.
# Output a list of episode URLs (the active version below also outputs clips).
url="$@" # Should only be $1, but in case they change url formats ...
# echo :"$url": >&2

# NB: each sed replacement below contains an embedded (backslash-escaped) newline.

## This version only gets the episodes:
# curl -s -S "$url" | sed -n -e '/getFullEpisodes/,/return/ s/","/",\
# "/gp' | grep seoFriendly | sed -e 's,^.*/vid,http://www.cartoonnetwork.com/vid,' -e 's/"}.*//'

## This version gets the episodes and the clips:
curl -s -S "$url" | grep seoFriendly | sed -n -e 's/","/",\
"/gp' | grep seoFriendly | sed -e 's,^.*/vid,http://www.cartoonnetwork.com/vid,' -e 's/"}.*//'
```
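That grep/sed pipeline can be exercised offline on a fabricated json fragment (the sample below is invented and much smaller than a real page, but it has the same seoFriendlyUrl shape the script relies on):

```shell
# A made-up stand-in for the page's giant json block.
sample='[{"id":"1","title":"A","seoFriendlyUrl":"/video/show/episodes/ep1.html"},{"id":"2","title":"B","seoFriendlyUrl":"/video/show/episodes/ep2.html"}];'

# Same pipeline as the script, minus the curl fetch:
# split at "," boundaries, keep the seoFriendly lines,
# rewrite the key into a full URL, drop the trailing junk.
printf '%s\n' "$sample" | sed -n -e 's/","/",\
"/gp' | grep seoFriendly | sed -e 's,^.*/vid,http://www.cartoonnetwork.com/vid,' -e 's/"}.*//'
```

This prints one full episode URL per line, which is exactly the playlist format that youtube-dl -a consumes.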
okay, the script looks okay to me; it may need some changes. Cheers.
Please follow the guide below:
- Put an x into all the boxes [ ] relevant to your issue (like that [x])
- Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2017.07.02. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

Before submitting an issue make sure you have:
What is the purpose of your issue?
This is a request for series playlist support.
Urls of the form:
http://www.cartoonnetwork.com/video/teen-titans-go/episodes/index.html
http://www.cartoonnetwork.com/video/teen-titans-go/episodes/season-4.html
http://www.cartoonnetwork.com/video/ben-10/episodes/season-1.html
http://www.cartoonnetwork.com/video/ben-10/index.html (Yep, that one has 4 episodes not on the current season page, go figure, and the numbers indicate that they missed a lot :-)
http://www.cartoonnetwork.com/video/nexo-knights/episodes/index.html
http://www.cartoonnetwork.com/video/nexo-knights/episodes/season-3.html (NB: I didn't even know that there was a 3rd season)
Etc.