jeckman / YouTube-Downloader

PHP script for downloading videos from youtube; also parsing youtube feed into RSS enclosures for podcatchers
GNU General Public License v2.0

Signature problem with some videos #9

Closed Animis09 closed 9 years ago

Animis09 commented 11 years ago

The download is forbidden for some videos because of the signature.

Example: http://www.youtube.com/watch?v=ghQvZ9IID2A

jeckman commented 11 years ago

Confirmed - doesn't download either directly or through the proxy. Not sure what the difference is between this and other videos which work

Animis09 commented 11 years ago

The download doesn't work when the video has the attribute use_cipher_signature=true
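A minimal sketch of that check, assuming the get_video_info response is a URL-encoded query string (which is how it is parsed elsewhere in this thread). The helper name uses_cipher_signature and the sample responses are hypothetical, made up for illustration.

```php
<?php
// Hypothetical helper: reports whether a get_video_info-style response
// flags the video as needing a deciphered signature.
function uses_cipher_signature(string $videoInfo): bool
{
    // The response is assumed to be a URL-encoded query string.
    parse_str($videoInfo, $info);

    return isset($info['use_cipher_signature'])
        && is_string($info['use_cipher_signature'])
        && strtolower($info['use_cipher_signature']) === 'true';
}

// Made-up sample responses, for illustration only.
$plain    = 'status=ok&use_cipher_signature=false&title=Example';
$ciphered = 'status=ok&use_cipher_signature=true&title=Example';

var_dump(uses_cipher_signature($plain));    // bool(false)
var_dump(uses_cipher_signature($ciphered)); // bool(true)
```

The comparison is case-insensitive because the exact casing of the flag is not confirmed in this thread.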

jeckman commented 11 years ago

Can you remove the attribute use_cipher_signature or set it to false, or will that just render the token invalid?

jeckman commented 11 years ago

Interesting, there is a DecryptYouTubeCypher function here http://www.codingforums.com/showthread.php?p=1342470 which might be of interest

Animis09 commented 11 years ago

It's generated by the page get_video_info?video_id= and it can't be modified. In fact, if this variable is set to true then the value of the signature must be recalculated. Moreover, the parameter is no longer $sig but $s .

// Code
$avail_formats[$i]['url'] = urldecode($url) . '&signature=' . (isset($sig) ? $sig : decrypt($s));

Animis09 commented 11 years ago

Thank you but I have already tried this function and it doesn't work.

jeckman commented 11 years ago

Have you tried http://rg3.github.com/youtube-dl/ to see if they have solved it? Might be able to port over their solution if they handle these videos better

jeckman commented 11 years ago

Not certain it is the same issue but see https://github.com/rg3/youtube-dl/issues/897 looks like they have encountered this

JanetGreen commented 11 years ago

I had some issues with it as well, and not being able to access stuff. I found a decent alternative with the torch browser though. It has some kind of media grabber built in, and it has worked on all the streaming video sites I tried.

ghost commented 10 years ago

Hello, I've been working with this source closely, and here is what I've found.

Downloading fails for videos with copyrighted content: Vevo videos, or even videos with copyrighted audio soundtracks. To get a step closer, add this parameter after the video ID: &el=vevo (for Vevo, obviously). However, the gathered links are then missing &signature=, so the server refuses to serve the file (403).

We need a less complicated way to find the new signature. The attribute use_cipher_signature=true marks a video that is not a classic file and needs the special &sig.

barthmania commented 10 years ago

Hi,

Yes there is a problem for the Vevo videos, the signature isn't the same, they are using a JS file to modify it...

Example for this video : https://www.youtube.com/watch?v=6Cp6mKbRTQY

The JS is : http://s.ytimg.com/yts/jsbin/html5player-ima-vflrGwWV9.js Look at the bz() function.

Last week I was able to download these videos, but they have modified it and I don't know how to proceed now.

Thanks in advance.

jeckman commented 10 years ago

See my comment above and link to https://github.com/rg3/youtube-dl/issues/897

The http://rg3.github.com/youtube-dl/ looks to have encountered this and found a workaround - anyone have time to take what they have learned on that project and apply it here?

ghost commented 10 years ago

Confirmed, I would like to point out that youtube-dl works 100% with Vevo, I'll try to convert the patch to PHP.


Wow, youtube-dl is an amazing source, it's very flexible and has many options..

bitnol commented 10 years ago

I have used youtube-dl and youtube-downloader both. After some analysis I found the actual algo for decrypting the ciphered signature.

Check this link: http://stackoverflow.com/questions/21510857/best-approach-to-decode-youtube-cipher-signature-using-php-or-js/21700294#21700294

wilsonschu commented 10 years ago

I found this one much clearer and easier: https://github.com/soimort/you-get/blob/develop/src/you_get/extractor/youtube.py (from https://github.com/soimort/you-get)

Here are the basic steps:

function WD(a){a=a.split("");a=a.reverse();a=a.slice(1);a=a.reverse();a=a.slice(3);a=XD(a,19);a=a.reverse();a=XD(a,35);a=XD(a,61);a=a.slice(2);return a.join("")}
function XD(a,b){var c=a[0];a[0]=a[b%a.length];a[b]=c;return a}
s=558D58CA687B9BD6B4C7E22FDB1C07395679C758489.0203EF29F3E3BD1E7A32567C9F02FE09035C837676
sig=738C53090EF20F9C76523A7E1DB3E3F927E3020.984857C97659370C1BDFD2E7C4B6DB9B786AC852
function WD($a){$a=str_split($a);$a=array_reverse($a);$a=array_slice($a,1);$a=array_reverse($a);$a=array_slice($a,3);$a=XD($a,19);$a=array_reverse($a);$a=XD($a,35);$a=XD($a,61);$a=array_slice($a,2);return implode($a);}
function XD($a,$b){$c=$a[0];$a[0]=$a[$b%count($a)];$a[$b]=$c;return $a;}
function js2php($f) {
  $f = preg_replace('/\$/', '_', $f);
  $f = preg_replace('/\}/', ';}', $f);
  $f = preg_replace('/var\s+/', '', $f);
  $f = preg_replace('/(\w+).join\(""\)/', 'implode(${1})', $f);
  $f = preg_replace('/(\w+).length/', 'count(${1})', $f);
  $f = preg_replace('/(\w+).reverse\(\)/', 'array_reverse(${1})', $f);
  $f = preg_replace('/(\w+).slice\((\d+)\)/', 'array_slice(\$${1},${2})', $f);
  $f = preg_replace('/(\w+).split\(""\)/', 'str_split(${1})', $f);
  $f = preg_replace('/\((\w+)\)/', '(\$${1})', $f);
  $f = preg_replace('/\[(\w+)/', '[\$${1}', $f);
  $f = preg_replace('/\((\w+,\d+)\)/', '(\$${1})', $f);
  $f = preg_replace('/\((\w+),(\w+)\)/', '(\$${1},\$${2})', $f);
  $f = preg_replace('/(\w+)([=\[;])/', '\$${1}${2}', $f);
  $f = preg_replace('/\$(\d+)/', '${1}', $f);
  #echo $f . "\n";
  return $f;
}
$f1='function WD(a){a=a.split("");a=a.reverse();a=a.slice(1);a=a.reverse();a=a.slice(3);a=XD(a,19);a=a.reverse();a=XD(a,35);a=XD(a,61);a=a.slice(2);return a.join("")}';
$f2='function XD(a,b){var c=a[0];a[0]=a[b%a.length];a[b]=c;return a}';
$s='558D58CA687B9BD6B4C7E22FDB1C07395679C758489.0203EF29F3E3BD1E7A32567C9F02FE09035C837676';
$code = '$a= "' . $s . '";' . js2php($f1) . js2php($f2) . '$sig=WD($a); return $sig;';
$signature = eval($code);
echo 'deciphered: ' . $signature;
def WD(a):
  a=list(a);a=a[::-1];a=a[1:];a=a[::-1];a=a[3:];a=XD(a,19);a=a[::-1];a=XD(a,35);a=XD(a,61);a=a[2:];return "".join(a)
global XD
def XD(a,b):
  c=a[0];a[0]=a[b%len(a)];a[b]=c;return a
sig=WD(s)

wilsonschu commented 10 years ago

Just found out that youtube changed the function today. In js console:

function fE(a){a=a.split("");a=a.slice(2);a=a.reverse();a=gE(a,39);a=gE(a,43);return a.join("")}
function gE(a,b){var c=a[0];a[0]=a[b%a.length];a[b]=c;return a}
//test a ciphered signature:
fE('28282AFC11ADF6D6D2C9CA541913FE0D50E359538C.5B3EF2AD526CBCEB7C973C092A6632738707A330')
//deciphered result:
"333A7078372366A290C379C7BECBC625DA2FE3B0.C855953E05D0EF319145AC9C2D6D6FDA11CFA282"

but the above php js2php routine still works.
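For comparison with the js2php route, here is a hand-written PHP port of the two JavaScript functions above, checked against the sample signature from the console session. One assumption is labeled in the code: the swap wraps the index for both positions, whereas the JavaScript writes a[b] without the modulo. The two only differ when b is not smaller than the array length.

```php
<?php
// Port of gE(a, b): swap the first element with the one at index b.
// Assumption: the index is wrapped (b % length) for both the read and the
// write; the original JavaScript only wraps the read.
function gE(array $a, int $b): array
{
    $i = $b % count($a);
    $c = $a[0];
    $a[0] = $a[$i];
    $a[$i] = $c;
    return $a;
}

// Port of fE(a): drop the first two characters, reverse, then two swaps.
function fE(string $s): string
{
    $a = str_split($s);
    $a = array_slice($a, 2);
    $a = array_reverse($a);
    $a = gE($a, 39);
    $a = gE($a, 43);
    return implode('', $a);
}

echo fE('28282AFC11ADF6D6D2C9CA541913FE0D50E359538C.5B3EF2AD526CBCEB7C973C092A6632738707A330'), "\n";
// 333A7078372366A290C379C7BECBC625DA2FE3B0.C855953E05D0EF319145AC9C2D6D6FDA11CFA282
```

A hand port like this avoids eval entirely, but it has to be redone every time the player script changes.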

simplyi commented 10 years ago

@wilsonschu thank you very much for this very detailed description. I tried your approach but js2php code fails on line $f = preg_replace('/[(\w+)/', '[\$${1}', $f);

Do you know how to fix it?

If I use only the JavaScript, for example fE('28282AFC11ADF6D6D2C9CA541913FE0D50E359538C.5B3EF2AD526CBCEB7C973C092A6632738707A330'), and display the new signature with a JavaScript alert, then it works: the deciphered signature is displayed. But the video still does not play :(

simplyi commented 10 years ago

@wilsonschu do I add the deciphered signature to a URL from url_encoded_fmt_stream_map? Do I need to urldecode any of the parameters? I get the signature and replace s= with signature=, but the video still does not play. I checked the functions from the html5player JavaScript and they are correct. They are exactly the same as in your example:

function fE(a){a=a.split("");a=a.slice(2);a=a.reverse();a=gE(a,39);a=gE(a,43);return a.join("")}
function gE(a,b){var c=a[0];a[0]=a[b%a.length];a[b]=c;return a}

wilsonschu commented 10 years ago

Somehow the website display removes some "\"s. I changed the tag to ``` it's now showing the correct code:

function js2php($f) {
  $f = preg_replace('/\$/', '_', $f);
  $f = preg_replace('/\}/', ';}', $f);
  $f = preg_replace('/var\s+/', '', $f);
  $f = preg_replace('/(\w+).join\(""\)/', 'implode(${1})', $f);
  $f = preg_replace('/(\w+).length/', 'count(${1})', $f);
  $f = preg_replace('/(\w+).reverse\(\)/', 'array_reverse(${1})', $f);
  $f = preg_replace('/(\w+).slice\((\d+)\)/', 'array_slice(\$${1},${2})', $f);
  $f = preg_replace('/(\w+).split\(""\)/', 'str_split(${1})', $f);
  $f = preg_replace('/\((\w+)\)/', '(\$${1})', $f);
  $f = preg_replace('/\[(\w+)/', '[\$${1}', $f);
  $f = preg_replace('/\((\w+,\s*\d+)\)/', '(\$${1})', $f);
  $f = preg_replace('/\((\w+),\s*(\w+)\)/', '(\$${1},\$${2})', $f);
  $f = preg_replace('/(\w+)([=\[;])/', '\$${1}${2}', $f);
  $f = preg_replace('/\$(\d+)/', '${1}', $f);
  #echo $f . "\n";
  return $f;
}

simplyi commented 10 years ago

@wilsonschu You are the man! It is working!!!

simplyi commented 10 years ago

@wilsonschu the url_encoded_fmt_stream_map contains URLs to videos of different quality. But these URLs and all their parameters, such as "s", are stored in url_encoded_fmt_stream_map as one string. How do I properly split them into separate entries, so that I can then decipher the signature of each URL? Every time I request url_encoded_fmt_stream_map it comes back from YouTube shuffled, and because there are multiple "url" and multiple "s" parameters, PHP functions like parse_str and parse_url do not help. Do you have a piece of code that parses the URLs and their respective "s" values out of url_encoded_fmt_stream_map into a separate array?

wilsonschu commented 10 years ago

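// parse_str() requires the result array as of PHP 8, so the fields are
// collected in $info instead of being injected as variables.
parse_str($videoinfo, $info);
$status = isset($info['status']) ? $info['status'] : '';
if ($status == 'fail') {
  // need to deal with ciphered signature
  die('need to call play back source');
}
if ($status != 'ok') {
  // wrong video_info
  die('invalid video_info');
}

if (isset($info['url_encoded_fmt_stream_map'])) {
  // one comma-separated entry per available format
  $fmts = explode(',', $info['url_encoded_fmt_stream_map']);
} else {
  $fmts = array();
}

$videos = array();
foreach ($fmts as $fmt) {
  // each entry is itself a URL-encoded query string (url, itag, s or sig, ...)
  parse_str($fmt, $video);
  if ($video) {
    $videos[] = $video;
  }
}

print_r($videos);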
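Once each format entry has been parsed, the download URL can be assembled along the lines of the expression posted near the top of this thread: use sig as-is when it is present, otherwise decipher s. The sketch below is illustrative only. build_stream_url and decipher_placeholder are hypothetical names, the stream map is made up, and the placeholder is not a real decipher algorithm.

```php
<?php
// Hypothetical stand-in for whichever decipher routine is in use.
// NOT a real algorithm; it only reverses the string for illustration.
function decipher_placeholder(string $s): string
{
    return strrev($s);
}

// Build a download URL from one entry of url_encoded_fmt_stream_map.
function build_stream_url(string $fmt, callable $decipher): ?string
{
    parse_str($fmt, $video); // parse_str() already URL-decodes the values
    if (empty($video['url'])) {
        return null;
    }
    if (isset($video['sig'])) {
        $signature = $video['sig'];          // plain signature
    } elseif (isset($video['s'])) {
        $signature = $decipher($video['s']); // ciphered signature
    } else {
        return $video['url'];                // no separate signature field
    }
    return $video['url'] . '&signature=' . $signature;
}

// Made-up stream map with two formats, for illustration only.
$map = 'itag=22&url=http%3A%2F%2Fexample.com%2Fv%3Fid%3D1&s=ABC'
     . ',itag=18&url=http%3A%2F%2Fexample.com%2Fv%3Fid%3D2&sig=XYZ';

foreach (explode(',', $map) as $fmt) {
    echo build_stream_url($fmt, 'decipher_placeholder'), "\n";
}
// http://example.com/v?id=1&signature=CBA
// http://example.com/v?id=2&signature=XYZ
```

Because parse_str() decodes each value, no extra urldecode() call is applied to the url field here.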

jeckman commented 10 years ago

Can one of you please make this into a pull request?

simplyi commented 10 years ago

Sorry @jeckman, I am very new to the GitHub interface and do not know how to do it :(

simplyi commented 10 years ago

@wilsonschu with the functions I am getting from html5player (listed below) I am able to play most YouTube videos.

function fE(a){a=a.split("");a=a.slice(2);a=a.reverse();a=gE(a,39);a=gE(a,43);return a.join("")}
function gE(a,b){var c=a[0];a[0]=a[b%a.length];a[b]=c;return a}

However, this does not help with Vevo content. Using the above two functions I get the deciphered signature, but the video still does not download. I tried using youtube-dl as a command-line tool and it works great. I wonder if it is possible to convert the part of their code that deals with Vevo content into PHP. I have just emailed them. Another option might be to use youtube-dl to extract a list of video URLs with correctly deciphered signatures.

jeckman commented 10 years ago

Well, I've tried. Took the _decipher_signature function out of youtube-dl and tried to convert to PHP. There's a branch called "newparse" that has the new attempts, but so far still no luck.

If someone has a working version with these VEVO videos, please do share it with me and I can make the necessary changes - but even following the above it is not working.

wendzee commented 10 years ago

Here's some raw code that works on Vevo videos until a new html5player ID comes up.

<?php
$data = getpage('https://www.youtube.com/watch?v=EHkozMIXZ8w');

preg_match('/ytplayer.config = {(.*?)};/',$data,$match);

//debug($match);

$o = json_decode('{'.$match[1].'}') ;

    $links = explode(',',$o -> args -> url_encoded_fmt_stream_map);
    foreach($links as $link)
    {
        parse_str($link,$r);
        //debug($r);
        echo '<a href="'.$r['url']."&signature=".decrypt_sig($r['s'],'en_US-vflLMtkhg').'">Itag: '.$r['itag'].'</a><br />';
    }

function decrypt_sig($s,$player_id) {

    /*  Methods / Commands for decrypting sig.
       - r  = reverse the string;
       - sN = slice from character N to the end;
       - wN = swap 0th and Nth character.
    */
    $algo = array(
        'vflNzKG7n' => 's3 r s2 r s1 r w67',              // 30 Jan 2013, untested
        'vfllMCQWM' => 's2 w46 r w27 s2 w43 s2 r',        // 15 Feb 2013, untested
        'vflJv8FA8' => 's1 w51 w52 r',                    // 12 Mar 2013, untested
        'vflR_cX32' => 's2 w64 s3',                       // 11 Apr 2013, untested
        'vflveGye9' => 'w21 w3 s1 r w44 w36 r w41 s1',    // 02 May 2013, untested
        'vflj7Fxxt' => 'r s3 w3 r w17 r w41 r s2',        // 14 May 2013, untested
        'vfltM3odl' => 'w60 s1 w49 r s1 w7 r s2 r',       // 23 May 2013
        'vflDG7-a-' => 'w52 r s3 w21 r s3 r',             // 06 Jun 2013
        'vfl39KBj1' => 'w52 r s3 w21 r s3 r',             // 12 Jun 2013
        'vflmOfVEX' => 'w52 r s3 w21 r s3 r',             // 21 Jun 2013
        'vflJwJuHJ' => 'r s3 w19 r s2',                   // 25 Jun 2013
        'vfl_ymO4Z' => 'r s3 w19 r s2',                   // 26 Jun 2013
        'vfl26ng3K' => 'r s2 r',                          // 08 Jul 2013
        'vflcaqGO8' => 'w24 w53 s2 w31 w4',               // 11 Jul 2013
        'vflQw-fB4' => 's2 r s3 w9 s3 w43 s3 r w23',      // 16 Jul 2013
        'vflSAFCP9' => 'r s2 w17 w61 r s1 w7 s1',         // 18 Jul 2013
        'vflART1Nf' => 's3 r w63 s2 r s1',                // 22 Jul 2013
        'vflLC8JvQ' => 'w34 w29 w9 r w39 w24',            // 25 Jul 2013
        'vflm_D8eE' => 's2 r w39 w55 w49 s3 w56 w2',      // 30 Jul 2013
        'vflTWC9KW' => 'r s2 w65 r',                      // 31 Jul 2013
        'vflRFcHMl' => 's3 w24 r',                        // 04 Aug 2013
        'vflM2EmfJ' => 'w10 r s1 w45 s2 r s3 w50 r',      // 06 Aug 2013
        'vflz8giW0' => 's2 w18 s3',                       // 07 Aug 2013
        'vfl_wGgYV' => 'w60 s1 r s1 w9 s3 r s3 r',        // 08 Aug 2013
        'vfl1HXdPb' => 'w52 r w18 r s1 w44 w51 r s1',     // 12 Aug 2013
        'vflkn6DAl' => 'w39 s2 w57 s2 w23 w35 s2',        // 15 Aug 2013
        'vfl2LOvBh' => 'w34 w19 r s1 r s3 w24 r',         // 16 Aug 2013
        'vfl-bxy_m' => 'w48 s3 w37 s2',                   // 20 Aug 2013
        'vflZK4ZYR' => 'w19 w68 s1',                      // 21 Aug 2013
        'vflh9ybst' => 'w48 s3 w37 s2',                   // 21 Aug 2013
        'vflapUV9V' => 's2 w53 r w59 r s2 w41 s3',        // 27 Aug 2013
        'vflg0g8PQ' => 'w36 s3 r s2',                     // 28 Aug 2013
        'vflHOr_nV' => 'w58 r w50 s1 r s1 r w11 s3',      // 30 Aug 2013
        'vfluy6kdb' => 'r w12 w32 r w34 s3 w35 w42 s2',   // 05 Sep 2013
        'vflkuzxcs' => 'w22 w43 s3 r s1 w43',             // 10 Sep 2013
        'vflGNjMhJ' => 'w43 w2 w54 r w8 s1',              // 12 Sep 2013
        'vfldJ8xgI' => 'w11 r w29 s1 r s3',               // 17 Sep 2013
        'vfl79wBKW' => 's3 r s1 r s3 r s3 w59 s2',        // 19 Sep 2013
        'vflg3FZfr' => 'r s3 w66 w10 w43 s2',             // 24 Sep 2013
        'vflUKrNpT' => 'r s2 r w63 r',                    // 25 Sep 2013
        'vfldWnjUz' => 'r s1 w68',                        // 30 Sep 2013
        'vflP7iCEe' => 'w7 w37 r s1',                     // 03 Oct 2013
        'vflzVne63' => 'w59 s2 r',                        // 07 Oct 2013
        'vflO-N-9M' => 'w9 s1 w67 r s3',                  // 09 Oct 2013
        'vflZ4JlpT' => 's3 r s1 r w28 s1',                // 11 Oct 2013
        'vflDgXSDS' => 's3 r s1 r w28 s1',                // 15 Oct 2013
        'vflW444Sr' => 'r w9 r s1 w51 w27 r s1 r',        // 17 Oct 2013
        'vflK7RoTQ' => 'w44 r w36 r w45',                 // 21 Oct 2013
        'vflKOCFq2' => 's1 r w41 r w41 s1 w15',           // 23 Oct 2013
        'vflcLL31E' => 's1 r w41 r w41 s1 w15',           // 28 Oct 2013
        'vflz9bT3N' => 's1 r w41 r w41 s1 w15',           // 31 Oct 2013
        'vfliZsE79' => 'r s3 w49 s3 r w58 s2 r s2',       // 05 Nov 2013
        'vfljOFtAt' => 'r s3 r s1 r w69 r',               // 07 Nov 2013
        'vflqSl9GX' => 'w32 r s2 w65 w26 w45 w24 w40 s2', // 14 Nov 2013
        'vflFrKymJ' => 'w32 r s2 w65 w26 w45 w24 w40 s2', // 15 Nov 2013
        'vflKz4WoM' => 'w50 w17 r w7 w65',                // 19 Nov 2013
        'vflhdWW8S' => 's2 w55 w10 s3 w57 r w25 w41',     // 21 Nov 2013
        'vfl66X2C5' => 'r s2 w34 s2 w39',                 // 26 Nov 2013
        'vflCXG8Sm' => 'r s2 w34 s2 w39',                 // 02 Dec 2013
        'vfl_3Uag6' => 'w3 w7 r s2 w27 s2 w42 r',         // 04 Dec 2013
        'vflQdXVwM' => 's1 r w66 s2 r w12',               // 10 Dec 2013
        'vflCtc3aO' => 's2 r w11 r s3 w28',               // 12 Dec 2013
        'vflCt6YZX' => 's2 r w11 r s3 w28',               // 17 Dec 2013
        'vflG49soT' => 'w32 r s3 r s1 r w19 w24 s3',      // 18 Dec 2013
        'vfl4cHApe' => 'w25 s1 r s1 w27 w21 s1 w39',      // 06 Jan 2014
        'vflwMrwdI' => 'w3 r w39 r w51 s1 w36 w14',       // 06 Jan 2014
        'vfl4AMHqP' => 'r s1 w1 r w43 r s1 r',            // 09 Jan 2014
        'vfln8xPyM' => 'w36 w14 s1 r s1 w54',             // 10 Jan 2014
        'vflVSLmnY' => 's3 w56 w10 r s2 r w28 w35',       // 13 Jan 2014
        'vflkLvpg7' => 'w4 s3 w53 s2',                    // 15 Jan 2014
        'vflbxes4n' => 'w4 s3 w53 s2',                    // 15 Jan 2014
        'vflmXMtFI' => 'w57 s3 w62 w41 s3 r w60 r',       // 23 Jan 2014
        'vflYDqEW1' => 'w24 s1 r s2 w31 w4 w11 r',        // 24 Jan 2014
        'vflapGX6Q' => 's3 w2 w59 s2 w68 r s3 r s1',      // 28 Jan 2014
        'vflLCYwkM' => 's3 w2 w59 s2 w68 r s3 r s1',      // 29 Jan 2014
        'vflcY_8N0' => 's2 w36 s1 r w18 r w19 r',         // 30 Jan 2014
        'vfl9qWoOL' => 'w68 w64 w28 r',                   // 03 Feb 2014
        'vfle-mVwz' => 's3 w7 r s3 r w14 w59 s3 r',       // 04 Feb 2014
        'vfltdb6U3' => 'w61 w5 r s2 w69 s2 r',            // 05 Feb 2014
        'vflLjFx3B' => 'w40 w62 r s2 w21 s3 r w7 s3',     // 10 Feb 2014
        'vfliqjKfF' => 'w40 w62 r s2 w21 s3 r w7 s3',     // 13 Feb 2014
        'ima-vflxBu-5R' => 'w40 w62 r s2 w21 s3 r w7 s3', // 13 Feb 2014
        'ima-vflrGwWV9' => 'w36 w45 r s2 r',              // 20 Feb 2014
        'ima-vflCME3y0' => 'w8 s2 r w52',                 // 27 Feb 2014
        'ima-vfl1LZyZ5' => 'w8 s2 r w52',                 // 27 Feb 2014
        'ima-vfl4_saJa' => 'r s1 w19 w9 w57 w38 s3 r s2', // 01 Mar 2014
        'ima-en_US-vflP9269H' => 'r w63 w37 s3 r w14 r',  // 06 Mar 2014
        'ima-en_US-vflkClbFb' => 's1 w12 w24 s1 w52 w70 s2',// 07 Mar 2014
        'ima-en_US-vflYhChiG' => 'w27 r s3',              // 10 Mar 2014
        'ima-en_US-vflWnCYSF' => 'r s1 r s3 w19 r w35 w61 s2',// 13 Mar 2014
        'en_US-vflbT9-GA' => 'w51 w15 s1 w22 s1 w41 r w43 r',// 17 Mar 2014
        'en_US-vflAYBrl7' => 's2 r w39 w43',              // 18 Mar 2014
        'en_US-vflS1POwl' => 'w48 s2 r s1 w4 w35',        // 19 Mar 2014
        'en_US-vflLMtkhg' => 'w30 r w30 w39',             // 20 Mar 2014
    );

    $method = explode(" ",$algo[$player_id]);

    foreach($method as $m)
    {
        if($m == 'r')
            $s = strrev($s);
        else if( substr($m,0,1) == 's')
            $s = substr($s, (int) substr($m,1) );
        else if( substr($m,0,1) == 'w')
            $s = swap($s, (int) substr($m,1));

        echo $m." - ".$s."<br />";

    }

    return $s;
}
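function swap($a, $b) {
    // Wrap the index like the player's JavaScript does (b % a.length),
    // so an offset past the end of the string is never read.
    $b = $b % strlen($a);
    $c = $a[0];
    $a[0] = $a[$b];
    $a[$b] = $c;
    return $a;
}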
function getpage($url)
{
    // fetch data
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false); 
    curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false); 
    $data = curl_exec($curl);
    curl_close($curl);
    return $data;
}
function debug($str) 
{
    if(is_array($str) || is_object($str))
    {
        print '<pre>';
        print_r($str);
        print '</pre>';
    }
    else
        echo $str;
}

?>
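The r / sN / wN notation in the table lines up with the JavaScript functions posted earlier in this thread: the entry 's2 r w39 w43' (listed for en_US-vflAYBrl7) has the same steps as fE. The self-contained sketch below applies such a step list and checks it against the sample signature from that earlier comment. The function name apply_sig_ops is hypothetical, and wrapping the swap index with a modulo is an assumption carried over from the JavaScript.

```php
<?php
// Hypothetical interpreter for the "r / sN / wN" step notation.
function apply_sig_ops(string $s, string $ops): string
{
    foreach (explode(' ', trim($ops)) as $op) {
        if ($op === '') {
            continue; // tolerate doubled spaces
        }
        $n = (int) substr($op, 1);
        switch ($op[0]) {
            case 'r': // reverse the string
                $s = strrev($s);
                break;
            case 's': // slice from character N to the end
                $s = substr($s, $n);
                break;
            case 'w': // swap character 0 with character N (index wrapped)
                $i = $n % strlen($s);
                $c = $s[0];
                $s[0] = $s[$i];
                $s[$i] = $c;
                break;
            default:
                throw new InvalidArgumentException("Unknown step: $op");
        }
    }
    return $s;
}

$cipher = '28282AFC11ADF6D6D2C9CA541913FE0D50E359538C.5B3EF2AD526CBCEB7C973C092A6632738707A330';
echo apply_sig_ops($cipher, 's2 r w39 w43'), "\n";
// 333A7078372366A290C379C7BECBC625DA2FE3B0.C855953E05D0EF319145AC9C2D6D6FDA11CFA282
```

Unlike the listing above, this version prints no per-step debug output and throws on an unknown step instead of silently skipping it.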

jeckman commented 10 years ago

Looking at how frequently the algorithm changes, clearly what we need is a solution that reads the js, converts it to php, and can work even as the algorithm changes.

I tried to use the js2php function earlier in the thread but got errors on the eval, and also worry about allowing automated eval of code loaded from an external source.

I think what youtube-dl does is download the html5player itself and executes the js inside the python container?

wendzee commented 10 years ago

Well, we could store the list of algorithms in XML or JSON format, then have a script run occasionally to check whether a new html5player ID has appeared and update the list.

I have a mock-up here that does that. I am no expert in regex, so it's a bit slow, but it does the job.

<?php

$js = getpage("http://s.ytimg.com/yts/jsbin/html5player-en_US-vflLMtkhg.js");

//echo $js;
$pattern = '/this.b=!1};function (.*?)\(a\){a=a.split\(""\);(.*?);return a.join\(""\)}function (.*?)\(a,b\){var c=a\[0\];a\[0\]=a\[b%a.length\];a\[b\]=c;return a};/';

preg_match_all($pattern, $js, $match);

//debug($match);

debug($match[2]);

echo convert($match[2][0]);

function convert($str)
{
    $algo = '';
    $s = explode(";",$str);
    if(is_array($s))
    {
        foreach($s as $m)
        {
            // strpos() returns 0 for a match at the start, so compare against false
            if(strpos($m,'reverse') !== false)
                $algo .= 'r ';
            else if(strpos($m,'slice') !== false)
                $algo .= 's'.preg_replace('/[^\d]/','', $m).' ';
            else
                $algo .= 'w'.preg_replace('/[^\d]/','', $m).' ';
        }
    }

    return trim($algo);
}
function debug($str) 
{
    if(is_array($str) || is_object($str))
    {
        print '<pre>';
        print_r($str);
        print '</pre>';
    }
    else
        echo $str;
}
function getpage($url)
{
    // fetch data
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    $data = curl_exec($curl);
    curl_close($curl);
    //return preg_replace('~[\r\n]+~', ' ', $data);
    return $data;
}

?>
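The idea of keeping the player-ID table in JSON and adding entries as new player IDs appear could look roughly like the sketch below. The helper names load_algos and remember_algo are hypothetical. The JSON is kept in a string here so the example runs on its own; a real script would presumably read and write a file with file_get_contents() and file_put_contents().

```php
<?php
// Hypothetical helper: decode the stored table, falling back to an empty
// table if the JSON is missing or malformed.
function load_algos(string $json): array
{
    $algos = json_decode($json, true);
    return is_array($algos) ? $algos : array();
}

// Hypothetical helper: record the steps for a player ID seen for the first
// time; known IDs are left untouched.
function remember_algo(array $algos, string $playerId, string $ops): array
{
    if (!isset($algos[$playerId])) {
        $algos[$playerId] = $ops;
    }
    return $algos;
}

// Both entries are taken from the table posted earlier in this thread.
$json  = '{"en_US-vflAYBrl7":"s2 r w39 w43"}';
$algos = load_algos($json);
$algos = remember_algo($algos, 'en_US-vflLMtkhg', 'w30 r w30 w39');

echo json_encode($algos, JSON_PRETTY_PRINT), "\n";
```

Storing only the step strings, rather than translated PHP code, means nothing read back from the file ever has to be passed to eval.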

simplyi commented 10 years ago

I needed this for an iOS application, and iOS has a framework that executes JavaScript very well. So as long as I get the recent JavaScript functions from html5player, it works well. After testing for a couple of days, all videos play well, even the VEVO content.

Regarding the html5player JavaScript functions: they change, and they change quite often. I already have two different versions. So if I hardcode the pattern @wendzee used, it will not find the function when html5player changes :( Even so, @wendzee, thank you very much for sharing the code! I wonder whether the changes in html5player are always unique or whether they switch between 3-4 JavaScript functions from time to time.
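One way to depend less on the exact text surrounding the function is to translate only the function body, statement by statement, into the r / sN / wN notation and to reject anything unrecognised. This is a sketch under assumptions: the helper name js_body_to_ops is hypothetical, and it assumes that every call of the form NAME(a,N) is the swap helper, which holds for the functions posted in this thread but is not guaranteed for future player versions.

```php
<?php
// Hypothetical translator: decipher function body => "r / sN / wN" steps.
// Returns null if any statement has an unexpected shape, so unknown
// JavaScript is rejected rather than guessed at.
function js_body_to_ops(string $body): ?string
{
    $ops = array();
    foreach (explode(';', $body) as $stmt) {
        $stmt = trim($stmt);
        if ($stmt === ''
            || preg_match('/^a=a\.split\(""\)$/', $stmt)
            || preg_match('/^return a\.join\(""\)$/', $stmt)) {
            continue; // framing statements carry no step
        }
        if (preg_match('/^a=a\.reverse\(\)$/', $stmt)) {
            $ops[] = 'r';
        } elseif (preg_match('/^a=a\.slice\((\d+)\)$/', $stmt, $m)) {
            $ops[] = 's' . $m[1];
        } elseif (preg_match('/^a=[\w$]+\(a,(\d+)\)$/', $stmt, $m)) {
            $ops[] = 'w' . $m[1]; // assumed to be the swap helper
        } else {
            return null;
        }
    }
    return implode(' ', $ops);
}

$body = 'a=a.split("");a=a.slice(2);a=a.reverse();a=gE(a,39);a=gE(a,43);return a.join("")';
echo js_body_to_ops($body), "\n"; // s2 r w39 w43
```

The helper name (gE, XD, ...) is not hardcoded, so a rename alone does not break the translation; a structural change to the function still does.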

bitnol commented 10 years ago

@wendzee I have seen this code before but it was in perl language and it was self-improving code. I forgot the url for that page. Can you please share the url of the resource code.

wendzee commented 10 years ago

it's from youtubedown

here it is

http://www.jwz.org/hacks/youtubedown

wendzee commented 10 years ago

I made my own and it's auto-updating; for how long, I don't know. It only serves MP4 for now, as the host doesn't support ffmpeg.

http://2825.a.hostable.me/yt2/

asifnawaz commented 10 years ago

I'm not able to download many videos; I get a 403 error.

asifnawaz commented 10 years ago

@wendzee can you please send me your script? It looks like the signature problem is solved here.

jeckman commented 10 years ago

Better, share it with me so I can integrate it into the project!

wendzee commented 10 years ago

Here you go guys. It's not a perfect script, but I've been using it personally and it works just fine. Just create a downloads folder, and inside it mp3, mp4, and temp folders.

http://2825.a.hostable.me/yt2/yt.zip

simplyi commented 10 years ago

@wendzee thank you very much for sharing your code. May I ask you a question. In your code you construct the url to html5player JS by: http://s.ytimg.com/yts/jsbin/html5player-".$id.".js"

but I cannot figure out what the value of $id is. It does not look like a video_id. I looked at the source of the video page on YouTube and cannot find any other place where the value of $id is used. Where do you read it from?

The way I have implemented it is by using a regex to grab the value of the js key from the JSON. Based on your experience, do you think it is reliable?

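$html5_player_url = "https://www.youtube.com/watch?v=" . $videoId;
$html5PlayerJs = curlGet($html5_player_url); // curlGet() is the poster's own helper (not shown)
// [^\"]* instead of .* keeps the match from running past the closing quote
$pattern1 = "/\"js\"\:\s*\"([^\"]*\.js)\"/i";
preg_match($pattern1, $html5PlayerJs, $jsMatch); // $jsMatch[1] holds the player URL, if found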
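Judging by the URLs and table keys quoted elsewhere in this thread (for example html5player-ima-vflrGwWV9.js and the key ima-vflrGwWV9), the $id appears to be the part of the player script's file name between "html5player-" and ".js". A sketch of extracting it from the value of the js key follows; the helper name player_id_from_js_url is hypothetical.

```php
<?php
// Hypothetical helper: pull the player ID out of the html5player URL
// found under the "js" key of ytplayer.config.
function player_id_from_js_url(string $jsUrl): ?string
{
    // The value may still carry JSON-escaped slashes (\/); normalise them.
    $jsUrl = str_replace('\\/', '/', $jsUrl);
    if (preg_match('~/html5player-([\w-]+)\.js$~', $jsUrl, $m)) {
        return $m[1];
    }
    return null;
}

echo player_id_from_js_url('//s.ytimg.com/yts/jsbin/html5player-ima-en_US-vflWnCYSF.js'), "\n";
// ima-en_US-vflWnCYSF
```

The result can then be used as the lookup key for a player-ID table like the one posted above.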

samirosoft commented 10 years ago

hi jeckman, I have found a script in two languages which seems to be working: I think the first one is with perl and the second with ruby. Can anyone convert the best one to php. Here are the links: http://www.jwz.org/hacks/youtubedown https://gist.github.com/kl/9070523 I found this in https://github.com/rb2k/viddl-rb/issues/83

bitnol commented 9 years ago

To solve cipher signature problem, I have built an API to serve decryption algo. Please visit here for more: http://gitnol.com/cipherapi/

For cipher dictionary check this example: http://www.gitnol.com/cipherapi/getAlgo.php?playerID=en_US-vflz7mN60

or this: http://www.gitnol.com/cipherapi/getAlgo.php?playerID=en_US-vflz7mN60&sigformat=42.40

Cheers :)

bitnol commented 9 years ago

Also check this cool example: http://gitnol.com/youplay/

bitnol commented 9 years ago

API moved to http://api.gitnol.com

jeckman commented 9 years ago

@bitnol any chance you'll open source that implementation? The github repo right now is just a readme. https://github.com/bitnol/CipherAPI

bitnol commented 9 years ago

@jeckman I have setup a dedicated server for delivering the Algo. At present source is not open. But I am open to suggestion and feedback.

thcolin commented 9 years ago

@bitnol Thanks for your API! But some signatures don't seem to work, such as fr_FR-vfludwYc3

paulwscom commented 9 years ago

@bitnol are you selling your script? PM me and let me know the price.

VileTung commented 9 years ago

Thanks to @wilsonschu I've managed to get it completely working.

What I did was actually quite simple:

  • Crawl the YouTube playback page source (for example, http://www.youtube.com/watch?v=6Cp6mKbRTQY) to find "ytplayer.config", which is a JSON object.
  • In the JSON object, find ['args']['url_encoded_fmt_stream_map'], which is similar to the one in video_info (I used this one for the video URL too, not get_video_info).
  • If 'url' in 'url_encoded_fmt_stream_map' has "s=...", then it has a ciphered signature.
  • Also find ['assets']['js'] in the JSON. It's the html5player: \/\/s.ytimg.com\/yts\/jsbin\/html5player-ima-en_US-vflWnCYSF.js
  • Get the JavaScript source for the html5player: http://s.ytimg.com/yts/jsbin/html5player-ima-en_US-vflWnCYSF.js
  • Find the decipher functions in the JavaScript by matching a regex (I modified the one given by @wilsonschu and use: ~\w+\.sig\|\|(\w*)\(e~i). This returns the function call (e.g. e.sig || yq(e)), whose name is yq. Searching for that name gives me the function we are looking for.
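The last step can be sketched as follows. The call-site pattern and the definition pattern are both assumptions about how the minified player code looks, based only on the fragments quoted in this thread; the helper name find_decipher_function is hypothetical and the JavaScript sample is a made-up, heavily shortened stand-in for a real player file.

```php
<?php
// Hypothetical helper: locate the decipher function in player JavaScript.
// Returns its name and body, or null if either pattern does not match.
function find_decipher_function(string $js): ?array
{
    // Assumed call-site shape, e.g. `e.sig||yq(e)`; captures `yq`.
    if (!preg_match('~\.sig\|\|([\w$]+)\(~', $js, $m)) {
        return null;
    }
    $name = $m[1];

    // Assumed definition shape: function NAME(a){...} with no nested braces.
    $pattern = '~function ' . preg_quote($name, '~') . '\(a\)\{(.*?)\}~s';
    if (!preg_match($pattern, $js, $m)) {
        return null;
    }
    return array('name' => $name, 'body' => $m[1]);
}

// Made-up stand-in for a player file, for illustration only.
$js = 'x=e.sig||yq(e);function yq(a){a=a.split("");a=a.reverse();return a.join("")}';
print_r(find_decipher_function($js));
```

Returning null instead of a partial result makes it easy to detect when a player update has changed the code shape and the patterns need revisiting.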

Oh, by the way: @thcolin can you provide a working link, so I can make sure my code is working and perhaps share it?

bitnol commented 9 years ago

@VileTung The approach is correct, but you cannot always trust a regex to find the function. Even if you find the function, there are sometimes dependencies which are hard to track with a regex.

VileTung commented 9 years ago

@bitnol Hmm, I do an extra search to make sure I don't miss any required functions. If YouTube changes it a lot, my code definitely won't work. However, if they make a small change like they did a few hours ago, my code still works as it should (I'm able to download a video from a given URL). We'll see how long it keeps working :)

jeckman commented 9 years ago

Personally I'm also a bit wary of pulling in arbitrary javascript from the web, running it through a regex, and executing the results in PHP on my server.

Once you find the function using the regex, what do you do with the javascript function you've found, other than "translate" it into PHP and execute it?
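One possible alternative to translating and executing the JavaScript, sketched here as an assumption rather than as what any of the tools mentioned in this thread actually do: reduce the function to the r / sN / wN step notation used earlier in the thread and accept the result only if it matches a strict whitelist. Nothing fetched from the web is then ever evaluated as PHP. The helper name is_safe_ops and the three-digit limit are hypothetical choices.

```php
<?php
// Hypothetical validator: accepts only space-separated steps of the form
// "r", "sN" or "wN" (N = 1 to 3 digits). Anything else is rejected.
function is_safe_ops(string $ops): bool
{
    $step = '(?:r|s\d{1,3}|w\d{1,3})';
    // The D modifier stops "$" from matching before a trailing newline.
    return preg_match('/^' . $step . '(?: ' . $step . ')*$/D', $ops) === 1;
}

var_dump(is_safe_ops('s2 r w39 w43'));          // bool(true)
var_dump(is_safe_ops('r; system("rm -rf /")')); // bool(false)
```

A step list that passes this check can only reverse, slice, or swap characters of the signature string, whatever the downloaded script contained.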