flagbug / YoutubeExtractor

A .NET library that allows you to download videos from YouTube and/or extract their audio track (currently only for Flash videos).

Stopped working #346

Open kelvinRosa opened 5 years ago

kelvinRosa commented 5 years ago

Hi, it stopped working again today. Is anyone else having this issue? I'm getting a 403.

kelvinRosa commented 5 years ago

Some videos won't play, for example: https://www.youtube.com/watch?v=uy3KFOykGQc

If someone knows about regex, I saw this operation in youtube-dl:

    (r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
     r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
     r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*c\s*&&\s*d\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
     r'\bc\s*&&\s*d\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(')

Maybe it can help. I don't know if it is too different from C# regex.

dgesc217 commented 5 years ago

I fixed it.

When I ran the funcPattern match, there was more than one result,

so I appended all the match values to funcBody.

In the DecipherWithVersion method in Decipher.cs, line 18:

    string funcPattern = @"(?!h\.)" + @funcName + @"=function\(\w+\)\{.*?\};";
    var funcBody = Regex.Match(js, funcPattern, RegexOptions.Singleline).Value;
    var lines = funcBody.Split(';');

    // Collect every match instead of only the first one.
    funcBody = "";
    foreach (Match m in Regex.Matches(js, funcPattern, RegexOptions.Singleline))
    {
        funcBody += m.Value + ";";
    }
    lines = funcBody.Split(';');
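For anyone who wants to try the idea outside the project, here is a minimal Python sketch of the same approach: gather every definition that matches the function pattern rather than only the first. The sample JavaScript and the name `zx` are made up for illustration, and `re.escape` is an addition that the C# above does not have (function names can contain `$`).

```python
import re

def collect_function_bodies(js: str, func_name: str) -> str:
    """Join every definition of func_name found in js, each followed by ';'."""
    # (?!h\.) rejects a match that would begin with "h."; the rest matches
    # name=function(arg){...}; non-greedily, with '.' allowed to span newlines.
    pattern = r"(?!h\.)" + re.escape(func_name) + r"=function\(\w+\)\{.*?\};"
    return "".join(m.group(0) + ";" for m in re.finditer(pattern, js, re.DOTALL))

# Hypothetical player code with two definitions of the same name.
sample_js = (
    'var a=1;zx=function(a){a=a.split("");return a.join("")};'
    'q=2;zx=function(b){return b};'
)
body = collect_function_bodies(sample_js, "zx")
lines = body.split(";")
```

Whether concatenating the bodies is the right fix for every player version is not something this sketch can confirm; it only shows the "all matches" mechanics.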

kelvinRosa commented 5 years ago

Did you change any other lines? Can you post your updated code? I tried what you described with no success; the video still won't play.

What about your GetHtml5PlayerVersion function? I can't play the video in question: https://www.youtube.com/watch?v=uy3KFOykGQc

kelvinRosa commented 5 years ago

@dgesc217 It's happening with some music videos.

kelvinRosa commented 5 years ago

Maybe youtube-dl changed something related to that too: https://github.com/ytdl-org/youtube-dl/commit/63529e935cf5f87e6080607ef9d9196fe435e092#diff-bd8242a0122c5207531954b67e6e51f0

MCrissDev commented 5 years ago

I think that commit is too old to apply to the latest YouTube change.

kelvinRosa commented 5 years ago

I have a custom version here; it was working until yesterday.

kelvinRosa commented 5 years ago

My GetHtml5PlayerVersion looks like this:

    private static string GetHtml5PlayerVersion(JObject json)
    {
        var regex = new Regex(@"player_ias-(.+?).js");

        string js = json["assets"]["js"].ToString();

        Match match = regex.Match(js);
        if (match.Success) return match.Result("$1");

        regex = new Regex(@"player-(.+?).js");

        return regex.Match(js).Result("$1");
    }

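The fallback logic above (try the `player_ias-` form first, then the plain `player-` form) can be sketched in Python for quick experiments. The sample URLs are hypothetical, and two details differ from the C# on purpose: the dot before `js` is escaped, and a missing match returns `None` instead of raising an error.

```python
import re
from typing import Optional

# Patterns are tried in this order; the first one that matches wins.
PLAYER_PATTERNS = (r"player_ias-(.+?)\.js", r"player-(.+?)\.js")

def get_html5_player_version(js_url: str) -> Optional[str]:
    """Return the part of the player URL between the prefix and '.js'."""
    for pattern in PLAYER_PATTERNS:
        match = re.search(pattern, js_url)
        if match:
            return match.group(1)
    return None  # neither naming scheme was found

# Hypothetical asset paths, shaped like the ones the thread describes.
version = get_html5_player_version("/yts/jsbin/player_ias-vflABC123/en_US/base.js")
legacy = get_html5_player_version("/yts/jsbin/player-vflXYZ/en_US/base.js")
```

Returning `None` makes the "unknown player URL" case explicit, which is easier to diagnose than a failure further down in the decipher step.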
kelvinRosa commented 5 years ago

It still works for normal YouTube videos, but some music videos aren't working. I think it's something in the decipher step.

MCrissDev commented 5 years ago

It seems that almost all music-related videos don't work on my end either.

superfly71 commented 5 years ago

It works fine with videos that don't require Decipher. The problem is the Decipher. In the DecipherWithVersion method,

var funcName = Regex.Match(js, functNamePattern).Groups[1].Value; always returns an empty string.

functNamePattern is obviously the problem.

kelvinRosa commented 5 years ago

This is the funcname lookup from youtube-dl:

    funcname = self._search_regex(
        (r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
         r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
         r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*c\s*&&\s*d\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?(?P<sig>[a-zA-Z0-9$]+)\(',
         r'\bc\s*&&\s*d\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*(?P<sig>[a-zA-Z0-9$]+)\(',
         r'\bc\s*&&\s*d\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('),
        jscode, 'Initial JS player signature function name', group='sig')

Looks like it's working on their side.
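As a rough illustration of how a lookup like this behaves, the sketch below tries a tuple of patterns in order and returns the named group from the first one that matches. `search_first` is a simplified stand-in written for this example, not youtube-dl's `_search_regex`, and the player snippets are invented.

```python
import re
from typing import Iterable, Optional

def search_first(patterns: Iterable[str], text: str, group: str) -> Optional[str]:
    """Return the named group from the first pattern that matches, else None."""
    for pattern in patterns:
        match = re.search(pattern, text)
        if match:
            return match.group(group)
    return None

# Two patterns in the style of the ones quoted in this thread.
SIG_PATTERNS = (
    r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
    r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
)

# Hypothetical fragments of player JavaScript.
first = search_first(SIG_PATTERNS, 'd.set("signature",Qx(c))', "sig")
second = search_first(SIG_PATTERNS, "a.sig||Zk(b)", "sig")
missing = search_first(SIG_PATTERNS, "var nothing=1;", "sig")
```

Keeping the older patterns in the tuple means older players keep working while a new pattern is appended for the latest one. A `None` result here corresponds to the empty funcName reported above.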

kelvinRosa commented 5 years ago

The functNamePattern that I have on my side is:

    string functNamePattern = @"(\w+)\s*=\s*function\(\s*(\w+)\s*\)\s*{\s*\2\s*=\s*\2\.split\(\""\""\)\s*;(.+)return\s*\2\.join\(\""\""\)\s*}\s*;";

kelvinRosa commented 5 years ago

@superfly71 I noticed that my funcName is returning "u"

kelvinRosa commented 5 years ago

Does anyone know how to convert this regex from Python to C#? funcname = self._search_regex( (r'(["'])signature\1\s,\s(?P[a-zA-Z0-9$]+)(', r'.sig||(?P[a-zA-Z0-9$]+)(', r'yt.akamaized.net/)\s||\s.?\sc\s&&\sd.set([^,]+\s,\s(?:encodeURIComponent\s()?(?P[a-zA-Z0-9$]+)(', r'\bc\s&&\sd.set([^,]+\s,\s(?:encodeURIComponent\s()?\s(?P[a-zA-Z0-9$]+)(', r'\bc\s&&\sd.set([^,]+\s,\s([^)])\s(\s(?P[a-zA-Z0-9$]+)('), jscode, 'Initial JS player signature function name', group='sig')

kelvinRosa commented 5 years ago

LoL, it's back to working here for me.

jzabroski commented 5 years ago

You can use RegexHero to prototype a C# regex http://regexhero.net/tester/
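When porting, the main syntax difference to watch for is named groups: Python writes `(?P<name>...)` and `(?P=name)`, while .NET writes `(?<name>...)` and `\k<name>`. The helper below is a naive sketch of that rewrite; `python_regex_to_dotnet` is a made-up name, and it does not try to handle an escaped `\(` that happens to be followed by `?P<`.

```python
import re

def python_regex_to_dotnet(pattern: str) -> str:
    """Rewrite Python named-group syntax into the .NET equivalent."""
    # (?P<name>...)  ->  (?<name>...)
    converted = re.sub(r"\(\?P<(\w+)>", r"(?<\1>", pattern)
    # (?P=name)  ->  \k<name>
    converted = re.sub(r"\(\?P=(\w+)\)", r"\\k<\1>", converted)
    return converted

# Example pattern in the style of the youtube-dl ones quoted in this thread.
dotnet_pattern = python_regex_to_dotnet(r"\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(")
```

In C#, the result would typically go into a verbatim string (`@"..."`), where any double quote inside the pattern has to be doubled.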

kelvinRosa commented 5 years ago

@jzabroski Thank you, I'll take a look and try it.

yoirgl commented 5 years ago

It's also not working for me. I'm getting "Could not find the entry function for signature deciphering." Music Box is counting on you guys. Help me bring free happiness to people :)

Yoirgl

superfly71 commented 5 years ago

@kelvinRosa Your Python regex is malformed. I didn't download the source code for youtube-dl, so I don't know what's going on in there.

kelvinRosa commented 5 years ago

@superfly71 I saw that regex here: https://github.com/ytdl-org/youtube-dl/commit/fa4ac365f69cbd51e4c9801984ebea49a12825b7#diff-bd8242a0122c5207531954b67e6e51f0

superfly71 commented 5 years ago

@kelvinRosa I have tested all the regexes (individually) in the link you provided, but none of them match the current JS code.

The JS code (js) I'm using is:

    string jsUrl = string.Format("http://s.ytimg.com/yts/jsbin/player_ias-{0}.js", cipherVersion);
    string js = HttpHelper.DownloadString(jsUrl);

I think this link might be helpful. It doesn't give you the code you need but it does explain how youtube-dl deciphers.

https://www.quora.com/How-can-I-make-a-YouTube-video-downloader-web-application-from-scratch

The link says youtube-dl uses a JavaScript interpreter.

I'm currently doing a POC using Jint (https://github.com/sebastienros/jint), but it doesn't have a good JavaScript debugger.

superfly71 commented 5 years ago

@kelvinRosa I cloned the Python source code for youtube-dl, downloaded PyCharm, and started debugging with breakpoints.

The function with the regexes in your link does not even execute when youtube-dl is run.

Here is the Python function that is not even executed:

    def _parse_sig_js(self, jscode):
        funcname = self._search_regex(
            (r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
             r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
             r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*c\s*&&\s*d\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?(?P<sig>[a-zA-Z0-9$]+)\(',
             r'\bc\s*&&\s*d\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*(?P<sig>[a-zA-Z0-9$]+)\(',
             r'\bc\s*&&\s*d\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('),
            jscode, 'Initial JS player signature function name', group='sig')

        jsi = JSInterpreter(jscode)
        initial_function = jsi.extract_function(funcname)
        return lambda s: initial_function([s])

Try stepping through youtube-dl to see why it is working.

funcname is not even relevant now.