Open AlexPHorta opened 4 years ago
Should we add a catch for these too, or normalize all of the URLs? I think we should reject those invalid URLs and not bother with normalization at all.
Did it work? I thought pytube would have caught that error, since the URL is invalid. Edit: oh, it actually works, and behaves the same as youtube.com.
I believe we should treat this as an error.
Pretty sure it should raise an error?
www.xn--youtbe-6ya.com www.xn--youtub-gva.com
Still, that's not a "youtube.com" domain. Even if the redirect works, we should add a filter to the URL input, EVEN if pytube resolves it.
I believe the filter inside pytube (I haven't checked the code/regex) is that anything matching watch\?v=(*.+) is accepted as a valid YouTube URL to parse.
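To illustrate the concern, here is a minimal sketch of how a pattern that only looks at the watch?v= part would behave. The pattern and the function name `loose_video_id` are hypothetical and not taken from pytube's source; I also wrote the group as `(.+)`, which is an assumption about what the guess above was meant to be.

```python
import re

# Hypothetical pattern, loosely based on the guess above.
# It is NOT copied from pytube; it only shows that the host is never inspected.
LOOSE_PATTERN = re.compile(r"watch\?v=(.+)")


def loose_video_id(url):
    """Return whatever follows 'watch?v=', or None if that text is absent."""
    match = LOOSE_PATTERN.search(url)
    return match.group(1) if match else None


# The domain is misspelled, but the pattern still finds a video id.
print(loose_video_id("https://www.yoootube.com/watch?v=9bZkp7q19f0"))
```

A pattern like this never checks the scheme or the host, so any domain followed by watch?v=... would pass.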
Some more bizarre URLs that don't get pytube's attention. I'm trying to write a test for the exceptions in YouTubeVideo init and, to be honest, having a hard time coming up with one that raises the RegexMatchException.
https://www.yoootube.com/watch?v=9bZkp7q19f0
https://www.yoootube..com/watch?v=9bZkp7q19f0
https://www.youtube.com/match?v=9bZkp7q19f0
htyps://www.youtube.com/match?v=9bZkp7q19f0
If you want to reject all of those, maybe just accept proper URLs. But why reject them when they work? Is it a security risk to the system?
Can we put validation BEFORE pytube? Also, perhaps you can mock the call to the YouTube URL parsing.
Yes, the mock. Will use it.
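A rough sketch of what the mocked test could look like, using only `unittest.mock`. Both classes below are hypothetical stand-ins so the snippet runs on its own; the real `YouTubeVideo` and exception live in the project, and the injected `backend` argument is an assumption made only for this illustration.

```python
from unittest import mock


class RegexMatchException(Exception):
    """Hypothetical stand-in for the exception named in this thread."""


class YouTubeVideo:
    """Hypothetical stand-in; the real class may be structured differently."""

    def __init__(self, url, backend):
        # 'backend' stands in for whatever call does the URL parsing.
        try:
            self.stream = backend(url)
        except RegexMatchException as exc:
            raise ValueError(f"invalid YouTube URL: {url}") from exc


def init_raises_on_bad_url():
    # The mock raises on call, so no real URL ever has to trigger the error.
    backend = mock.Mock(side_effect=RegexMatchException("no match"))
    try:
        YouTubeVideo("htyps://www.youtube.com/match?v=9bZkp7q19f0", backend)
    except ValueError:
        backend.assert_called_once()
        return True
    return False


print(init_raises_on_bad_url())
```

With `side_effect` set to an exception, the error path can be tested without hunting for a URL that makes the parser fail.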
I just think it (the URL situation) leaves room for unexpected behaviour.
I'm working on a validator, something pre pytube.
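For discussion, a minimal sketch of what a pre-pytube check might look like, built on the standard library only. The name `is_youtube_watch_url`, the `ALLOWED_HOSTS` set and the 11-character id rule are all assumptions for illustration, not a description of the validator being written.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumed allow-list; adjust to whichever hosts the project wants to accept.
ALLOWED_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}
# Assumption: video ids are 11 characters from this alphabet.
VIDEO_ID = re.compile(r"[0-9A-Za-z_-]{11}")


def is_youtube_watch_url(url):
    """Return True only for http(s) watch URLs on an allowed host."""
    try:
        parts = urlparse(url)
    except ValueError:
        return False
    if parts.scheme not in ("http", "https"):
        return False
    if (parts.hostname or "").lower() not in ALLOWED_HOSTS:
        return False
    if parts.path != "/watch":
        return False
    ids = parse_qs(parts.query).get("v", [])
    return len(ids) == 1 and VIDEO_ID.fullmatch(ids[0]) is not None


for candidate in (
    "https://www.youtube.com/watch?v=9bZkp7q19f0",
    "https://www.yoootube.com/watch?v=9bZkp7q19f0",
    "https://www.youtube.com/match?v=9bZkp7q19f0",
    "htyps://www.youtube.com/match?v=9bZkp7q19f0",
):
    print(candidate, is_youtube_watch_url(candidate))
```

Comparing the parsed host against an allow-list, instead of pattern-matching the whole string, is what rejects the lookalike domains above regardless of where they redirect.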
https://www.youtubé.com/watch?v=9bZkp7q19f0 (with an accented E) works with no problem. Is this expected behaviour?
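For what it's worth, the accented host is a different domain once it is converted to its ASCII (punycode) form, which seems to be where the xn-- names earlier in the thread come from. A small sketch using Python's built-in `idna` codec; the helper name `to_ascii_host` is made up for this example.

```python
def to_ascii_host(host):
    """Convert a hostname to its ASCII (punycode) form with the stdlib codec."""
    return host.encode("idna").decode("ascii")


# The accented name maps to an xn-- domain, not to youtube.com.
print(to_ascii_host("www.youtubé.com"))
# Going the other way shows what an xn-- name stands for.
print(b"www.xn--youtbe-6ya.com".decode("idna"))
```

So even when the page loads, the request goes to a separately registered domain, which is an argument for comparing the host against an explicit list.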