Unicode characters in URL

python-20 / video-downloader

A python application to download videos

GNU General Public License v3.0

Unicode characters in URL #64

Open AlexPHorta opened 4 years ago

AlexPHorta commented 4 years ago

https://www.youtubé.com/watch?v=9bZkp7q19f0 (with an accented E) downloads with no problem. Is this expected behaviour?

chonix commented 4 years ago

Should we add a catch for these too, or normalize all URLs? I think we should reject those invalid URLs and not bother with normalization at all.

cherylli commented 4 years ago

Did it work? I thought pytube would have caught that error, since the URL is invalid.

Edit: oh, it actually works, and behaves the same as youtube.com.

AlexPHorta commented 4 years ago

[Screenshot: Mozilla Firefox showing "Warning: Potential Security Risk Ahead"]

I believe we should treat this as an error.

RyanSamman commented 4 years ago

Pretty sure it should raise an error?

chonix commented 4 years ago

"I believe we should treat this as an error."

www.xn--youtbe-6ya.com www.xn--youtub-gva.com

Still, that's not a "youtube.com" domain. Even if the redirect works, we should add a filter to the URL input, EVEN if pytube resolves it.
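For reference, the punycode form of the accented host can be reproduced with Python's built-in `idna` codec. This is only a small sketch to show why the two hosts are different domains; the helper name `to_ascii_host` is made up for illustration.

```python
def to_ascii_host(host: str) -> str:
    """Return the ASCII (punycode) form of a host name.

    Uses Python's built-in "idna" codec; the helper name is hypothetical.
    """
    return host.encode("idna").decode("ascii")


accented = to_ascii_host("www.youtubé.com")
plain = to_ascii_host("www.youtube.com")

print(accented)           # www.xn--youtub-gva.com
print(accented == plain)  # False: a different domain from youtube.com
```

Comparing the ASCII forms makes it explicit that the accented host is not youtube.com, whatever a redirect does afterwards.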

I believe the filter pytube applies (I haven't checked the code/regex) is that anything matching watch\?v=(.+) is accepted as a valid YouTube URL to parse.

AlexPHorta commented 4 years ago

Some more bizarre URLs that don't get pytube's attention. I'm trying to write a test for the exceptions in YouTubeVideo init and, to be honest, having a hard time coming up with one that raises the RegexMatchException.

https://www.yoootube.com/watch?v=9bZkp7q19f0
https://www.yoootube..com/watch?v=9bZkp7q19f0
https://www.youtube.com/match?v=9bZkp7q19f0
htyps://www.youtube.com/match?v=9bZkp7q19f0
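One possible explanation, sketched below, is that an extractor which only searches for a `v=` parameter followed by an 11-character ID never looks at the scheme, host, or path. The pattern here is an assumption about the general shape of such an extractor, not a confirmed copy of pytube's code, and `extract_video_id` is a hypothetical name.

```python
import re

# Assumed pattern: only the "v=<11 characters>" part is inspected.
VIDEO_ID_PATTERN = re.compile(r"v=([0-9A-Za-z_-]{11})")


def extract_video_id(url: str):
    """Return the 11-character ID after "v=", or None if there is none."""
    match = VIDEO_ID_PATTERN.search(url)
    return match.group(1) if match else None


odd_urls = [
    "https://www.yoootube.com/watch?v=9bZkp7q19f0",
    "https://www.yoootube..com/watch?v=9bZkp7q19f0",
    "https://www.youtube.com/match?v=9bZkp7q19f0",
    "htyps://www.youtube.com/match?v=9bZkp7q19f0",
]

# Every URL yields the same ID, so none of them triggers a match failure.
ids = [extract_video_id(url) for url in odd_urls]
```

Under that assumption, a regex failure would only occur for input with no `v=` ID at all, which would explain why it is hard to provoke with a malformed host or scheme.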

cherylli commented 4 years ago

If you want to reject all of those, maybe just accept only proper URLs. But why reject them when they work? Is it a security risk to the system?

chonix commented 4 years ago

Can we put a validation step BEFORE pytube? Also, perhaps you can mock the call that does the YouTube URL parsing.

AlexPHorta commented 4 years ago

Yes, the mock. Will use it.
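A rough sketch of what the mock could look like, using `unittest.mock.patch.object` to force the parsing call to raise. Everything here is a stand-in: `UrlParser`, this `YouTubeVideo`, and this `RegexMatchException` are hypothetical and only mimic the shape of the real code, which I have not reproduced.

```python
from unittest import mock


class RegexMatchException(Exception):
    """Hypothetical stand-in for the parsing error named in the thread."""


class UrlParser:
    """Hypothetical stand-in for the library call that parses the URL."""

    @staticmethod
    def video_id(url: str) -> str:
        return url.rsplit("v=", 1)[-1]


class YouTubeVideo:
    """Hypothetical stand-in for the application's wrapper class."""

    def __init__(self, url: str):
        try:
            self.video_id = UrlParser.video_id(url)
            self.error = None
        except RegexMatchException as exc:
            self.video_id = None
            self.error = f"Invalid URL: {exc}"


def init_error_when_parser_rejects(url: str):
    """Build a YouTubeVideo while the parser is mocked to always fail."""
    with mock.patch.object(
        UrlParser, "video_id", side_effect=RegexMatchException("no match")
    ):
        return YouTubeVideo(url).error
```

This way the exception branch in `__init__` can be exercised without having to find a URL that the real parser actually rejects.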

I just think it (the URL situation) leaves room for unexpected behaviour.

AlexPHorta commented 4 years ago

I'm working on a validator, something that runs before pytube.
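A minimal sketch of such a pre-pytube check, using only `urllib.parse` from the standard library. The allowlist of hosts, the `/watch` path requirement, and the name `is_accepted_url` are all assumptions made for illustration, not the project's actual rules.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed allowlist; the real project may want more hosts (or fewer).
ALLOWED_HOSTS = {"youtube.com", "www.youtube.com"}


def is_accepted_url(url: str) -> bool:
    """Return True only for http(s) URLs on an allowed host with /watch?v=..."""
    try:
        parts = urlsplit(url)
        host = parts.hostname  # lower-cased host without the port, or None
    except ValueError:
        return False
    if parts.scheme not in ("http", "https"):
        return False
    if host not in ALLOWED_HOSTS:
        # Rejects accented hosts, punycode look-alikes, and typos alike.
        return False
    if parts.path != "/watch":
        return False
    # Require a non-empty "v" query parameter.
    return bool(parse_qs(parts.query).get("v"))
```

An exact-match allowlist sidesteps the normalization question entirely: an accented or punycode host is simply not in the set, so it is rejected before pytube ever sees it.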