stevenvachon / broken-link-checker

Find broken links, missing images, etc within your HTML.
MIT License

Issue with youtube urls #154

Closed casaran closed 5 years ago

casaran commented 5 years ago

Bug description

I have been testing HtmlUrlChecker on this page (though the bug is not limited to it): https://unity.com/solutions/film-animation-cinematics. For some YouTube videos, the result looks like this:

{ url: { original: 'urew479-Wlw', resolved: 'https://unity.com/solutions/urew479-Wlw', ....

The checker then reports a 404 for a YouTube video that works normally.

To Reproduce

const blc = require('broken-link-checker');

const options = {
  excludedKeywords: [
    'https://www.linkedin.com',
    'https://linkedin.com',
  ],
  requestMethod: 'GET',
};

var htmlUrlChecker = new blc.HtmlUrlChecker(options, {
  link: function (result) {
    if (result.broken) {
      console.log(result);
      throw '';
    }
  },
});

var baseUrl = 'https://unity.com';
htmlUrlChecker.enqueue('https://unity.com/solutions/film-animation-cinematics', baseUrl);

Expected behavior

{ url: { original: 'https://www.youtube.com/embed/urew479-Wlw', resolved: 'https://www.youtube.com/embed/urew479-Wlw', ......

with the link returning 200.

Environment:

Thank you for your help.

stevenvachon commented 5 years ago

If you disable JavaScript and look at the URLs for those links, they are just the YouTube IDs. They are, in fact, broken links.
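To illustrate why a bare ID ends up pointing at unity.com, here is a minimal sketch using the standard WHATWG `URL` class built into Node.js. It is not broken-link-checker's own code, and `resolveLink` is a hypothetical helper added for illustration; the sketch assumes the checker resolves relative links against the page URL the same way a browser would.

```javascript
// Page on which the links were found (taken from the report above).
const pageUrl = 'https://unity.com/solutions/film-animation-cinematics';

// Hypothetical helper: resolve an href against the page it appears on,
// using standard relative-URL resolution.
function resolveLink(href, base) {
  return new URL(href, base).href;
}

// A bare YouTube ID has no scheme or host, so it is treated as a relative path
// and replaces the last path segment of the page URL.
const bareId = resolveLink('urew479-Wlw', pageUrl);

// An absolute URL ignores the base entirely.
const fullEmbed = resolveLink('https://www.youtube.com/embed/urew479-Wlw', pageUrl);

console.log(bareId);    // https://unity.com/solutions/urew479-Wlw
console.log(fullEmbed); // https://www.youtube.com/embed/urew479-Wlw
```

The first result matches the `resolved` value in the report, which is consistent with the static HTML containing only the ID and client-side JavaScript building the embed URL later.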

casaran commented 5 years ago

You are right. Thank you for your answer.