kapouer / url-inspector

Get metadata about any url
MIT License

Inconsistent results on some urls #7

Closed manubb closed 8 years ago

manubb commented 8 years ago

url-inspector reports incorrect data for some URLs. For example, ./url-inspector.js http://www.lavieenbois.com/ does not find any title and returns:

{
  "url": "http://www.lavieenbois.com/",
  "mime": "text/html; charset=ISO-8859-1",
  "type": "link",
  "size": 4070,
  "icon": "http://www.lavieenbois.com/favicon.ico",
  "site": "www.lavieenbois.com",
  "ext": "html",
  "html": "<a href=\"http://www.lavieenbois.com/\">undefined</a>"
}

and ./url-inspector.js https://myspace.com/unefemmemariee/music/songs returns:

{
  "url": "https://myspace.com/unefemmemariee/music/songs",
  "mime": "text/html; charset=utf-8",
  "type": "audio",
  "size": 108104,
  "title": "UNE FEMME MARIÉE",
  "icon": "https://x.myspacecdn.com/new/common/images/favicons/favicon.ico",
  "thumbnail": "https://a2-images.myspacecdn.com/images03/31/31cc9883f6e14a18b96e4ea5a8f82a83/600x600.jpg",
  "site": "myspace",
  "ext": "html",
  "html": "<audio src=\"https://myspace.com/unefemmemariee/music/songs\"></audio>"
}

Here type is "audio" even though mime is "text/html".

kapouer commented 8 years ago

The first one is a bug (fixed, pending publish). The second one shows a problem but not where you expected it. Compare to this:

url-inspector https://www.youtube.com/watch?v=CtP8VABF5pk
{
  "url": "https://www.youtube.com/watch?v=CtP8VABF5pk",
  "mime": "text/html; charset=utf-8",
  "type": "video",
  "title": "Kutiman - Thru You Too - NO ONE IN THIS WORLD",
  "duration": "00:04:22",
  "site": "youtube",
  "embed": "https://www.youtube.com/embed/CtP8VABF5pk",
  "thumbnail": "https://i.ytimg.com/vi/CtP8VABF5pk/maxresdefault.jpg",
  "icon": "https://s.ytimg.com/yts/img/favicon-vflz7uhzw.ico",
  "width": "1280",
  "height": "720",
  "ext": "html",
  "html": "<iframe src=\"https://www.youtube.com/embed/CtP8VABF5pk\"></iframe>"
}

In both cases the type is audio or video while the mime type is HTML: this means the URL is an embeddable HTML object of the given type. In the audio case, the bug is in the "html" snippet, which should be an iframe rather than an audio tag.
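
To make the rule above concrete, here is a minimal sketch of how the "html" snippet could be chosen from type and mime. It is not url-inspector's actual code: renderSnippet and escapeAttr are hypothetical names, and falling back to the URL when no title is found is an assumption, not necessarily what the fix for the first bug does.

```javascript
// Hypothetical helpers for illustration only; not url-inspector's implementation.

// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Pick the "html" snippet from the metadata fields shown in this thread.
function renderSnippet(meta) {
  const isHtmlPage = /^text\/html\b/i.test(meta.mime || "");
  const isMedia = meta.type === "audio" || meta.type === "video";

  if (isMedia && isHtmlPage) {
    // The URL is an HTML page that plays media, so embed the page itself.
    // Prefer a dedicated embed URL when one was found.
    const src = meta.embed || meta.url;
    return `<iframe src="${escapeAttr(src)}"></iframe>`;
  }
  if (isMedia) {
    // The URL points at an actual media file: a media tag is appropriate.
    return `<${meta.type} src="${escapeAttr(meta.url)}"></${meta.type}>`;
  }
  // Plain link. Assumption: use the URL as link text when the title is missing,
  // so the snippet never contains the string "undefined".
  const label = escapeAttr(meta.title || meta.url);
  return `<a href="${escapeAttr(meta.url)}">${label}</a>`;
}
```

With this sketch, the myspace result (type "audio", mime "text/html; charset=utf-8", no embed) would get an iframe pointing at the page URL, while a direct link to an audio file served as audio/mpeg would still get an audio tag.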

manubb commented 8 years ago

Great. Thanks.

kapouer commented 8 years ago

Fixed in url-inspector@1.4.9