dentarg / hubot-url-title

:crocodile: Returns the title when a link is posted
https://www.npmjs.com/package/hubot-url-title

Can't find title when <head> isn't present #18

Open pchaigno opened 8 years ago

pchaigno commented 8 years ago

Hubot does not respond to the following URL: https://jobs.lever.co/github/c77dd56c-5c28-4a78-a100-52497c1ea10e. I haven't had time to investigate yet.

dentarg commented 8 years ago

hubot-url-title only prints the title if we get an HTTP 200 response, and that URL gives me a 404 now.

You can re-open if you have another URL that returns 200 and the title isn't displayed :)

pchaigno commented 8 years ago

There's the same issue with any page on that website actually.

dentarg commented 8 years ago

@pchaigno any example URL?

pchaigno commented 8 years ago

https://jobs.lever.co/github/

dentarg commented 8 years ago

https://jobs.lever.co/github/

Looks like that page doesn't have any <head> tag. The document starts with <!DOCTYPE html><meta charset="utf-8"><title>GitHub</title><style>@font-face ...

pchaigno commented 8 years ago

Should we support that? I'm not sure how common that might be...

dentarg commented 8 years ago

Was about to ask that... :)

I can add that I get the same result from BeautifulSoup for Python: no title is found.

dentarg commented 8 years ago

But our code looks more flexible: at https://github.com/dentarg/hubot-url-title/blob/v1.6.0/src/scripts/hubot-url-title.coffee#L35 we have document('head title') ..., so maybe we can support this easily?
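
To make the idea concrete, here is a minimal sketch of a title lookup that does not depend on a <head> element being present. It is not the plugin's code: it uses Python's standard-library html.parser rather than the CoffeeScript in the repository, and the names TitleFinder and find_title are made up for illustration. In the plugin itself, the analogous change would presumably be relaxing the 'head title' selector to just 'title', but that is an assumption and untested here.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Collects the text of the first <title> element, wherever it appears.

    Hypothetical helper for illustration; not part of hubot-url-title.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title = None       # stays None until a <title> element closes
        self._in_title = False  # True while we are inside the first <title>
        self._chunks = []       # text fragments seen inside that <title>

    def handle_starttag(self, tag, attrs):
        # Only the first <title> counts; later ones (e.g. in inline SVG) are ignored.
        if tag == "title" and self.title is None:
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.title = "".join(self._chunks).strip()


def find_title(html):
    """Return the document title, or None if there is no <title> element."""
    finder = TitleFinder()
    finder.feed(html)
    finder.close()
    return finder.title
```

Because this works on the tag stream instead of a 'head title' path, a document that opens with <!DOCTYPE html><meta charset="utf-8"><title>GitHub</title> yields "GitHub", and a page with a regular <head> keeps working as before.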