Closed by macmartine 5 years ago
That page returns only a minimal HTML document that embeds an iframe, which is why you don't get any data. Apparently it expects a session cookie before it returns the full HTML that you see in the browser:
> curl http://www.ama-pdx.org/event/virtual-reality-marketing-strategy/
Returns:
<html style="height:100%">
<head>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<meta name="format-detection" content="telephone=no">
<meta name="viewport" content="initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
</head>
<body style="margin:0px;height:100%">
<iframe src="/_Incapsula_Resource?SWUDNSAI=9&xinfo=9-67653559-0%200NNN%20RT%281552353546321%200%29%20q%280%20-1%20-1%20-1%29%20r%280%20-1%29%20B12%284%2c315%2c0%29%20U19&incident_id=1241000040126355380-254073862832586825&edet=12&cinfo=04000000" frameborder=0 width="100%" height="100%" marginheight="0px" marginwidth="0px">Request unsuccessful. Incapsula incident ID: 1241000040126355380-254073862832586825</iframe>
</body>
</html>
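A caller could recognize this kind of response before trying to read a title from it. The following is a minimal sketch in plain Ruby, not part of the gem: the helper name `incapsula_challenge?` is hypothetical, and the two marker patterns are assumptions taken only from the response shown above, so other blocked responses may look different.

```ruby
# Patterns taken from the blocked response above (assumed, not exhaustive).
INCAPSULA_MARKERS = [
  %r{<iframe[^>]+src=["']/_Incapsula_Resource\?}i,
  /Incapsula incident ID/i
].freeze

# Hypothetical helper: returns true when the HTML looks like the
# Incapsula interstitial rather than the real page.
def incapsula_challenge?(html)
  return false if html.nil? || html.empty?

  INCAPSULA_MARKERS.any? { |marker| html.match?(marker) }
end

blocked = <<~HTML
  <html><body>
  <iframe src="/_Incapsula_Resource?SWUDNSAI=9&incident_id=123">Request unsuccessful. Incapsula incident ID: 123</iframe>
  </body></html>
HTML

puts incapsula_challenge?(blocked)
puts incapsula_challenge?("<html><head><title>Event</title></head></html>")
```

With a check like this, an empty title can be reported as "blocked by the site" instead of "page has no title".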
Thanks for the report @macmartine and for the explanation @jschwindt
@jschwindt That's not what I see when I view source.
That's because the browser is sending the cookie that the site expects. For example, this request responds with the whole HTML because I copied the cookie from Chrome:
curl 'http://www.ama-pdx.org/event/virtual-reality-marketing-strategy/' \
  -H 'Cookie: incap_ses_1241_1673620=NSFZCcLSdCLIIrse9uo4ER3xh1wAAAAA+xcMOna9Ra7KLStuHXuFWA=='
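The same request can be sketched in Ruby with the standard library's `Net::HTTP`. The helper names `build_cookie_request` and `fetch_with_cookie` are hypothetical, and the cookie value in the usage note is a placeholder. Session cookies expire, and the site's protection may check more than the cookie, so this is an illustration of the mechanism rather than a dependable workaround.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: builds (but does not send) a GET request that
# carries the given Cookie header.
def build_cookie_request(url, cookie)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri)
  request["Cookie"] = cookie
  [uri, request]
end

# Hypothetical helper: sends the request and returns the response body.
def fetch_with_cookie(url, cookie)
  uri, request = build_cookie_request(url, cookie)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request).body
  end
end
```

Usage would look like `fetch_with_cookie("http://www.ama-pdx.org/event/virtual-reality-marketing-strategy/", "incap_ses_...=...")`, with the cookie copied from a current browser session.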
Okay, thanks. How does Slack unfurl it then?
The site has some kind of "protection" against robots (I'm guessing), and perhaps Slack knows a way around it. This is what I saw the first time I tried to browse the page:
This page has a title tag and an og:title tag, and the gem still returns no titles:
http://www.ama-pdx.org/event/virtual-reality-marketing-strategy/
[4] pry(UrlAdder)> page.url
=> "http://www.ama-pdx.org/event/virtual-reality-marketing-strategy/"
[5] pry(UrlAdder)> page.title
=> ""
[6] pry(UrlAdder)> page.best_title
=> nil
[7] pry(UrlAdder)> page.description
=> nil