jshttp / type-is

Infer the content-type of a request.
MIT License

typeis and html #30

Closed atrhacker closed 7 years ago

atrhacker commented 7 years ago

Hello,

I ran into the following issue with your great library, and I'm wondering whether it could be considered a potential enhancement.

Consider the following link, and potentially every other link on this website: http://www.npr.org/2017/06/19/532916572/maudie-paints-intimate-portrait-of-canadian-painter-maud-lewis

When you call typeis(contentTypeHeader, ['html', 'xhtml']), it returns false.

Why? The header returned is the following: Content-Type:text/html;;charset=UTF-8

It looks like a normal header at first glance, but as you can see there is an extra ; in it. So who is wrong? Can we consider the header definitely malformed (it is still a big website), or is the library not handling this case properly?
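
To make the extra separator visible, here is a minimal sketch in plain JavaScript. It does not call type-is at all; `dropEmptyParams` is a hypothetical helper name used only for illustration, and it assumes that no quoted parameter value contains a ; character.

```javascript
// Hypothetical helper, not part of type-is: removes empty parameter
// segments (such as the one produced by ";;") from a Content-Type value.
// Assumption: no quoted parameter value contains a ";" character,
// so a naive split on ";" is good enough for this sketch.
function dropEmptyParams (header) {
  return header
    .split(';')
    .map(function (part) { return part.trim() })
    .filter(function (part) { return part.length > 0 })
    .join('; ')
}

var raw = 'text/html;;charset=UTF-8'

// Splitting on ";" shows the problem: the middle segment is an empty string.
console.log(raw.split(';'))

// The cleaned value has exactly one separator before the charset parameter.
console.log(dropEmptyParams(raw))
```

A caller who needs to tolerate such headers could clean the value this way before handing it to a stricter parser, at the cost of accepting input that is not valid per the specification.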

Thanks in advance for your feedback and have a great day,

Ansekme

dougwilson commented 7 years ago

Hi @atrhacker, I'm not sure what you mean. When I requested the headers for that page, the server returned text/html; charset=UTF-8, as shown here:

$ curl -sI http://www.npr.org/2017/06/19/532916572/maudie-paints-intimate-portrait-of-canadian-painter-maud-lewis
HTTP/1.1 404 Not Found
Server: Apache
X-Powered-By: PHP/5.6.30
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Content-Type: text/html; charset=UTF-8
Content-Length: 0
Cache-Control: max-age=0
Expires: Sat, 24 Jun 2017 01:27:03 GMT
Date: Sat, 24 Jun 2017 01:27:03 GMT
Connection: keep-alive

Regarding the header you posted above, we just parse according to the HTTP specification, and its media-type grammar requires a parameter after each semicolon: https://tools.ietf.org/html/rfc7231#section-3.1.1.1
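
As a rough illustration of that grammar, the sketch below builds a regular expression from the media-type rule in RFC 7231 section 3.1.1.1 (type "/" subtype, followed by zero or more "; parameter" pairs). This is not the code type-is uses internally; `isValidMediaType` is a hypothetical name, and the quoted-string pattern is a simplification of the RFC rule.

```javascript
// token = 1*tchar (RFC 7230 section 3.2.6)
var TOKEN = "[!#$%&'*+\\-.^_`|~0-9A-Za-z]+"

// Simplified quoted-string: any characters except an unescaped double quote.
// Assumption: this is looser than the RFC's qdtext / quoted-pair rules.
var QUOTED = '"(?:[^"\\\\]|\\\\.)*"'

// One parameter: OWS ";" OWS token "=" ( token / quoted-string )
var PARAM = '[ \\t]*;[ \\t]*' + TOKEN + '=(?:' + TOKEN + '|' + QUOTED + ')'

// media-type = type "/" subtype *( OWS ";" OWS parameter )
var MEDIA_TYPE = new RegExp('^' + TOKEN + '/' + TOKEN + '(?:' + PARAM + ')*$')

// Hypothetical helper: true when the value matches the grammar above.
function isValidMediaType (value) {
  return MEDIA_TYPE.test(value)
}

console.log(isValidMediaType('text/html; charset=UTF-8')) // true
console.log(isValidMediaType('text/html;;charset=UTF-8')) // false
```

The second value fails because the first ; is followed by another ; rather than by a parameter name, which the grammar does not permit.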

dougwilson commented 7 years ago

Closing since I never heard back.