pupuk / blog


A small HTTP protocol test #21

Open pupuk opened 5 years ago


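Recently, while analyzing captured traffic with Fiddler, I noticed that it can render the response content as hexadecimal with color highlighting, as shown here: [image]

The green part should be the HTTP response headers. I vaguely remember two points from articles I had read about the HTTP protocol: HTTP/1.1 does not compress headers, and HTTP/1.1 headers are ASCII text.

I set up a small experiment to verify this:

Select the green part and copy it in hexadecimal (0x) form. [image]

Process it with a text tool to turn it into a PHP array of strings. [image]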

<?php

$arr = ['48', '54', '54', '50', '2F', '31', '2E', '31', '20', '32', '30', '30', '20', '4F', '4B', '0D', '0A', '53', '65', '72', '76', '65', '72', '3A', '20', '6E', '67', '69', '6E', '78', '2F', '31', '2E', '31', '36', '2E', '30', '0D', '0A', '44', '61', '74', '65', '3A', '20', '54', '68', '75', '2C', '20', '32', '33', '20', '4D', '61', '79', '20', '32', '30', '31', '39', '20', '30', '39', '3A', '30', '36', '3A', '30', '38', '20', '47', '4D', '54', '0D', '0A', '43', '6F', '6E', '74', '65', '6E', '74', '2D', '54', '79', '70', '65', '3A', '20', '74', '65', '78', '74', '2F', '68', '74', '6D', '6C', '3B', '20', '63', '68', '61', '72', '73', '65', '74', '3D', '55', '54', '46', '2D', '38', '0D', '0A', '54', '72', '61', '6E', '73', '66', '65', '72', '2D', '45', '6E', '63', '6F', '64', '69', '6E', '67', '3A', '20', '63', '68', '75', '6E', '6B', '65', '64', '0D', '0A', '43', '6F', '6E', '6E', '65', '63', '74', '69', '6F', '6E', '3A', '20', '6B', '65', '65', '70', '2D', '61', '6C', '69', '76', '65', '0D', '0A', '56', '61', '72', '79', '3A', '20', '41', '63', '63', '65', '70', '74', '2D', '45', '6E', '63', '6F', '64', '69', '6E', '67', '0D', '0A', '58', '2D', '50', '6F', '77', '65', '72', '65', '64', '2D', '42', '79', '3A', '20', '50', '48', '50', '2F', '37', '2E', '32', '2E', '31', '38', '0D', '0A', '45', '78', '70', '69', '72', '65', '73', '3A', '20', '54', '68', '75', '2C', '20', '31', '39', '20', '4E', '6F', '76', '20', '31', '39', '38', '31', '20', '30', '38', '3A', '35', '32', '3A', '30', '30', '20', '47', '4D', '54', '0D', '0A', '43', '61', '63', '68', '65', '2D', '43', '6F', '6E', '74', '72', '6F', '6C', '3A', '20', '6E', '6F', '2D', '73', '74', '6F', '72', '65', '2C', '20', '6E', '6F', '2D', '63', '61', '63', '68', '65', '2C', '20', '6D', '75', '73', '74', '2D', '72', '65', '76', '61', '6C', '69', '64', '61', '74', '65', '0D', '0A', '50', '72', '61', '67', '6D', '61', '3A', '20', '6E', '6F', '2D', '63', '61', '63', '68', '65', '0D', '0A', '43', '6F', '6E', '74', '65', 
'6E', '74', '2D', '45', '6E', '63', '6F', '64', '69', '6E', '67', '3A', '20', '67', '7A', '69', '70', '0D', '0A', '0D', '0A'];

// Convert each hex pair to its byte value and print the corresponding character.
foreach ($arr as $hex) {
    echo chr(hexdec($hex));
}

The output is: [image]

It matches the headers as parsed by Chrome. [image]
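The ASCII claim can also be checked directly rather than by eye. The following is a minimal sketch, assuming only the first line of the capture is used as sample data; `hexPairsToBytes` and `isAscii` are hypothetical helper names introduced for illustration, not part of the original experiment.

```php
<?php

// Hypothetical helper: turn an array of two-digit hex strings into raw bytes.
function hexPairsToBytes(array $pairs): string
{
    return hex2bin(implode('', $pairs));
}

// Hypothetical helper: true when every byte is in the 7-bit ASCII range.
function isAscii(string $bytes): bool
{
    return preg_match('/^[\x00-\x7F]*$/D', $bytes) === 1;
}

// First line of the captured response, copied from the start of $arr.
$statusLine = hexPairsToBytes(['48', '54', '54', '50', '2F', '31', '2E', '31',
    '20', '32', '30', '30', '20', '4F', '4B', '0D', '0A']);

var_dump($statusLine === "HTTP/1.1 200 OK\r\n"); // bool(true)
var_dump(isAscii($statusLine));                  // bool(true)
```

Passing the full `$arr` to the same two helpers should give the same result for this capture, since every byte in it is below 0x80.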

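This shows that in an HTTP/1.1 response header, each line ends with "\r\n" (hex 0D 0A), and the header block ends with two consecutive CRLFs, "\r\n\r\n", which separate the header from the body.

I checked a number of JSON, HTML, and CSS requests, and in every case the header ended with 0x 0D 0A 0D 0A, which suggests that HTTP uses two \r\n sequences to separate the header from the body.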
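The observation above can be sketched in code: split the raw response at the first "\r\n\r\n", then split the header block on "\r\n". This is a rough illustration, not a complete parser; `splitResponse` is a hypothetical helper, the sample string is a made-up response shortened from the capture, and every header line is assumed to contain a colon.

```php
<?php

// Hypothetical helper: split a raw HTTP/1.1 response at the first empty line.
function splitResponse(string $raw): array
{
    // Limit 2 so that any "\r\n\r\n" inside the body is left untouched.
    $parts = explode("\r\n\r\n", $raw, 2);
    $headerBlock = $parts[0];
    $body = $parts[1] ?? '';

    $lines = explode("\r\n", $headerBlock);
    $statusLine = array_shift($lines);

    $headers = [];
    foreach ($lines as $line) {
        // Each header line is "name: value"; split on the first colon only,
        // because values such as dates contain colons themselves.
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower($name)] = trim($value);
    }

    return ['status' => $statusLine, 'headers' => $headers, 'body' => $body];
}

// Made-up sample, shortened from the captured response.
$raw = "HTTP/1.1 200 OK\r\n"
    . "Server: nginx/1.16.0\r\n"
    . "Date: Thu, 23 May 2019 09:06:08 GMT\r\n"
    . "Content-Type: text/html; charset=UTF-8\r\n"
    . "\r\n"
    . "abc";

$res = splitResponse($raw);
echo $res['status'], "\n";
echo $res['headers']['date'], "\n";
echo $res['body'], "\n";
```

Header names are lowercased here because HTTP header names are case-insensitive; that is a design choice of this sketch, not something the capture requires.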

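Next, let's look at the body. cat a.html [image]

The file content (abc蒲) is actually followed by a newline (\n). The vim editor added it automatically, because I created a.html with vim on Linux.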

file a.html
a.html: UTF-8 Unicode text

The body part of the HTTP/1.1 response: [image]

The black bytes at the end, 61 62 63 E8 92 B2 0A, are abc蒲\n in UTF-8 encoding. [image]
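The byte sequence can be decoded the same way as the header. A small sketch, using only the seven bytes quoted above; `\u{84B2}` is the code point of 蒲, written as an escape so the result does not depend on the encoding of the source file.

```php
<?php

// Body bytes from the capture: 61 62 63 E8 92 B2 0A.
$body = hex2bin(str_replace(' ', '', '61 62 63 E8 92 B2 0A'));

var_dump($body === "abc\u{84B2}\n");      // bool(true)
var_dump(strlen($body));                   // int(7): 3 ASCII bytes + 3 bytes for 蒲 + 1 for \n
var_dump(preg_match('//u', $body) === 1);  // bool(true): the bytes are valid UTF-8
```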

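It also shows that the body needs no trailing line terminator (\n, \r\n, or \r) to mark its end; the final 0A here is part of the file itself.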
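Since no terminator marks the end of the body, the receiver has to learn its length some other way. The first capture in this post carries Transfer-Encoding: chunked, where each chunk is prefixed by its size in hexadecimal and a zero-size chunk ends the body. The following is a simplified sketch of that framing, assuming well-formed input and ignoring trailers; `decodeChunked` is a hypothetical helper and the sample string is made up, not taken from the capture.

```php
<?php

// Hypothetical helper: decode a chunked body (trailers are ignored).
function decodeChunked(string $data): string
{
    $out = '';
    $pos = 0;
    while (true) {
        $eol = strpos($data, "\r\n", $pos);
        if ($eol === false) {
            throw new RuntimeException('missing chunk-size line');
        }
        // The chunk size is hexadecimal; drop any ";extension" suffix.
        $sizeField = explode(';', substr($data, $pos, $eol - $pos), 2)[0];
        $size = hexdec(trim($sizeField));
        if ($size === 0) {
            break; // a zero-size chunk marks the end of the body
        }
        $out .= substr($data, $eol + 2, $size);
        $pos = $eol + 2 + $size + 2; // skip the chunk data and its trailing CRLF
    }
    return $out;
}

// Made-up sample: one 7-byte chunk holding "abc蒲\n", then the final chunk.
$chunked = "7\r\nabc\u{84B2}\n\r\n0\r\n\r\n";
var_dump(decodeChunked($chunked) === "abc\u{84B2}\n"); // bool(true)
```

The decoder skips ahead by the declared size instead of searching for a delimiter, which is why a \n or \r\n inside the chunk data causes no ambiguity.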

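Questions & thoughts

The experiments above show that in HTTP/1.1 the response header and body are separated by two \r\n sequences, that is, two CRLFs. I find this approach sensible and compact: the header consists almost entirely of name:value pairs, and the values themselves do not contain CRLF. To confirm this, I looked up the HTTP/1.1 RFC: [image] The link for the image: https://www.w3.org/Protocols/rfc2616/rfc2616-sec6.html#sec6 It is indeed specified this way.

While searching, I also found a good article: http://www.cleantutorials.com/html/format-of-http-request-response-header-and-body-with-example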