masperro / httplib2

Automatically exported from code.google.com/p/httplib2

httplib2 response objects do not preserve semantics of headers #144

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
See RFC 2616 sec 4.2:

   Multiple message-header fields with the same field-name MAY be
   present in a message if and only if the entire field-value for that
   header field is defined as a comma-separated list [i.e., #(values)].
   It MUST be possible to combine the multiple header fields into one
   "field-name: field-value" pair, without changing the semantics of the
   message, by appending each subsequent field-value to the first, each
   separated by a comma. The order in which header fields with the same
   field-name are received is therefore significant to the
   interpretation of the combined field value, and thus a proxy MUST NOT
   change the order of these field values when a message is forwarded.

For several headers (Set-Cookie, User-Agent, etc.), the second requirement is 
not satisfied: it is *not* possible to combine the header fields with commas 
without changing the semantics of the message.
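
A minimal Python 3 sketch of the problem, assuming two shortened cookie values modeled on the ones in this report (the variable names are illustrative only, and none of this is httplib2 code). The `expires` attribute itself contains a comma, so a comma-joined value can no longer be split back reliably:

```python
# Two Set-Cookie field values, as they would arrive in two separate header lines.
cookies = [
    "IU=deleted; expires=Wed, 21-Apr-2010 00:55:44 GMT; path=/; domain=.yahoo.com",
    "PH=deleted; expires=Wed, 21-Apr-2010 00:55:44 GMT; path=/; domain=.yahoo.com",
]

# RFC 2616 sec 4.2 style merge: append each value to the first, comma-separated.
merged = ", ".join(cookies)

# A naive split on the separator also cuts through each "expires" date.
pieces = merged.split(", ")

print(len(cookies))       # number of cookies actually sent
print(len(pieces))        # number of pieces recovered from the merged value
print(pieces == cookies)  # whether the round trip preserved the cookies
```

A smarter splitter could special-case dates, but that is a heuristic per header rather than something the merged value guarantees.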

The response object returned by httplib2.Http.request seems to provide headers 
only in a dict, and thus always merges repeated headers with commas. This 
changes the semantics of some fields and generally makes the fields 
hard/impossible to parse back out - in this case, set-cookie becomes much 
harder to parse:

In [30]: r=p.urlopen('GET','http://yahoo.com',assert_same_host=0)
In [31]: r.headers
Out[31]:
{'age': '0',
 'cache-control': 'private',
 'content-type': 'text/html;charset=utf-8',
 'date': 'Thu, 21 Apr 2011 00:55:45 GMT',
 'p3p': 'policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"',
 'proxy-connection': 'close',
 'server': 'YTS/1.20.0',
 'set-cookie': 'IU=deleted; expires=Wed, 21-Apr-2010 00:55:44 GMT; path=/; domain=.yahoo.com, PH=deleted; expires=Wed, 21-Apr-2010 00:55:44 GMT; path=/; domain=.yahoo.com, fpc=d=pi.1EKzrr7h.qnCL0GSYQsbWNMzCbALbMJpX8sNSrkPO0_V2C2L_4JEBGdEaCcaSHatBI5sd38oe8CPxkaQL_DArs5g2980Do4VsUvWb3zZ7uXRpAGoAjZb_dyDUEEh6DSw9AeZzv4NHMLBtbNnJscP5Igo2BEeC7Ap_37CndycK6mnzz0WF3cS1jwl9m7hbhVY6TNg-&v=2; expires=Fri, 20-Apr-2012 00:55:45 GMT; path=/; domain=www.yahoo.com, FPCK3=AgBNr4AQAHcwEABSYBAAbiAQAE46EAB/IxAASGY=; expires=Sat, 21-May-2011 00:55:45 GMT; path=/; domain=www.yahoo.com, CH=deleted; expires=Wed, 21-Apr-2010 00:55:44 GMT; path=/; domain=www.yahoo.com, CH=AgBNr4AQABzYEAAMixAAApEQAA8UEAAT6BAABQ8QACjJEAAlahAAJpkQACVn; expires=Sat, 21-May-2011 00:55:45 GMT; path=/; domain=.yahoo.com, fpt=d=BGQHiMbXetYNKhaFcjniISJw0DcYyBBG3kuTHNqDWoJ3CNvioV9lRXNwJrb7WTiXWtUnUsYPhalpO1z7LRfGISw9AZFvTxjQ6iOSyNPpkw2dyjkmUtvZt2Hcv4wdw24EQraV0yOy8dy3Ur.DRiBINQ52DAHJTJE7zluLZ_cywhyjq5gsM4nBBS7myYNbkwRlMTTEAA1QJFHHgEk5tTxF2wUgiydcRDzrYF2v5U4a1CwESQIhKNYeOMVR_EKe691WaE.0GbISikrwBNC3tUZSZCk6ovUpODiA6x0qhjzH9fs3K1.072ypzjh15gP0S_9w4IqV1qZGYxAescnaQx5vH4JBX2xTVkTJEEDCuVPZW4dLrgC4B0hBks86sWKvRLFNL4bj.n7p9bYJqjaGKWcb6pmTM4oWqB4YSrOrHGPahZLjo2ZUuaFKEo00.G4atk8.7Ltc1OycZ9AwFXVqV62YzbQ6Kt._VkiDrzQpU2EcMkjpROSP&v=1; path=/; domain=www.yahoo.com, fpps=deleted; expires=Wed, 21-Apr-2010 00:55:44 GMT; path=/; domain=www.yahoo.com, fpc_s=d=0KHyRJTrr7hdSxpVIDT9gziU5OhZ2KbR8Z9Ph0fu27ytTgFZTcO499Adn4.3QKOA6bShUa.OV.lm1zh28Lwspy6DpbaMgJh.gYu7OmzQKdBZ8fRFJVX1ZrpK2IcliwPkPUrGrYDo4T3Emc_fiv6kZ_.5M31dwMBgWi5Ki3WArfwNDXUbJivmrJwglcUucTQOSsJsujCrmeZjU5W4K6_tyaxUDOrqLQVND8tsaMTlvkCTKMRmfpp5v2gorSXt3ncgEyaVlIUoz1vkOxsjwwTimQ4kenwwbrYk4KMUstVP7peV.v_UUMldL6Waw1w-&v=2; path=/; domain=www.yahoo.com',
 'vary': 'Accept-Encoding',
 'via': '1.0 localhost (squid/3.0.STABLE19)',
 'x-cache': 'MISS from localhost',
 'x-cache-lookup': 'MISS from localhost:3128'}

$ wget --save-headers -O- http://yahoo.com|less
--2011-04-20 17:56:14--  http://yahoo.com/
Resolving yahoo.com... 209.191.122.70, 67.195.160.76, 69.147.125.65, ...
Connecting to yahoo.com|209.191.122.70|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.yahoo.com/ [following]
--2011-04-20 17:56:14--  http://www.yahoo.com/
Resolving www.yahoo.com... 72.30.2.43, 98.137.149.56
Connecting to www.yahoo.com|72.30.2.43|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `STDOUT'

    [<=>                                    ] 0           --.-K/s
HTTP/1.0 200 OK
Date: Thu, 21 Apr 2011 00:56:14 GMT
P3P: policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV 
TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL 
UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"
Cache-Control: private
Set-Cookie: IU=deleted; expires=Wed, 21-Apr-2010 00:56:13 GMT; path=/; 
domain=.yahoo.com
Set-Cookie: PH=deleted; expires=Wed, 21-Apr-2010 00:56:13 GMT; path=/; 
domain=.yahoo.com
Set-Cookie: 
fpc=d=vNz4dirUr7jkvAEhKO_kPIb20d9R08Yek91QTSyy18y0uhNaV.q8bD87mX6UqpEAaYBRaqzeNJ
vbljaO9WKbhdd0qNcJOAKFxad9AWhWMeANopea03ZKCRTirfJWt6dRnQIeiOLg5KuT4caKM8aRZoBFI.
fxli1sPvO9qDYqKdFSXhn_Wh9j4ymfFcJiyBNmQjcPhXY-&v=2; expires=Fri, 20-Apr-2012 
00:56:14 GMT; path=/; domain=www.yahoo.com
Set-Cookie: FPCK3=AgBNr4AQAHcwEABSYBAAbiAQAE46EAB/IxAASGY=; expires=Sat, 
21-May-2011 00:56:14 GMT; path=/; domain=www.yahoo.com
Set-Cookie: CH=deleted; expires=Wed, 21-Apr-2010 00:56:13 GMT; path=/; 
domain=www.yahoo.com
Set-Cookie: CH=AgBNr4AQABzYEAAMixAAApEQAA8UEAAT6BAABQ8QACjJEAAlahAAJpkQACVn; 
expires=Sat, 21-May-2011 00:56:14 GMT; path=/; domain=.yahoo.com
Set-Cookie: 
fpt=d=htnUbyfXetYLf7Tj.Nu0_yIxlRs5shCcHCtmlzcPp9AKg6jfrTha7t7VBp169CBAmn2exyQLzs
oC1Wdef3lvIN_nKFzFUqW9lUK2n9iLYzjnxAQvzgq9b6vQDKIsQZOT4SmowTUQIQvAKH.p9BbtzrCj4G
i7CQUhd0CvD6q.g410kks7RkO65gZsEox_.x9aqrF5emP2rc1ipGSdOtJ_Tr8X6wlaz9QQivs7JmM3ZH
o6mCq59ZteA0PGGuRu4wIjXyCroiFmF3H4MYaoH5IVALvzkMMOkB.zbwk6myOJF5wIhF6QQUNR48CExj
y.pkc07Zk33gqf1NJtIZDUxwv30Ys.mUoaMKKNRkaj4CzSaYfP0NCNWKz.ZpQuHQPAPMBxoAOOn.VizP
YZuBlTWGrKVqrKyk7uv3y6k61yt_7lEAWwpDHiF1vfn94ErIr_1wjRO3BK0mvUZrwnnXuEi4Mi4KPWXz
gDAwnTQr4_E6JYToZ8lTnZ&v=1; path=/; domain=www.yahoo.com
Set-Cookie: fpps=deleted; expires=Wed, 21-Apr-2010 00:56:13 GMT; path=/; 
domain=www.yahoo.com
Set-Cookie: 
fpc_s=d=z.EgnVLUr7iUlAyYSD5Vs_EiTl6PAuMK.HsqUHJoRluJOuPICjp9O4s3hhqdS7IGQUL6EDRk
b_5nKNmdd7qvchlF_pFSIoBNx7chM5VMblirJBAZJo0kBrR7vO08d84rqQ7IBMHU_a_Beh4UmvH4Cd07
EgUo1f0GWu.hUm7tTYEAc.hYD2CVrlUpu_EPk18dOd0SLblSzsnfFWbCufbVeDsAwAoayQMS3tJKQ_s4
B4bzwdj6zhapIc036D_M._gD6E9ytjmejwpDvnza6A5MlOgAPuNrp5ze0agrroVoeEZTNn3n_77UW2.i
6j8-&v=2; path=/; domain=www.yahoo.com
Vary: Accept-Encoding
Content-Type: text/html;charset=utf-8
Age: 0
Connection: close
Server: YTS/1.20.0
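
For comparison, a hedged sketch of what keeping the repeated fields separate could look like. It uses only the Python 3 standard library `email.parser`, not httplib2; the header text is shortened from the wget output above, and the `merged` dict merely imitates the comma-joining behaviour described in this issue:

```python
from email.parser import Parser

# Shortened header block based on the wget output in this report.
raw = (
    "Cache-Control: private\r\n"
    "Set-Cookie: IU=deleted; expires=Wed, 21-Apr-2010 00:56:13 GMT; path=/; domain=.yahoo.com\r\n"
    "Set-Cookie: PH=deleted; expires=Wed, 21-Apr-2010 00:56:13 GMT; path=/; domain=.yahoo.com\r\n"
    "Vary: Accept-Encoding\r\n"
    "\r\n"
)

msg = Parser().parsestr(raw, headersonly=True)

# get_all() returns every value for a field name, in the order received.
separate = msg.get_all("Set-Cookie")

# Imitation of a dict-based response: repeated fields are joined with ", ".
merged = {}
for name, value in msg.items():
    key = name.lower()
    merged[key] = merged[key] + ", " + value if key in merged else value

for value in separate:
    print("separate:", value)
print("merged:  ", merged["set-cookie"])
```

With a list per field name, the caller can still join list-valued headers such as Vary with commas when that is wanted, while Set-Cookie values stay intact.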

Original issue reported on code.google.com by yaa...@gmail.com on 26 Apr 2011 at 6:42

GoogleCodeExporter commented 8 years ago

Original comment by joe.gregorio@gmail.com on 6 Jun 2011 at 8:17