spulec / uncurl

A library to convert curl requests to python-requests.
Apache License 2.0

parse_context method does not capture all cookies #28

Closed: vaseem-khan closed this issue 5 years ago

vaseem-khan commented 5 years ago
curl_string  = 'curl "https://www.invesco.co.uk/uk/products/perpetual-income-and-growth-investment-trust-plc" -H "authority: www.invesco.co.uk" -H "cache-control: max-age=0" -H "upgrade-insecure-requests: 1" -H "user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36" -H "accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3" -H "accept-encoding: gzip, deflate, br" -H "accept-language: en-US,en;q=0.9" -H "cookie: VISITOR=returning; NEW_VISITOR=new; InvescoAcceptedTerms=true; InvescoAcceptedCookies=true; InvescoAcceptedCookieNotice=^\^"2019-07-19T02:35:05.258-05:00^\^"; nivid=iXYdKKgc2SX-bgohul3; _ga=GA1.3.2111614667.1563521709; visitor_id481331=117043127; visitor_id481331-hash=4fb6817866eba661fda08e792283ea7febdf047dbd6c66ece3c9362ef12510f34976c846ce3c80c14ec4e4f84861aae4538705fe; LPVID=UxMGU4YjI0YTJlYjM3NDUy; InvescoUserCookieName=investor; JSESSIONID=FB36DBB7399523ECE3F5A36C0914B22B; nisid=1; _gid=GA1.3.1472917646.1563778854; LPSID-32683207=tGw2wBB3TCOU3UTW6ZgjQw; _pk_ses.7.b8b7=*; _pk_id.7.b8b7=f78f951728c734ee.1563521711.3.1563779098.1563778856." --compressed'

import uncurl
request_context = uncurl.parse_context(curl_string)

In [16]: request_context.cookies
Out[16]: 
OrderedDict([('VISITOR', 'returning'),
             ('NEW_VISITOR', 'new'),
             ('InvescoAcceptedCookies', 'true'),
             ('InvescoAcceptedTerms', 'true')])

In the snippet above, the parse_context method of this module misses many of the cookies. The curl string contains several others, such as InvescoAcceptedCookieNotice. You can paste the command into https://curl.trillworks.com to see the full list of cookies.

vaseem-khan commented 5 years ago

I debugged the library and found that Python's http.cookies module has trouble parsing the cookies because of an extra " in one of the cookie values. Marking this as closed.
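
For anyone who hits the same problem, here is a minimal sketch of a more lenient way to split a Cookie header. It is not part of uncurl: the helper name split_cookie_header and the shortened header string are assumptions made up for illustration. The idea is to split on ";" and on the first "=" by hand, so a stray " inside a value does not stop the parse. The call to http.cookies.SimpleCookie is included only for comparison; how much it parses before giving up on the malformed value may vary between Python versions.

```python
from http.cookies import SimpleCookie


def split_cookie_header(header: str) -> dict:
    """Split a Cookie header into name/value pairs without validating values.

    Hypothetical helper (not part of uncurl): it splits on ';' and then on
    the first '=', so unusual characters such as a stray '"' are kept in
    the value instead of aborting the parse.
    """
    cookies = {}
    for pair in header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name:  # skip empty fragments and entries without '='
            cookies[name] = value
    return cookies


if __name__ == "__main__":
    # Shortened, made-up header with a stray '"' in the middle of a value.
    header = 'VISITOR=returning; Notice=^"2019-07-19T02:35:05^"; nisid=1'

    # Strict standard-library parser, shown for comparison only.
    print(sorted(SimpleCookie(header).keys()))

    # Lenient manual split keeps every name=value pair.
    print(split_cookie_header(header))
```

The trade-off is that this approach does no validation or unquoting at all, so it suits cases like converting a copied curl command, where the goal is to replay the header as sent rather than to check it.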