httpie / cli

šŸ„§ HTTPie CLI ā€” modern, user-friendly command-line HTTP client for the API era. JSON support, colors, sessions, downloads, plugins & more.
https://httpie.io
BSD 3-Clause "New" or "Revised" License
33.88k stars 3.68k forks source link

Cookies not correctly updated within a --follow redirect chain #662

Closed northys closed 5 months ago

northys commented 6 years ago

Hello,

when site creates new session for user, the PHPSESSID is not overrided but both the old and the new one is sent. Please notice Cookie header in last request. I'm using httpie version 0.9.8.

Ā» http --verbose --session=/tmp/yamaha.json --form --follow POST https://www.yamaha-extranet.com/login/index email=foo@bar.baz password=bar submitform=Submit
POST /login/index HTTP/1.1
Cookie: PHPSESSID=mdslhb7u0giujsaf8itq2gm2p0
Host: www.yamaha-extranet.com
User-Agent: HTTPie/0.9.8

email=****&submitform=Submit

HTTP/1.1 302 Found
Location: /
Set-Cookie: PHPSESSID=nb7qnhkfjsdtpe8gtj797koaq7; path=/; domain=www.yamaha-extranet.com; secure
X-Powered-By: PHP/5.5.38

GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: PHPSESSID=mdslhb7u0giujsaf8itq2gm2p0; PHPSESSID=nb7qnhkfjsdtpe8gtj797koaq7
Host: www.yamaha-extranet.com
User-Agent: HTTPie/0.9.8

Session file contains only one PHPSESSID though.

{
    "__meta__": {
        "about": "HTTPie session file",
        "help": "https://httpie.org/docs#sessions",
        "httpie": "0.9.8"
    },
    "auth": {
        "password": null,
        "type": null,
        "username": null
    },
    "cookies": {
        "PHPSESSID": {
            "expires": null,
            "path": "/",
            "secure": true,
            "value": "nb7qnhkfjsdtpe8gtj797koaq7"
        }
    },
    "headers": {}
}
asifmallik commented 4 years ago

Is this issue up for grab? If so, I would like to give this an attempt

jkbrzt commented 4 years ago

@asifmallik yes, it is!

Here you can learn all about how sessions in HTTPie work:

Iā€˜d definitely start by writing a few test cases reproducing the issue.

My hunch is that the cookie assignment and saving of the session needs to happen after each individual request in the chain inside the loop; not only at the end this like we currently do:

https://github.com/jakubroztocil/httpie/blob/7ee519ef46ed0a4280f431d0c14745ed8244621f/httpie/client.py#L110-L113

asifmallik commented 4 years ago

Thank you so much! I have looked into the test suite and there's a quick question I had about it - from what I understand, for testing, httpie uses httpbin. I imagine to write a test for this issue in particular, there would need to be some way of combining /redirect-to with /cookies/set. From what I read so far from httpbin.org, this does not seem possible. Do you know whether there is any workaround?

jkbrzt commented 4 years ago

@asifmallik luckily, /cookies/set also redirects. So this is how you can reproduce it:

1. prepare a session with a cookie

$ cat test-session.json
{
    "cookies": {
        "FOO": {
            "value": "BAR"
        }
    }
}

2. call /cookies/set

$ http --follow --all --print=H --session=./test-session.json httpbin.org/cookies/set?FOO=BAZ
GET /cookies/set?FOO=BAZ HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: FOO=BAR
Host: httpbin.org
User-Agent: HTTPie/2.1.0

GET /cookies HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: FOO=BAR; FOO=BAZ
Host: httpbin.org
User-Agent: HTTPie/2.1.0

It should also handle and have tests for --session-readonly (cookies should be assigned after each request, but the session file shoud not be updated).

jkbrzt commented 4 years ago

Related: #824

asifmallik commented 4 years ago

@jakubroztocil After a bit of tinkering, I think I have identified where the bug happens. It seems that httpie correctly replaces the cookie but only when outputting the headers for the redirected page, both the cookies are shown for some reason. To illustrate this, we can run:

1) http --follow --all --session=./test-session.json httpbin.org/cookies/set?FOO=BAR --print=Hb which outputs:

Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: httpbin.org
User-Agent: HTTPie/2.2.0-dev

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>Redirecting...</title>
<h1>Redirecting...</h1>
<p>You should be redirected automatically to target URL: <a href="/cookies">/cookies</a>.  If not click the link.

GET /cookies HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: FOO=BAR
Host: httpbin.org
User-Agent: HTTPie/2.2.0-dev

{
    "cookies": {
        "FOO": "BAR"
    }
} 

2) http --follow --all --session=./test-session.json httpbin.org/cookies/set?FOO=BAZ --print=Hb which outputs:

Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: FOO=BAR
Host: httpbin.org
User-Agent: HTTPie/2.2.0-dev

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>Redirecting...</title>
<h1>Redirecting...</h1>
<p>You should be redirected automatically to target URL: <a href="/cookies">/cookies</a>.  If not click the link.

GET /cookies HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: FOO=BAR; FOO=BAZ
Host: httpbin.org
User-Agent: HTTPie/2.2.0-dev

{
    "cookies": {
        "FOO": "BAZ"
    }
}

3) http --follow --all --session=./test-session.json httpbin.org/get --print=Hb which outputs:

Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: FOO=BAZ
Host: httpbin.org
User-Agent: HTTPie/2.2.0-dev

{
    "args": {},
    "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Cookie": "FOO=BAZ",
        "Host": "httpbin.org",
        "User-Agent": "HTTPie/2.2.0-dev",
        "X-Amzn-Trace-Id": "XXX"
    },
    "origin": "XXX",
    "url": "http://httpbin.org/get"
} 

Clearly, there is a discrepancy between what is output in 2 as a header response and what is actually saved as a header response which is reflected in 3. I have looked into the StrCLIResponse class which does not really seem to parse the header outputs from httpie. My understanding is that up until this point, there hasn't been any discrepancy between what is saved (hence reflected by a query to /cookies/get and what is actually output as header) so there was no need to parse this.

This also explains why the bug is not caught by the test test_session_update which does the above but it only checks the cookies through httpbin.org/get. So, I think the correct way to test this would be to:

1) Extend StrCLIResponse to parse the header output from the CLI response 2) For every header related test which involves querying httpbin.org/get to check the header should be complemented by adding a new assertion to also check the header output in the CLI wherever appropriate to expose possible discrepancy elsewhere.

So after this modification, this bug would be essentially caught by a modified test_session_update.

Am I understanding this issue correctly? Do you think this is a reasonable way to approach this or is there an interface already to access the header output from the CLI response?

asifmallik commented 4 years ago

Actually it might make more sense to add a header property to BaseCLIResponse instead of string one as this is not specific to string responses

jkbrzt commented 4 years ago

@asifmallik it might be a bug in requests after all:

import requests

response = requests.get('http://httpbin.org/cookies/set?A=BBB', cookies={'A': 'AAA'})

print(response.json())  
#  {'cookies': {'A': 'BBB'}}  

print(response.request.headers['Cookie'])  
# A=AAA; A=BBB  
asifmallik commented 4 years ago

@jakubroztocil I see, would you like to open an issue on requests or, should I?

jkbrzt commented 4 years ago

@asifmallik feel free to open one, but it would be good to poke around in requestsā€™s source code first for a bit to make sure we are not missing something.

asifmallik commented 4 years ago

@jakubroztocil I see thank you!

barny commented 2 years ago

I've put in a PR to fix just the problem with cookie domain not being stored in session file - hoping to publish an authentication plugin for a specific server type of authentication which needs auth stored in cookies to work acrosss sessions, so hoping to get this PR accepted. Without the fix I get endless auth loops when resuming a session where the authentication is expired.

isidentical commented 2 years ago

Thanks for the PR @barny! Unfortunately we can't accept it as of this moment, since this issue has multiple points that we'd like to fix (most importantly change how cookies are represented in the session files) to take our sessions to a next level. We are actively working on it, and hopefully it will be part of the upcomiing release.

adamtaylor13 commented 2 years ago

@isidentical Is there any chance that can merge in the meantime while a larger overhaul is being performed? I just realized all my issues with using httpie for an auth-flow that I'm working on are stymied by this issue with domains. It took a bit of digging, but when I plugged @barny's solution into the sessions.py file, it worked like magic.

... I guess I can just run with this local custom patch, but I also prefer not messing with my libraries locally šŸ˜›

isidentical commented 2 years ago

Hey @adamtaylor13! We've just released 3.1.0 with the new sessions layout, please give it a try (we still haven't made the official annonucement yet, but the package is already on PyPI/Snap/Brew) if you can!

adamtaylor13 commented 2 years ago

@isidentical Oh... Fantastic! Yeah, I just upgraded to 3.1.0 and that seemed to do the trick! Big thanks! šŸŽ‰

isidentical commented 2 years ago

Glad you liked it! @adamtaylor13