thorsten-gehrig / alexa-remote-control

control Amazon Alexa from command Line (set volume, select station from tunein or pandora)

ERROR: no CSRF cookie received #179

Closed von-Chaps closed 7 months ago

von-Chaps commented 7 months ago

Running v0.21b of the script, configured to use the REFRESH_TOKEN as obtained by the proxy login. Recently, the script has stopped working with the following output:

rapa-3:/config/scripts# ./alexa_remote_control.sh -a
cookie expired, logging in again ...
trying to get CSRF from handlebars
trying to get CSRF from devices-v2
ERROR: no CSRF cookie received
rapa-3:/config/scripts#

I have tried checking the refresh token by logging in via the proxy again. Interestingly, the token has changed - which I did not expect - so I tried the new one. Same error.

Is there anything I can do to help diagnose this?

Many thanks.

adn77 commented 7 months ago

When you login using the alexa_cookie app, you implicitly register a new device with your Amazon account (visible in My Apps and Devices). Hence the new refresh token.

The script uses a number of URLs to retrieve the CSRF cookie from. In order to check what's happening please check the following:

  1. You should have a valid cookie file at /tmp/.alexa.cookie (with a modification date matching the last time you tried the script) which looks something like this:

    # Netscape HTTP Cookie File
    # https://curl.se/docs/http-cookies.html
    # This file was generated by libcurl! Edit at your own risk.

    #HttpOnly_.amazon.de TRUE / TRUE 1712275203 sess-at-acbde "/Jpfa+/...."
    #HttpOnly_.amazon.de TRUE / TRUE 1712275203 at-acbde "Atza|...."
    .amazon.de TRUE / TRUE 2121984003 x-acbde "____
    .amazon.de TRUE / TRUE 2121984003 ubid-acbde 261-1234567-890...
    .amazon.de TRUE / TRUE 2121984003 session-id 260-1234567-890...

The file is missing a line:

    .amazon.de TRUE / FALSE 2027718004 csrf 1234567890


The file should be writable by the user running the script. Curl updates the cookie values automatically.

2. The script tries to retrieve the CSRF cookie from three different URLs. Please try them and see if there is a **Set-Cookie: csrf=...** in the output:
* `curl -s -v -b /tmp/.alexa.cookie https://alexa.amazon.de/api/language > /dev/null`
* `curl -s -v -b /tmp/.alexa.cookie https://alexa.amazon.de/templates/oobe/d-device-pick.handlebars > /dev/null`
* `curl -s -v -b /tmp/.alexa.cookie 'https://alexa.amazon.de/api/devices-v2/device?cached=false' > /dev/null`

If you use a non-German Amazon account, please change the *amazon.de* string accordingly.
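
The manual check above (looking for a csrf cookie among the response headers) could also be scripted. The following is a minimal Python sketch and not part of alexa_remote_control.sh. It assumes the verbose output of one of the curl commands has been captured as text (curl -v writes it to stderr), and the function name `find_csrf_header` is hypothetical.

```python
import re

# Response headers in `curl -v` output are prefixed with "< ".
# Header names are case-insensitive (HTTP/2 responses show them in lowercase).
SET_COOKIE = re.compile(r"^<\s*set-cookie:\s*csrf=([^;\s]+)", re.IGNORECASE)


def find_csrf_header(verbose_output: str):
    """Return the csrf value from a Set-Cookie response header, or None."""
    for line in verbose_output.splitlines():
        match = SET_COOKIE.match(line.strip())
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    # Made-up sample output; real responses contain many more headers.
    sample = (
        "> GET /api/language HTTP/2\n"
        "< HTTP/2 200\n"
        "< set-cookie: csrf=1234567890; Path=/; Domain=.amazon.de\n"
    )
    print(find_csrf_header(sample))
    print(find_csrf_header("< HTTP/2 200\n"))
```

A `None` result for all three URLs corresponds to the "no CSRF cookie received" situation reported in this issue.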

Please let me know what you see!
Alex

von-Chaps commented 7 months ago

Hi Alex,

Thanks for the explanation; that's what I figured.

The cookie file is present and writable (and updated each time I run the script). There is no csrf field in any of the cookies returned by those three attempts. Here is a dump of the cookie file after each attempt (edited for obvious reasons)...

rapa-3:/config/scripts# ./alexa_remote_control.sh
cookie expired, logging in again ...
# Netscape HTTP Cookie File
# https://curl.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.

.amazon.co.uk   TRUE    /       TRUE    2147483647      x-acbuk "xxxxxxxxxxxxxxxxxxxxxxxxxx"
.amazon.co.uk   TRUE    /       TRUE    2147483647      ubid-acbuk      259-555...
.amazon.co.uk   TRUE    /       TRUE    2147483647      session-id      260-549...
trying to get CSRF from handlebars
# Netscape HTTP Cookie File
# https://curl.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.

.amazon.co.uk   TRUE    /       TRUE    1743754286      session-id-time 2342938286l
.amazon.co.uk   TRUE    /       TRUE    1743754286      session-id      260-549...
.amazon.co.uk   TRUE    /       TRUE    1743754286      ubid-acbuk      259-555...
.amazon.co.uk   TRUE    /       TRUE    2147483647      x-acbuk "xxxxxxxxxxxxxxxxxxxxxxxxxx"
trying to get CSRF from devices-v2
# Netscape HTTP Cookie File
# https://curl.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.

.amazon.co.uk   TRUE    /       TRUE    2147483647      x-acbuk "xxxxxxxxxxxxxxxxxxxxxxxxxx"
.amazon.co.uk   TRUE    /       TRUE    1743754286      ubid-acbuk      259-555...
.amazon.co.uk   TRUE    /       TRUE    1743754286      session-id      260-549...
.amazon.co.uk   TRUE    /       TRUE    1743754286      session-id-time 2342938286l
ERROR: no CSRF cookie received
rapa-3:/config/scripts#

It's worth noting that I have been using this script with great success from within Home Assistant for a long time and it only stopped working [I want to say] 3rd April (maybe 2nd). I have also not run any package updates (apt-get, etc) for quite a while.

adn77 commented 7 months ago

Ok, looks like you're using amazon.co.uk... I don't have an account there to test with.

Here's what else you could try.

See if there is a Set-Cookie: csrf=... in the output. The curl commands alone won't write the cookie to file.

von-Chaps commented 7 months ago

There is no Set-Cookie directive at all in the output of either of those. Nor is there one in any of the three that the original script attempts (language, handlebars, devices-v2).

von-Chaps commented 7 months ago

I suspect that whatever has changed at Amazon's end is to do with the removal of the web interface to Alexa (although I have no evidence for this). The fact that it only seems to be affecting the .uk domain at present is probably just a timing thing and this issue is probably headed everyone's way at some point. It would be a real shame if this prevented alexa-remote-control from working in the future. I am sure I am not alone in finding it extremely useful and I am rather lost without it.

I have spent quite a bit of time looking for a way to obtain the csrf token, but have had no success. I will continue looking and experimenting.

If you have any suggestions as to how I can help triage this issue, I am happy to help. I am thinking that perhaps loading the Alexa app into Android Studio might be informative.

von-Chaps commented 7 months ago

If anyone is following this, just a quick update to say that I have discovered that alexa_remote_control.sh is only broken for me when it is executed from within the Home Assistant docker environment. If I pull the script out and run it natively on the box, it works fine. So, of course, when I said nothing had changed on my machine, that was not true. The Home Assistant environment was updated recently to 2024.3.3 and that's when alexa_remote_control.sh stopped working.

I am off now to try and nail down what exactly has happened and, more importantly, how to get it working again.

More later...

adn77 commented 7 months ago

Hmm, if the .alexa.cookie file is a current one, any of the five curl statements should get you a response with the mentioned Set-Cookie: csrf=... in its header.

Maybe the TMP environment variable is set differently in the container or the process running in the container does not have write access to the .alexa.cookie

I'm relieved though that it's working normally when running stand alone.
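
Both possibilities raised above (a different temp directory, or missing write access) can be checked from inside the container. This is a minimal Python sketch under stated assumptions: the function name `cookie_file_status` is hypothetical, and the fallback to /tmp when TMP is unset is an assumption based on the /tmp/.alexa.cookie path mentioned earlier, not a statement about how the script resolves its temp directory.

```python
import os


def cookie_file_status(tmp_dir: str, name: str = ".alexa.cookie") -> str:
    """Describe whether the cookie file exists and can be updated."""
    path = os.path.join(tmp_dir, name)
    if not os.path.isdir(tmp_dir):
        return "missing directory"
    if not os.path.exists(path):
        # A new file can only be created if the directory is writable.
        if os.access(tmp_dir, os.W_OK):
            return "no file, directory writable"
        return "no file, directory read-only"
    return "writable" if os.access(path, os.W_OK) else "read-only"


if __name__ == "__main__":
    # Assumption: fall back to /tmp when TMP is not set in the environment.
    tmp_dir = os.environ.get("TMP", "/tmp")
    print(tmp_dir, "->", cookie_file_status(tmp_dir))
```

Running it as the same user (and in the same container) as the script matters, since `os.access` reports permissions for the calling process.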

von-Chaps commented 7 months ago

Yea, there's more to it. The cookie file mechanism is working fine, but...

In the cookies file, two of the lines are #HttpOnly. It seems that these are set by Amazon [obviously], but they are not being sent back to the server in the subsequent curl calls. This is something to do with the curl version: I have it working with curl 7.74.0, but curl 8.5.0 (shipped with the Home Assistant docker version) does not send #HttpOnly cookie tokens.

I don't currently know why this is and am trawling the curl source code.

My guess is this will break for more people as they upgrade to a newer curl. Might be a bug, might be a "feature". Not sure yet.

adn77 commented 7 months ago

I am creating the cookie file "manually" from the JSON received in the cookie exchange API call. Maybe the format of the cookie jar file has changed in curl v8.

I'll try to test later when I have proper Internet. Alex

von-Chaps commented 7 months ago

Yes, I saw that. It does feel a bit fragile. However, my cursory glance at the curl source doesn't indicate that the format has changed (it's a pretty standard format), although curl has had quite a bit of work from v7 to v8.

I just need to spin up a vm and then I'll start building different versions of the curl source to try and find exactly what is preventing curl -b from loading the HttpOnly tokens.

von-Chaps commented 7 months ago

Quick update: I think the curl rathole was a red herring. You will recognise this though.

The cookie file is missing expiry dates (coincidentally it's only for the HttpOnly tokens, as they have real expiry dates and not 2044 ones, which take a different path through your toEpoch function).

rapa-3:/config/scripts# jq -r '.response.tokens.cookies | to_entries[] | .key as $domain | .value[] | .Expires'  ../tmp/.alexa.cookie.json | toEpoch
s/1 Apr 2044 16:08:13 GMT/2147483647/g
s/1 Apr 2044 16:08:13 GMT/2147483647/g
s/1 Apr 2044 16:08:13 GMT/2147483647/g
s/7 Apr 2024 16:08:13 GMT//g
s/7 Apr 2024 16:08:13 GMT//g
rapa-3:/config/scripts# date -d "7 Apr 2024 16:08:13 GMT" -u +"%s"
date: invalid date '7 Apr 2024 16:08:13 GMT'

I have yet to discover exactly what is wrong with the date utility as shipped with Home Assistant. I will continue investigating.
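
The symptom described above (cookie lines written without an expiry value) can be spotted by parsing the cookie file directly. The following is a minimal Python sketch, assuming the tab-separated seven-field Netscape layout that curl writes and the `#HttpOnly_` domain prefix it uses for HttpOnly cookies. The function name `parse_cookie_jar` and the sample values are hypothetical.

```python
HTTPONLY_PREFIX = "#HttpOnly_"


def parse_cookie_jar(text: str) -> list:
    """Parse Netscape cookie-file text into a list of dicts."""
    cookies = []
    for line in text.splitlines():
        http_only = line.startswith(HTTPONLY_PREFIX)
        if http_only:
            # Not a comment: strip the marker and parse the rest as a cookie.
            line = line[len(HTTPONLY_PREFIX):]
        elif line.startswith("#") or not line.strip():
            continue  # genuine comment or blank line
        fields = line.split("\t")
        if len(fields) != 7:
            continue  # not a well-formed cookie line
        domain, _subdomains, path, secure, expires, name, value = fields
        cookies.append({
            "domain": domain,
            "path": path,
            "secure": secure == "TRUE",
            "expires": expires,  # kept as raw text so empty values stay visible
            "name": name,
            "value": value,
            "http_only": http_only,
        })
    return cookies


if __name__ == "__main__":
    # Made-up sample: the first cookie has an empty expiry field.
    sample = (
        "# Netscape HTTP Cookie File\n"
        "#HttpOnly_.amazon.de\tTRUE\t/\tTRUE\t\tat-acbde\tplaceholder\n"
        ".amazon.de\tTRUE\t/\tTRUE\t2147483647\tubid-acbde\tplaceholder\n"
    )
    print([c["name"] for c in parse_cookie_jar(sample) if not c["expires"]])
```

Any name printed by the last line is a cookie whose expiry conversion produced nothing, which is the failure mode shown in the toEpoch output above.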

adn77 commented 7 months ago

I've been trying to account for all possible date variants out there https://github.com/thorsten-gehrig/alexa-remote-control/blob/ceabd042c8776e96cf609f264b7f84f303110feb/alexa_remote_control.sh#L522

Are you sure you have the latest version of the script?

von-Chaps commented 7 months ago

Yes, I have the latest version. Read on....

So, Home Assistant version 2024.3.3 ships with only the BusyBox date utility which is in no way capable of supporting what is needed by the toEpoch function within alexa_remote_control.sh. As such, toEpoch is silently failing and not associating an expiry date/time with cookie tokens. This causes the logon to Alexa to fail.

Specifically, toEpoch calls date -d whereas in BusyBox the option is -D, but even then, it is not capable of parsing the expiry date/time format in the json.

Certainly, previous versions of Home Assistant shipped with a more capable date utility since alexa_remote_control.sh has worked well for me for a long time. I have just upgraded to Home Assistant 2024.4.1 which is the latest version and the issue remains. I expect more people will be along shortly with the same problem.

There is another more subtle problem in that I don't think Home Assistant within docker ships with timezone data installed which means that no date parsing utilities will be capable of dealing with timezones anyway.
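
For illustration, the conversion that toEpoch attempts can be done without depending on the date utility, the locale, or installed timezone data. This is a minimal Python sketch, not the script's implementation. It assumes the "7 Apr 2024 16:08:13 GMT" layout seen above, and it clamps the result to the 32-bit maximum by value rather than by year as the awk code does. The function name `to_epoch` is hypothetical.

```python
import calendar

# English month abbreviations, looked up directly so the locale is irrelevant.
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

MAX_32BIT = 2147483647  # 2038-01-19 03:14:07 UTC


def to_epoch(expires: str) -> int:
    """Convert e.g. '7 Apr 2024 16:08:13 GMT' to Unix seconds (UTC)."""
    day, month, year, clock, zone = expires.split()
    if zone not in ("GMT", "UTC"):
        raise ValueError("unexpected timezone: " + zone)
    hour, minute, second = (int(part) for part in clock.split(":"))
    # timegm interprets the tuple as UTC, independent of local tzdata.
    seconds = calendar.timegm(
        (int(year), MONTHS[month], int(day), hour, minute, second))
    return min(seconds, MAX_32BIT)


if __name__ == "__main__":
    print(to_epoch("7 Apr 2024 16:08:13 GMT"))  # 1712506093
    print(to_epoch("1 Apr 2044 16:08:13 GMT"))  # 2147483647 (clamped)
```

Because the month table and `calendar.timegm` are both independent of the host configuration, the result does not change with the container's timezone setup.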

adn77 commented 7 months ago

Can you point me at the docker image you're using?

This whole date parsing is really driving me crazy. All I wanted is actually not to hammer the Amazon auth service with unnecessary cookie-for-token exchanges. On the other hand the access token is only valid about 24h. So I might as well just have the entire cookie expire some time after that...

von-Chaps commented 7 months ago

As an interim, I thought I'd do the date parsing in Python since that can be relied upon to be present in the Home Assistant installation ;) Then I figured I might as well do the entire conversion from json to cookie format in that Python which would save all the sed and jq stuff.

Here is the Python I knocked up:

#!/usr/local/bin/python3
#
# Parse the json returned from an Alexa logon and create a cookie file
# in Netscape format.
#
# Requires one parameter which is the full path to the temporary directory.

from datetime import datetime, timezone

import json
import sys

if len(sys.argv) != 2:
    raise ValueError('Temporary directory not specified')

COOKIE_FILE = sys.argv[1] + '/.alexa.cookie'
COOKIE_JSON = COOKIE_FILE + '.json'

with open(COOKIE_JSON) as infile:
    cookie_data = json.load(infile)

cookies = cookie_data['response']['tokens']['cookies']

# Only the first domain in the response is used.
domain = next(iter(cookies.keys()))

with open(COOKIE_FILE, 'w') as outfile:
    for c in cookies[domain]:
        # curl marks HttpOnly cookies with a '#HttpOnly_' domain prefix.
        this_domain = '#HttpOnly_' + domain if c['HttpOnly'] else domain

        # Expires is given in GMT; strptime returns a naive datetime, so
        # attach UTC explicitly rather than relying on the local timezone.
        expires = datetime.strptime(c['Expires'], '%d %b %Y %H:%M:%S %Z')
        expires = int(expires.replace(tzinfo=timezone.utc).timestamp())

        fields = [this_domain, 'TRUE', c['Path'], str(c['Secure']).upper(),
                  str(expires), c['Name'], c['Value']]
        outfile.write('\t'.join(fields) + '\n')

And here is the diff where I modified alexa_remote_control.sh to call the above. I removed the toEpoch function.

ha@rapa-3:~/.homeassistant$ git diff scripts/alexa_remote_control.sh
diff --git a/scripts/alexa_remote_control.sh b/scripts/alexa_remote_control.sh
index 7ba62c2..1513ac2 100755
--- a/scripts/alexa_remote_control.sh
+++ b/scripts/alexa_remote_control.sh
@@ -127,6 +127,9 @@ SET_BROWSER='Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:65.0) Gecko/20100101 F
 # jq binary
 SET_JQ='/usr/bin/jq'

+# Cookie parser
+COOKIE_PARSER=/config/scripts/cookie_parser.py
+
 # tmp path
 #SET_TMP="/tmp"
 SET_TMP="/config/tmp"
@@ -513,25 +516,8 @@ if [ -z "${REFRESH_TOKEN}" ] ; then
 else
 #      ${CURL} ${OPTS} -s -X POST --data "app_name=Amazon%20Alexa&requested_token_type=auth_cookies&domain=www.${AMAZON}&source_token_type=refresh_token" --data-urlencode "source_token=${REFRESH_TOKEN}" -H "x-amzn-identity-auth-domain: api.${AMAZON}" https://api.${AMAZON}/ap/exchangetoken/cookies | ${JQ} -r '.response.tokens.cookies | to_entries[] | .key as $domain | .value[] | map_values(if . == true then "TRUE" elif . == false then "FALSE" else . end) | .Expires |= ( strptime("%d %b %Y %H:%M:%S %Z") | mktime ) | [(if .HttpOnly=="TRUE" then ("#HttpOnly_" + $domain) else $domain end), "TRUE", .Path, .Secure, .Expires, .Name, .Value] | @tsv' > ${COOKIE}

-
-       # workaround for cookies valid beyond 2038-01-19 on 32-bit systems
-       toEpoch() {
-               local x
-               while read x
-               do
-                       echo "$x" | awk '{
-                               if ($3 >= 2038) {
-                                       print "s/"$1" "$2" "$3" "$4" "$5"/2147483647/g"
-                               } else {
-                                       print "s/"$1" "$2" "$3" "$4" "$5"/'"$(set +e; date -d "$x" -u +"%s" 2>/dev/null || date -d "$x" -D "%d %b %Y %H:%M:%S %Z" -u +"%s" 2>/dev/null || date -j -f "%d %b %Y %H:%M:%S %Z" "$x" +"%s" 2>/dev/null )"'/g"
-                               }
-                       }'
-               done
-       }
-
        ${CURL} ${OPTS} -s -X POST --data "app_name=Amazon%20Alexa&requested_token_type=auth_cookies&domain=www.${AMAZON}&source_token_type=refresh_token" --data-urlencode "source_token=${REFRESH_TOKEN}" -H "x-amzn-identity-auth-domain: api.${AMAZON}" https://api.${AMAZON}/ap/exchangetoken/cookies > ${COOKIE}.json
-       sed -e "$(cat ${COOKIE}.json | ${JQ} -r '.response.tokens.cookies | to_entries[] | .key as $domain | .value[] | .Expires' | toEpoch)" ${COOKIE}.json |\
-        ${JQ} -r '.response.tokens.cookies | to_entries[] | .key as $domain | .value[] | map_values(if . == true then "TRUE" elif . == false then "FALSE" else . end) | [(if .HttpOnly=="TRUE" then ("#HttpOnly_" + $domain) else $domain end), "TRUE", .Path, .Secure, .Expires, .Name, .Value] | @tsv' > ${COOKIE}
+       "$COOKIE_PARSER" "${COOKIE}"

        if [ -z "$(grep "\.${AMAZON}.*\sat-" ${COOKIE})" ] ; then
                echo "ERROR: cookie retrieval with refresh_token didn't work"

Now, I have a functioning alexa_remote_control.sh again and I am a happy person.

Obviously I have trampled all over your code and this is only for my use (and anyone else who finds themselves here looking for a quick/temporary fix). You (@adn77) will have a better view on how you might want to deal with this, which is essentially just a broken date utility. I know supporting various date utils has been a pain for you and it doesn't feel like it's going away. I guess you won't want to use my Python implementation since many people may not even have Python installed, but the time format returned by Amazon is hard to deal with robustly in a shell script.

What do you think? Happy to help if I can.

von-Chaps commented 7 months ago

Sorry, didn't see your request...

docker pull ghcr.io/home-assistant/home-assistant:2024.4.1

I guess Amazon's expiry dates (and all the json in fact) are designed to be parsed by JavaScript. Shrug.

von-Chaps commented 7 months ago

Incidentally, if you really wish to remain pure bash and since you are already depending on jq, did you know you can do this:

$ echo '"7 Apr 2024 21:11:12 GMT"' | jq 'strptime("%d %b %Y %H:%M:%S %Z") | mktime'
1712527872

which should help get rid of toEpoch() and all its problems. It is still a bit unreliable with respect to timezones, but that is always going to be a problem on platforms that have different tzdata implementations installed.

adn77 commented 7 months ago

That's actually how I started out a couple of years ago 😁 But then strptime wasn't consistent in jq across different platforms (e.g. only supporting ISO 8601).

von-Chaps commented 7 months ago

Oh. I'm actually surprised by that. I would have thought jq was simply calling strptime(3) and, in fact, if the platform doesn't provide strptime(3), jq actually contains an internal implementation which is a clone of NetBSD's strptime. I would have expected it to work well enough. Also, if jq can't do it by calling the native platform, then I'm surprised that date can, since that too should call the same strptime(3) library.... except in BusyBox, of course, where it has next to no capabilities as we have seen.

adn77 commented 7 months ago

I now set a fixed validity of the cookies to "now + 86400s". The session cookies always expired after exactly one day (for the past 2.5 years). Worst case we either ask for new cookies too often (once in 24 hours is a lot less than I've seen in other implementations), or don't ask soon enough - in that case any request will fail and a new cookie will be asked for anyway.

Thanks for bearing with me 😄
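
The fixed-validity approach described above can be sketched as follows. This is an illustration in Python rather than the code shipped in v0.22, and it assumes the JSON layout visible in the jq filters earlier in the thread (a mapping from domain to a list of cookies with Name, Value, Path, Secure and HttpOnly fields). The function name `cookie_jar_lines` is hypothetical.

```python
import time

VALIDITY_SECONDS = 86400  # "now + 86400s", as described above


def cookie_jar_lines(cookies_by_domain: dict, now: int) -> list:
    """Build Netscape cookie-file lines with one fixed expiry for every cookie."""
    # The Expires strings from the response are ignored entirely,
    # so no date parsing is needed on any platform.
    expires = str(now + VALIDITY_SECONDS)
    lines = []
    for domain, cookies in cookies_by_domain.items():
        for c in cookies:
            prefix = "#HttpOnly_" if c["HttpOnly"] else ""
            secure = "TRUE" if c["Secure"] else "FALSE"
            lines.append("\t".join([
                prefix + domain, "TRUE", c["Path"], secure,
                expires, c["Name"], c["Value"],
            ]))
    return lines


if __name__ == "__main__":
    # Made-up sample data in the assumed layout.
    sample = {".amazon.de": [
        {"Name": "at-acbde", "Value": "placeholder", "Path": "/",
         "Secure": True, "HttpOnly": True},
    ]}
    for line in cookie_jar_lines(sample, int(time.time())):
        print(line)
```

The trade-off is the one already stated: a cookie may be refreshed slightly earlier or later than strictly necessary, in exchange for removing the date parsing altogether.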

von-Chaps commented 7 months ago

Yes, I saw that. Seems like a reasonable approach. Certainly v0.22 fixes my problem - can verify it works in Home Assistant 2024.4.1 using BusyBox shell.

Thanks for being so responsive and a pleasure to interact with - and thanks for giving your time to support such a useful utility.

You can go ahead and close this as fixed if you wish.