magsol / garmin

Some scripts I've thrown together to analyze my Garmin data.
MIT License

Garmin SSO changes #22

Open thigger opened 5 years ago

thigger commented 5 years ago

I haven't taken a backup since May 2018, but it looks like Garmin has changed things again:

Attempting to login to Garmin Connect...
Traceback (most recent call last):
  File "download.py", line 286, in <module>
    agent = login_user(username, password)
  File "download.py", line 191, in login_user
    login(agent, username, password)
  File "download.py", line 55, in login
    agent.open(script_url)
  File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 255, in _mech_open
    raise response
mechanize._response.httperror_seek_wrapper: HTTP Error 500: Internal Server Error

Visiting sso.garmin.com/sso/login shows "an unexpected error has occurred" at the top of the page.

Any ideas? I'll have a dig around.

magsol commented 5 years ago

Having looked at it for all of five seconds, my knee-jerk guess: check that the gauthHost variable has the value https://sso.garmin.com/sso, as that's what currently shows up in the source of the Garmin Connect login page.

If that's what's already in the code, then ¯\_(ツ)_/¯ we'll have to dig further.
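For what it's worth, a quick check along those lines might look like the sketch below; the helper name is my own, and where download.py actually pulls gauthHost out of the login page is assumed rather than shown.

```python
# Minimal sanity check for the guess above: flag a gauthHost mismatch early
# instead of failing later with an opaque HTTP 500.
EXPECTED_GAUTH_HOST = "https://sso.garmin.com/sso"

def check_gauth_host(gauth_host, expected=EXPECTED_GAUTH_HOST):
    # gauth_host is whatever the script parsed from the login page source.
    if gauth_host != expected:
        print("gauthHost mismatch: got %r, expected %r" % (gauth_host, expected))
    return gauth_host == expected
```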

thigger commented 5 years ago

Seems to be the case already, unfortunately. Looking at the tapiriik source, there don't appear to have been any major changes; perhaps it genuinely is a server error!

magsol commented 5 years ago

That's certainly also possible; the Garmin Connect service tends to fluctuate in availability :P

thigger commented 5 years ago

Solved it (sort of): script_url is ending up as https://sso.garmin.com/sso/socialSignIn?

I've manually changed it to what GC seems to be using, script_url = "http://sso.garmin.com/sso/signin", and it's working. I don't really speak regex, but I'll have a look at the pattern.
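For anyone hitting the same thing, the change boils down to something like this sketch; in download.py it belongs inside login(), using the existing agent rather than a fresh Browser:

```python
import mechanize

# Sketch of the workaround in isolation: skip the regex-derived script_url
# (which now comes back as https://sso.garmin.com/sso/socialSignIn?) and open
# the sign-in endpoint Garmin Connect currently uses.
agent = mechanize.Browser()
script_url = "http://sso.garmin.com/sso/signin"
agent.open(script_url)
```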

thigger commented 5 years ago

Forgot to note back: the correct URL doesn't appear in the retrieved page, so there's no way to regex for it. Hardcoding it doesn't seem to be a problem given everything else that's hardcoded, so I've just left it like that. Ideally it needs to join the defines at the top, I guess.
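As a sketch of that tidy-up (the constant name is my assumption; I haven't checked what the other defines in download.py are actually called):

```python
# Joining the other hardcoded values at the top of download.py:
SIGNIN_URL = "http://sso.garmin.com/sso/signin"  # name is my assumption

# ...and then inside login(), use it instead of the regex-derived script_url:
# agent.open(SIGNIN_URL)
```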

jorissen commented 5 years ago

I encountered the same issue today. I applied thigger's edit, and it worked for me, too. (Until they change something again ... )

jorissen commented 5 years ago

Correction: the script downloaded ~300 files correctly, which is really great. Then it crashed on the next file with the HTTP Error 500: Internal Server Error. When I repeat the run, it tries to download that same file and crashes in the same way. I'll try again later or tomorrow to see whether waiting makes any difference.

jorissen commented 5 years ago

No change; the script still cannot move past that one workout from Jan 2015. If I delete the "Results" directory, it downloads the same ~300 files again and crashes on that exact same file again.

When I go to the Garmin website and try to download this activity manually, I can do so successfully as "Export Original" and "Export to GPX", but I get an error if I choose "Export to TCX" (saying something like "start of interval cannot be 0"). I thought for a moment that this might be a clue (maybe the workout file is corrupt?). But some other workouts that also fail as TCX from the website did download correctly using your tool, so that's not it. There's nothing else about the workout that looks like a red flag; all were done with the same Garmin watch.

Regardless, there might be some benefit in letting your download script trap errors and move on to the next file; it would probably still be able to retrieve most files after skipping one or two problematic ones.
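A minimal sketch of that idea, assuming a per-activity loop; download_activity() and activity_ids are hypothetical stand-ins for whatever download.py actually does per workout:

```python
def download_all(agent, activity_ids, download_activity):
    # download_activity(agent, activity_id) stands in for however the script
    # fetches a single workout; any per-file failure is logged and skipped
    # instead of killing the whole run.
    failed = []
    for activity_id in activity_ids:
        try:
            download_activity(agent, activity_id)
        except Exception as exc:  # e.g. mechanize's HTTP Error 500 wrapper
            print("Skipping activity %s: %s" % (activity_id, exc))
            failed.append(activity_id)
    if failed:
        print("Failed to download %d of %d activities: %s"
              % (len(failed), len(activity_ids), failed))
    return failed
```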

jorissen commented 5 years ago

There appears to be some merit in trapping errors. Each time a download failed, I created a zero-byte file with the expected file name, which causes the script to skip it and try the next file. Using this process, the script was able to download ~35 of the remaining ~50 files that I was still missing (i.e., in addition to the original ~300 that downloaded without a glitch).
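For anyone else stuck the same way, that workaround amounts to something like this; the filename below is only an example, and should be whatever name the script was about to write into Results/:

```python
import os

# Drop an empty placeholder so the script treats the failing activity as
# already downloaded and moves on to the next one.
placeholder = os.path.join("Results", "activity_1234567890.tcx")  # example name
if not os.path.exists(placeholder):
    open(placeholder, "a").close()
```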

Next I will go to the Garmin website and manually download the ~15 missing ones, so I have a full record.

It's not clear why these ~15 failed. Is there something different or buggy about them on the Garmin side? Is there an issue with TCX vs. GPX exports? Is the script somehow buggy, or does it run into buffer issues when going back hundreds of workouts? Maybe the authors will have some ideas.

jorissen commented 5 years ago

OK, final note: for all ~15 files that the script fails to download, "Export to GPX" works from the website, but "Export to TCX" crashes with the error message "Lap start cannot be null".

However, among the other ~330 files that the script did successfully download, some let me "Export to TCX" from the website successfully while others produce that same crash.

So I'm not sure whether it has any bearing on what's happening. It could be interesting to modify the script to retrieve everything as GPX files and see if that makes a difference. (I don't know how to do that.)
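In case anyone wants to experiment, that change would presumably amount to swapping the export format in whatever URL the script requests; the templates below are placeholders, not the actual endpoints download.py uses:

```python
# Purely illustrative: choose the export format by switching the URL template.
# Both templates are placeholders standing in for the script's real endpoints.
TCX_EXPORT_TEMPLATE = "https://connect.garmin.com/.../tcx/%s"  # placeholder
GPX_EXPORT_TEMPLATE = "https://connect.garmin.com/.../gpx/%s"  # placeholder

def export_url(activity_id, fmt="gpx"):
    # Default to GPX, since the TCX export is what fails for these workouts.
    template = GPX_EXPORT_TEMPLATE if fmt == "gpx" else TCX_EXPORT_TEMPLATE
    return template % activity_id
```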