matomatical / UoM-WAM-Spam

Scraper for the UoM results page which detects transcript updates and notifies the user.
MIT License

Login script needs updating #25

Open zcduthie opened 2 years ago

zcduthie commented 2 years ago

I think this is now broken with the introduction / requirement of Okta SSO. Can anyone else confirm this?

Because the Okta SSO widget relies on JavaScript executing locally in the browser, Python requests / BeautifulSoup may need to be replaced. One solution might be to move back to Selenium. (Logging in also now requires a push notification or OTP.)

matomatical commented 2 years ago

I can't confirm myself right now. I thought we had this working within the time that unimelb has used Okta, but maybe not, or maybe some recent update has hardened their security around accessing the relevant links for WAM-Spam. If true, this seems like a pretty serious issue for WAM-Spam. It seems like it will take some work to get around this. I am not available for this work at the moment, so please, if someone is reading this, consider volunteering (for example, if you roughly understand the following).

The first step will be to confirm and reproduce the behaviour pointed out in the issue. The second (harder) step is to come up with a workaround. I have sketched an idea below:

Naively I think it may be possible to work around the issue with requests (without switching back to Selenium). If so, I think this would be the best way forward.

It's my understanding that the "Okta" authentication layer unimelb uses is really just their preferred app for what is a standard time-based one-time password (TOTP) authentication system. I have mine set up so that I can authenticate with a code generated by my phone or smartwatch, neither of which supports the Okta app or the unimelb-sanctioned replacement, which seems to be Google Authenticator. In the same way, it may be possible to go through some steps to get a TOTP token (the shared secret) and give it to the WAM-Spam script. WAM-Spam could then use some Python library that probably exists to generate the TOTP on demand, and handle all of this with requests.
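To make the TOTP idea concrete, here is a minimal sketch of generating a code using only the standard library, following RFC 6238. It assumes the unimelb/Okta setup accepts standard TOTP codes with the common defaults (HMAC-SHA1, 6 digits, 30-second step), which has not been confirmed in this thread. The function name `totp` and its parameters are hypothetical and are not part of WAM-Spam.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, at=None, digits=6, step=30):
    """Return an RFC 6238 TOTP code (HMAC-SHA1) for a base32-encoded secret.

    Hypothetical helper: the defaults are the common authenticator-app
    settings and are an assumption about what the unimelb login expects.
    """
    # Authenticator secrets are usually shown as base32, sometimes with
    # spaces and without padding, so normalise before decoding.
    cleaned = secret_b32.replace(" ", "").upper()
    cleaned += "=" * (-len(cleaned) % 8)
    key = base64.b32decode(cleaned)

    # The moving factor is the number of whole time steps since the epoch.
    now = time.time() if at is None else at
    counter = int(now // step)

    # HOTP (RFC 4226): HMAC the 8-byte big-endian counter, then apply
    # dynamic truncation to get a 31-bit integer.
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF

    # Keep the last `digits` decimal digits, left-padded with zeros.
    return str(value % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    # Base32 encoding of the RFC 6238 test key b"12345678901234567890".
    print(totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ"))
```

The script would then submit the generated code in whichever request the login flow expects; what that request looks like still needs to be worked out by inspecting the real flow. An existing library such as pyotp (`pyotp.TOTP(secret).now()`) could replace this helper if an extra dependency is acceptable.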