dgadling / adp-scrape

Code for downloading pay stubs from adp.com
MIT License

login failure #2

Open TomGoBravo opened 4 years ago

TomGoBravo commented 4 years ago

I tried using this handy-looking tool recently but got the following crash:

  File "/home/tombrown/.pyenv/versions/adp-scrape/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/tombrown/.pyenv/versions/adp-scrape/lib/python3.6/site-packages/adp-1.0-py3.6.egg/adp.py", line 203, in cli
  File "/home/tombrown/.pyenv/versions/adp-scrape/lib/python3.6/site-packages/adp-1.0-py3.6.egg/adp.py", line 180, in download_needed
  File "/home/tombrown/.pyenv/versions/adp-scrape/lib/python3.6/site-packages/adp-1.0-py3.6.egg/adp.py", line 78, in get_uid
  File "/home/tombrown/.pyenv/versions/3.6.8/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/home/tombrown/.pyenv/versions/3.6.8/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/tombrown/.pyenv/versions/3.6.8/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I ran it in a debugger and the requests response looks different from when I try an incorrect password.
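A minimal sketch of how the decode step could fail more readably when the body is not JSON. The names `parse_login_response` and `LoginResponseError` are hypothetical and not taken from adp.py; the sketch assumes only that the response body is available as a string.

```python
import json


class LoginResponseError(Exception):
    """Raised when the login response body is not the JSON object we expected."""


def parse_login_response(body: str) -> dict:
    """Decode a response body as JSON, raising a readable error otherwise."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError as exc:
        # Show a short prefix of the body so an HTML error page is easy to spot.
        preview = body[:80].replace("\n", " ")
        raise LoginResponseError(f"expected JSON but got {preview!r}") from exc
    if not isinstance(data, dict):
        raise LoginResponseError(
            f"expected a JSON object, got {type(data).__name__}"
        )
    return data
```

With a wrapper like this, an empty or HTML body would surface as a `LoginResponseError` that shows the start of the body, instead of a bare `JSONDecodeError` deep inside the `json` module.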

Using Chrome's Copy as cURL, I can see that the POST includes a lot of data beyond the basics currently sent by adp.py: a few kB of values that look almost like base64url-encoded binary, except for some single %3D sequences in the middle of the values. With curl, I get a good login response when the extra data is included and an error when I send only user, password, target, and redirectUrl. I don't see these values in the hidden inputs of the HTML form, so I'm guessing they are added by JavaScript. I haven't found them in the JavaScript yet.
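One way to list the extra fields is to parse the urlencoded body from the copied curl command and drop the four basic names. This is only a sketch: `extra_post_fields` is a hypothetical helper, and it assumes the body is a standard `application/x-www-form-urlencoded` string.

```python
from urllib.parse import parse_qsl

# The four fields that adp.py currently sends, per the description above.
BASIC_FIELDS = {"user", "password", "target", "redirectUrl"}


def extra_post_fields(form_body: str) -> dict:
    """Return the fields of a urlencoded body that are not in BASIC_FIELDS."""
    # keep_blank_values=True so fields with empty values are still listed.
    pairs = parse_qsl(form_body, keep_blank_values=True)
    return {name: value for name, value in pairs if name not in BASIC_FIELDS}
```

Note that `parse_qsl` percent-decodes values, so a `%3D` in the raw body shows up as `=` in the result.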

TomGoBravo commented 4 years ago

I've made very little progress on extracting the POST values for X-zuY25QsG-f and ...-b/c/d/z/a. The ...-f value is in local storage verbatim under the key f, but I can't find where in the JS these values get added to the POST request. A Google search for zuY25QsG pulls up "Broadcom Configure HTML Forms Authentication", but I don't see that string on the page. Perhaps running a real browser, as done by https://github.com/liamCorbett/adp-webscrape, is the way to go. :-/

dgadling commented 4 years ago

I appreciate you looking into this, but I agree that a Selenium-style approach is likely the only way to go at this point. "Back in the day" this was sufficient, but to be honest I'm a little glad something this simple doesn't work any longer 😅

TomGoBravo commented 4 years ago

Hey David, thank you for checking back on this. I'm not sure I agree that there is any more security when scrapers need to upgrade to Selenium, but oh well. I also bumped into some JavaScript one can run in the Chrome console to download after logging in manually: https://gist.github.com/azagniotov/210c31540712c10206484d5297616842

jewelhuq commented 2 months ago

Any update that works for 2024? Need full code, please.