Shmakov / kroger-cli

Clip coupons and earn points at Kroger-owned grocery chains
121 stars 18 forks

Kroger Login Issue #6

Open ThermoMan opened 3 years ago

ThermoMan commented 3 years ago

App tries to open https://www.kroger.com/signin?redirectUrl=/account/update

The result is: Access Denied. You don't have permission to access "http://www.kroger.com/signin?" on this server. Reference #18.2405e8ac.1614214130.9923895

If I manually use that URL in Chrome or Firefox it works (even in incognito mode). So there is some problem with the Chrome engine that the app is using.

In another test run, the app crashed, leaving the Chrome window open. From that window I got the same error message even when using it manually.

ThermoMan commented 3 years ago

Update: on subsequent executions it crashes long before this step. But if I delete %LOCALAPPDATA%\pyppeteer and force a re-download, it once again shows the Chromium window, with the same lack-of-permission message.

[W:pyppeteer.chromium_downloader] start chromium download.
Download may take a few minutes.
100%|███████████████████████████████████████████████████████████████████████████| 136913619/136913619 [00:18<00:00, 7237785.71it/s]
[W:pyppeteer.chromium_downloader]
chromium download done.
[W:pyppeteer.chromium_downloader] chromium extracted to: C:\Users\ThermoMan\AppData\Local\pyppeteer\pyppeteer\local-chromium\588429
Signing in.. (please wait, it might take awhile)
Sign in failed. Trying one more time..
Traceback (most recent call last):
  File "main.py", line 5, in <module>
  File "KrogerCLI.py", line 81, in prompt_options
  File "KrogerCLI.py", line 147, in _option_account_info
  File "Memoize.py", line 28, in __call__
  File "KrogerAPI.py", line 37, in get_account_info
  File "asyncio\runners.py", line 43, in run
  File "asyncio\base_events.py", line 616, in run_until_complete
  File "KrogerAPI.py", line 139, in _get_account_info
  File "KrogerAPI.py", line 241, in sign_in_routine
  File "KrogerAPI.py", line 254, in sign_in
  File "lib\site-packages\pyppeteer\page.py", line 1546, in click
  File "lib\site-packages\pyppeteer\frame_manager.py", line 583, in click
pyppeteer.errors.PageError: No node found for selector: #SignIn-emailInput
[7060] Failed to execute script main
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "lib\site-packages\pyppeteer\launcher.py", line 151, in _close_process
  File "asyncio\base_events.py", line 591, in run_until_complete
  File "asyncio\base_events.py", line 508, in _check_closed
RuntimeError: Event loop is closed
sys:1: RuntimeWarning: coroutine 'Launcher.killChrome' was never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "lib\site-packages\pyppeteer\launcher.py", line 151, in _close_process
  File "asyncio\base_events.py", line 591, in run_until_complete
  File "asyncio\base_events.py", line 508, in _check_closed
RuntimeError: Event loop is closed
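
The PageError in the traceback above is raised because the click runs against a page that does not contain #SignIn-emailInput (for example, the Access Denied page). A minimal sketch of guarding the click, assuming a pyppeteer-style page object; the helper name click_when_present is hypothetical and not part of kroger-cli:

```python
import asyncio

# Selector taken from the traceback above.
EMAIL_SELECTOR = "#SignIn-emailInput"


async def click_when_present(page, selector, timeout_ms=10000):
    """Wait for `selector`, then click it.

    Returns False (instead of raising) when the element never appears,
    which is what happens when an error page was served instead of the
    sign-in form.
    """
    try:
        # pyppeteer's waitForSelector accepts an options dict with a
        # timeout in milliseconds and raises when the timeout expires.
        await page.waitForSelector(selector, {"timeout": timeout_ms})
    except Exception as exc:
        print(f"{selector} never appeared ({exc!r}); is this an error page?")
        return False
    await page.click(selector)
    return True
```

This does not get past the block itself; it only turns the crash into a readable message.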

Shmakov commented 3 years ago

Thanks for reaching out @ThermoMan

I have not had a chance to look into this issue deeply, however my attempts to solve it were unsuccessful.

The Kroger website has a somewhat sophisticated automation detection algorithm. Somehow it is able to detect the headless Chrome that is being used here (by Pyppeteer).

Some things to try to work around the Access Denied issue (which do not work reliably):

ThermoMan commented 3 years ago

I'll test any change you make. I'm using the pre-compiled version and am unfamiliar enough with Python that I won't be able to make those changes myself.

Looking at my own Chrome user agent string, the only difference is the version number, so they are probably not triggering off of that:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.72 Safari/537.36

It strikes me that '--blink-settings=imagesEnabled=false' might also be a setting to change to true. The closer it looks to a real browsing experience, the better, perhaps.
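
For anyone who can edit the Python source, here is a rough sketch of how those two settings could be passed to pyppeteer. The function names build_launch_options and open_signin are hypothetical, and how kroger-cli actually builds its launch arguments is an assumption; only the flag, the user agent string, and the sign-in URL come from this thread.

```python
# User agent string quoted in the comment above.
REAL_CHROME_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/89.0.4389.72 Safari/537.36"
)


def build_launch_options(headless=False, images_enabled=True,
                         user_agent=REAL_CHROME_UA):
    """Return keyword arguments for pyppeteer.launch()."""
    args = [f"--user-agent={user_agent}"]
    if not images_enabled:
        # The flag mentioned above; omitting it leaves images on.
        args.append("--blink-settings=imagesEnabled=false")
    return {"headless": headless, "args": args}


async def open_signin(options):
    """Launch Chromium with `options` and load the sign-in page."""
    from pyppeteer import launch  # third-party; imported lazily

    browser = await launch(**options)
    page = await browser.newPage()
    await page.goto("https://www.kroger.com/signin?redirectUrl=/account/update")
    return browser, page
```

Calling build_launch_options() with no arguments gives a visible window with images enabled, which is the "closest to a real browser" combination suggested above.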

ThermoMan commented 3 years ago

I tried using GreaseMonkey scripts to access the purchase history, but since each shopping trip is a link to a new page you cannot automate it at the top level. And of course they block iframes, so you cannot stick your script into a local page and scrape the iframe. The next step is to use something like a shell script and wget. I've done screen scrapers this way before, but nothing with session management.

If you made those suggested variable changes into config file items I could test the heck out of them and find a combination that works.
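
A minimal sketch of what such a config file could look like on the Python side, using only the standard library. The section name [browser] and the keys headless, images_enabled, and user_agent are made up for illustration; kroger-cli does not necessarily have any of them.

```python
import configparser

# Hypothetical defaults, used when the file omits a key.
DEFAULTS = {"headless": "true", "images_enabled": "false", "user_agent": ""}


def load_browser_settings(text):
    """Parse INI text and return the browser settings as a dict."""
    # interpolation=None so a '%' in a user agent string is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_dict({"browser": DEFAULTS})
    parser.read_string(text)  # values from the file override the defaults
    section = parser["browser"]
    return {
        "headless": section.getboolean("headless"),
        "images_enabled": section.getboolean("images_enabled"),
        "user_agent": section.get("user_agent") or None,
    }
```

A tester could then flip one value at a time in a file such as:

```
[browser]
headless = false
images_enabled = true
```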

I found another project that also uses headless Chrome. It may have worked 2 years ago, but it has the same problem today: phyllis-vance/KrogerScrape#1

Here is one written in node.js that has comments about overcoming our specific issue. https://github.com/agg23/kroger

Kroger has a public API (https://developer.kroger.com/reference/), although it looks like it only supports present shopping, not historical shopping trips.
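
For reference, a sketch of what a token request against that public API might look like. The token URL and the product.compact scope are assumptions based on the OAuth2 client-credentials flow and should be checked against the reference linked above; client_id and client_secret are placeholders for credentials issued by the developer portal. The function only builds the request and does not send it.

```python
import base64
import urllib.parse
import urllib.request

# Assumption: verify this endpoint against the official reference.
TOKEN_URL = "https://api.kroger.com/v1/connect/oauth2/token"


def build_token_request(client_id, client_secret, scope="product.compact"):
    """Build (but do not send) an OAuth2 client-credentials token request."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": scope}
    ).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL, data=body, headers=headers, method="POST"
    )
```

Passing the returned object to urllib.request.urlopen would perform the actual call.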

akump commented 3 years ago

I'm also experiencing this bug :( I've tried your suggested workarounds, but no luck.

jeffdroibnson commented 3 years ago

Everyone, I was able to fix this issue by going into the Edge site permissions and setting "allow" for popups and redirects and for insecure content. The browser opened when running the .exe and authenticated.

christianpetty commented 3 years ago

@jeffdroibnson I tried this with no success. I confirmed that the local installation of Chromium had this setting disabled.

christianpetty commented 3 years ago

@ThermoMan https://github.com/agg23/kroger appears to implement a function for bypassing detection measures. I'm not exactly a software engineer, but I can try implementing that into kroger-cli to see if that works. See line 28 in index.ts.

omamated commented 1 year ago

@christianpetty Were you ever able to resolve this? Just curious. I ran into this project today.

christianpetty commented 1 year ago

@omamated Hey I never got around to it but you've sparked my interest in this again possibly.

akump commented 1 year ago

+1 to fixing this. The auto clip of coupons would be amazing.

akump commented 1 year ago

I gave this a shot based on the TypeScript code suggested. You can see my Python implementation here: https://github.com/akump/kroger-cli/commit/d25ea19f033343fb9a04ce3e1464e9374e2aaa4b. Still getting Access Denied when loading the Kroger URL, though.

christianpetty commented 1 year ago

That’s a shame. Any other ideas?

akump commented 1 year ago

Had some bugs with the Python code. Made progress but still no luck. If anyone tries this in the future, start with my code: https://github.com/akump/kroger-cli/blob/master/kroger_cli/api.py#L227. I tried: using normal Chrome, removing webdriver references from the navigator object, blocking known bad URLs, modifying the user agent, and essentially attempting to mimic a real browser.
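
For readers unfamiliar with two of those techniques, here is a simplified sketch of hiding navigator.webdriver and overriding the user agent with pyppeteer. It is not the code in the linked file, the helper name apply_basic_evasions is hypothetical, and as noted above these measures did not get past Access Denied.

```python
import asyncio

# JavaScript that runs before any page script and hides the
# navigator.webdriver flag that automated Chromium normally exposes.
HIDE_WEBDRIVER_JS = """
() => {
  Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
}
"""


async def apply_basic_evasions(page, user_agent):
    """Apply the two evasions to a pyppeteer page before navigating."""
    # evaluateOnNewDocument registers the function for every new document.
    await page.evaluateOnNewDocument(HIDE_WEBDRIVER_JS)
    await page.setUserAgent(user_agent)
```

Both calls must happen before page.goto(), since the script is only injected into documents created afterwards.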

akump commented 1 year ago

Hey all. I learned about this company that offers proxy networks for large-scale web scraping. Unfortunately it costs money. It's unclear how much it would cost for kroger-cli's use cases, but it might be worth looking into. Here's a link: https://brightdata.com/products/scraping-browser?gspk=bm9haGthbHNvbjU1MQ&gsxid=scXFKQ0z4DqI&hs_signup=1&promo=fireship&pscd=get.brightdata.com&utm_campaign=bm9haGthbHNvbjU1MQ&utm_medium=pres&utm_source=affiliates. I learned about it from Fireship: https://www.youtube.com/watch?v=qo_fUjb02ns. Looks like it might be what this repo needs.

bedge commented 1 year ago

I see the same as https://github.com/Shmakov/kroger-cli/issues/6#issuecomment-785507486. However, given the 150-coupon limit per account, this may be a dead end unless some filtering options are added as well.

Malakii commented 1 year ago

This issue seems to be due to pyppeteer being outdated and unmaintained.

Kroger isn't "detecting" headless Chromium; rather, its calls are breaking. If you open the sign-in page in a regular, up-to-date browser you can see that the "sign-in" button starts disabled, then is enabled after ~0.5 seconds. In pyppeteer's Chromium the button is never enabled, and the developer console shows tons of error messages that are absent from a regular Chrome developer console. Going to chrome://version in pyppeteer's Chromium shows it's using Chromium 71.0.3542.0 (Developer Build) (64-bit).

I followed Pyppeteer's recommendation to use Playwright-Python. Playwright seems to use modern browsers, as chrome://version returns Chromium 115.0.5790.75 (Developer Build) (64-bit). Sure enough, when running playwright open https://www.kingsoopers.com/signin to load the King Soopers (Kroger) login page, everything works as expected.
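
A rough sketch of the same check done from Playwright-Python rather than the playwright CLI: load the sign-in page and poll until the sign-in button becomes enabled. The selector in SIGNIN_BUTTON is a placeholder that would need to be replaced with the real one from the page, and both function names are hypothetical.

```python
import time

SIGNIN_URL = "https://www.kingsoopers.com/signin"
# Placeholder selector; inspect the page to find the real one.
SIGNIN_BUTTON = "#signin-button"


def wait_until_enabled(page, selector, timeout_s=5.0, poll_ms=100):
    """Poll page.is_enabled(selector) until it is True or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        if page.is_enabled(selector):
            return True
        if time.monotonic() >= deadline:
            return False
        page.wait_for_timeout(poll_ms)


def check_signin_button(headless=False):
    """Open the sign-in page and report whether the button gets enabled."""
    from playwright.sync_api import sync_playwright  # third-party

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=headless)
        page = browser.new_page()
        page.goto(SIGNIN_URL)
        page.wait_for_selector(SIGNIN_BUTTON)
        enabled = wait_until_enabled(page, SIGNIN_BUTTON)
        browser.close()
        return enabled
```

If check_signin_button() returns True under Playwright while the pyppeteer-driven Chromium never enables the button, that would support the outdated-browser explanation above.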

@Shmakov @akump, any chance of reworking Pyppeteer calls for Playwright-Python?

lowrank commented 1 month ago

Not sure if this thread has been solved or not. I find that using undetected-chromedriver can log into the account. That could be a fix.

haqthat commented 3 weeks ago

@lowrank could you explain how you got undetected-chromedriver to work? I am having issues with it just hanging.

markheitz commented 6 days ago

Has anybody gotten this to work? I'm keen to find a solution. I LOVE the survey automation, but more importantly I want to pull my fuel points summary and load it into my Home Assistant (to compare Kroger fuel with the current fuel points discount against other fuel pricing). This would be an awesome tool to use for that. @lowrank @ThermoMan @Shmakov If nobody has, I'm happy to help with some $ if we need to find somebody on Upwork/Fiverr who can help modify the code, if undetected-chromedriver or playwright-python is the way to go!