Open ThermoMan opened 3 years ago
Update. On subsequent executions it crashes long before this step. But if I delete %APPDATA%/Local/pyppeteer and force a re-download, it once again shows the Chromium window, with the related lack-of-permission message.
[W:pyppeteer.chromium_downloader] start chromium download.
Download may take a few minutes.
100%|███████████████████████████████████████████████████████████████████████████| 136913619/136913619 [00:18<00:00, 7237785.71it/s]
[W:pyppeteer.chromium_downloader]
chromium download done.
[W:pyppeteer.chromium_downloader] chromium extracted to: C:\Users\ThermoMan\AppData\Local\pyppeteer\pyppeteer\local-chromium\588429
Signing in.. (please wait, it might take awhile)
Sign in failed. Trying one more time..
Traceback (most recent call last):
File "main.py", line 5, in <module>
File "KrogerCLI.py", line 81, in prompt_options
File "KrogerCLI.py", line 147, in _option_account_info
File "Memoize.py", line 28, in __call__
File "KrogerAPI.py", line 37, in get_account_info
File "asyncio\runners.py", line 43, in run
File "asyncio\base_events.py", line 616, in run_until_complete
File "KrogerAPI.py", line 139, in _get_account_info
File "KrogerAPI.py", line 241, in sign_in_routine
File "KrogerAPI.py", line 254, in sign_in
File "lib\site-packages\pyppeteer\page.py", line 1546, in click
File "lib\site-packages\pyppeteer\frame_manager.py", line 583, in click
pyppeteer.errors.PageError: No node found for selector: #SignIn-emailInput
[7060] Failed to execute script main
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "lib\site-packages\pyppeteer\launcher.py", line 151, in _close_process
File "asyncio\base_events.py", line 591, in run_until_complete
File "asyncio\base_events.py", line 508, in _check_closed
RuntimeError: Event loop is closed
sys:1: RuntimeWarning: coroutine 'Launcher.killChrome' was never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "lib\site-packages\pyppeteer\launcher.py", line 151, in _close_process
File "asyncio\base_events.py", line 591, in run_until_complete
File "asyncio\base_events.py", line 508, in _check_closed
RuntimeError: Event loop is closed
Thanks for reaching out @ThermoMan
I have not had a chance to look into this issue deeply; my attempts to solve it so far were unsuccessful.
The Kroger website has a somewhat sophisticated automation-detection algorithm. Somehow it is able to detect the headless Chrome that is being used here (by Pyppeteer).
Some things to try to get around the Access Denied issue (which do not work reliably): changing the user-agent; changing the userDataDir and --no-sandbox options; setting the headless flag to False.
I'll give any change you make a test. I'm using the pre-compiled version and am unfamiliar enough with Python that I won't be able to make those changes myself.
Looking at my own Chrome user agent string, the only difference is the version number, so they are probably not triggering off of that.
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.72 Safari/537.36
It strikes me that '--blink-settings=imagesEnabled=false' might also be a setting to change to true; the closer it looks to a real browsing experience, the better, perhaps.
I tried using GreaseMonkey scripts to access the purchase history, but since each shopping trip is a link to a new page you cannot automate it at the top level. And of course they block using iframes, so you cannot stick your script into a local page and scrape the iframe. The next step is to use something like a shell script and wget. I've done screen scrapers this way before, but nothing with session management.
If you made those suggested variable changes into config file items I could test the heck out of them and find a combination that works.
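One way the options discussed above could be turned into config items is sketched below. The config key names (`user_agent`, `user_data_dir`, `no_sandbox`, `headless`, `images_enabled`) and the `build_launch_options` helper are hypothetical and not part of kroger-cli. The resulting dict only uses the `headless`, `userDataDir`, and `args` options that pyppeteer's `launch()` accepts. Nothing in this thread shows that any combination of these settings gets past the Access Denied page.

```python
from typing import Any, Dict, List


def build_launch_options(config: Dict[str, Any]) -> Dict[str, Any]:
    """Translate hypothetical config items into pyppeteer launch() options."""
    args: List[str] = []

    # Chromium command-line switches, added only when the config asks for them.
    if config.get("user_agent"):
        args.append(f"--user-agent={config['user_agent']}")
    if config.get("no_sandbox", False):
        args.append("--no-sandbox")
    if not config.get("images_enabled", True):
        args.append("--blink-settings=imagesEnabled=false")

    options: Dict[str, Any] = {
        "headless": bool(config.get("headless", True)),
        "args": args,
    }
    if config.get("user_data_dir"):
        options["userDataDir"] = config["user_data_dir"]
    return options


# Example: visible browser, a desktop Chrome user agent, images left enabled.
example = build_launch_options({
    "headless": False,
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/89.0.4389.72 Safari/537.36",
    "images_enabled": True,
})
# The dict could then be passed on as: browser = await pyppeteer.launch(example)
```

Keeping the translation in a pure function means each combination can be checked without starting a browser.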
I found another project that also uses headless Chrome. It may have worked 2 years ago, but it has the same problem today: phyllis-vance/KrogerScrape#1
Here is one written in node.js that has comments about overcoming our specific issue. https://github.com/agg23/kroger
Kroger has a public API https://developer.kroger.com/reference/ Although it looks like it only supports present shopping, not historical shopping trips.
I'm also experiencing this bug :( I've tried your suggested workarounds, but no luck.
Everyone, I was able to fix this issue by going into the Edge site permissions and setting "allow" for pop-ups and redirects and for insecure content. The browser opened when running the .exe and authenticated.
@jeffdroibnson I tried this with no success. I confirmed that the local installation of Chromium had this setting disabled.
@ThermoMan https://github.com/agg23/kroger appears to implement a function for bypassing detection measures. I'm not exactly a software engineer, but I can try implementing that into kroger-cli to see if that works. See line 28 in index.ts.
@christianpetty Were you ever able to resolve this? Just curious. I ran into this project today.
@omamated Hey I never got around to it but you've sparked my interest in this again possibly.
+1 to fixing this. The auto clip of coupons would be amazing.
I gave this a shot based on the TypeScript code suggested. You can see my Python implementation here: https://github.com/akump/kroger-cli/commit/d25ea19f033343fb9a04ce3e1464e9374e2aaa4b. Still getting Access Denied when loading the Kroger URL, though.
That’s a shame. Any other ideas?
Had some bugs with the python code. Made progress but still no luck. If anyone tries this in the future, start with my code: https://github.com/akump/kroger-cli/blob/master/kroger_cli/api.py#L227. I tried: using normal chrome, removing webdriver references from nav obj, blocking known bad urls, modifying user agent, and essentially attempting to mimic a real browser.
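For anyone picking this up, two of the tweaks listed above (hiding the `webdriver` property on `navigator` and overriding the user agent) might look roughly like this with pyppeteer's `Page.evaluateOnNewDocument` and `Page.setUserAgent`. This is a minimal sketch, not the code from the linked commit; the helper name `apply_browser_tweaks` is hypothetical. As reported above, tweaks of this kind did not get past the Access Denied page.

```python
# JavaScript that runs before any page script and hides navigator.webdriver.
HIDE_WEBDRIVER_JS = """
() => {
    Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
}
"""


async def apply_browser_tweaks(page, user_agent: str) -> None:
    """Apply the tweaks to a pyppeteer Page before navigating anywhere.

    `page` is expected to be a pyppeteer Page (for example, the result of
    `await browser.newPage()`).
    """
    # Registered scripts are evaluated on every new document the page loads.
    await page.evaluateOnNewDocument(HIDE_WEBDRIVER_JS)
    # Replace the default user agent string with the one supplied.
    await page.setUserAgent(user_agent)
```

It would be called once per page, before `page.goto(...)`, so the script is in place when the sign-in page loads.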
Hey all. I learned about a company that offers proxy networks for large-scale web scraping. Unfortunately it costs money. It's unclear how much it would cost for kroger-cli's use cases, but it might be worth looking into. Here's a link: https://brightdata.com/products/scraping-browser?gspk=bm9haGthbHNvbjU1MQ&gsxid=scXFKQ0z4DqI&hs_signup=1&promo=fireship&pscd=get.brightdata.com&utm_campaign=bm9haGthbHNvbjU1MQ&utm_medium=pres&utm_source=affiliates. I learned about it from Fireship: https://www.youtube.com/watch?v=qo_fUjb02ns. Looks like it might be what this repo needs.
I see the same as https://github.com/Shmakov/kroger-cli/issues/6#issuecomment-785507486. However, given the 150 coupon limit per account this may be a dead end unless there's some filtering options added as well.
This issue seems to be due to pyppeteer being outdated and unmaintained.
Kroger isn't "detecting" headless Chromium; rather, its calls are breaking. If you open the sign-in page in a regular, up-to-date browser, you can see that the "sign-in" button starts disabled and is enabled after ~0.5 seconds. In pyppeteer's Chromium the button is never enabled, and Chromium's developer console shows tons of error messages that are absent from a regular Chrome developer console. Going to chrome://version in pyppeteer's Chromium shows it's using Chromium 71.0.3542.0 (Developer Build) (64-bit).
I followed Pyppeteer's recommendation and tried Playwright-Python. Playwright seems to use modern browsers, as chrome://version returns Chromium 115.0.5790.75 (Developer Build) (64-bit). Sure enough, when running playwright open https://www.kingsoopers.com/signin to load the King Soopers (Kroger) login page, everything works as expected.
@Shmakov @akump, any chance of reworking Pyppeteer calls for Playwright-Python?
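As a starting point for such a rework, a minimal sketch using Playwright's sync API might look like the following. It only opens the sign-in page and waits for the email field; it does not sign in. The `#SignIn-emailInput` selector is taken from the traceback earlier in this issue and may no longer match the page. The function name and the timeout default are assumptions for illustration.

```python
SIGN_IN_URL = "https://www.kingsoopers.com/signin"
EMAIL_SELECTOR = "#SignIn-emailInput"  # from the traceback in this issue


def check_sign_in_page(url: str = SIGN_IN_URL, timeout_ms: int = 30_000) -> str:
    """Open the sign-in page and return the browser's version string."""
    # Imported here so this module still loads where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        try:
            page = browser.new_page()
            page.goto(url)
            # Raises Playwright's TimeoutError if the email field never appears.
            page.wait_for_selector(EMAIL_SELECTOR, timeout=timeout_ms)
            return browser.version
        finally:
            browser.close()


# Example (opens a real browser window): print(check_sign_in_page())
```

Returning `browser.version` makes it easy to confirm which Chromium build is in use, since the outdated build is the suspected cause described above.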
Not sure if this thread has been solved or not. I find that using undetected-chromedriver can log into the account. That could be a fix.
@lowrank could you explain how you got the undetected chromedriver to work, I am having issues with it just hanging
Has anybody gotten this to work? I'm keen to find a solution. I LOVE the survey automation, but more importantly I want to pull my fuel points summary and load it into my Home Assistant (compare Kroger fuel w/ current fuel points discount vs other fuel pricing). This would be an awesome tool to use for that. @lowrank @ThermoMan @Shmakov If nobody has, I'm happy to help with some $ if we need to find somebody else out on Upwork/Fiverr who can help with the modification to the code, if undetected-chromedriver or playwright-python is the way to go!
App tries to open https://www.kroger.com/signin?redirectUrl=/account/update
Result is: Access Denied. You don't have permission to access "http://www.kroger.com/signin?" on this server. Reference #18.2405e8ac.1614214130.9923895
If I manually use that URL in Chrome or Firefox it works (even in incognito mode). So there is some problem with the Chrome engine that the app is using.
In another test run, the app crashed, leaving the chrome window open - from that window I got that same error message even using it manually.