ArthurG / Kijiji-Repost-Headless

Repost / Post Kijiji ads with Python
MIT License
144 stars 59 forks

KijijiApiException: Could not log in. #176

Closed eltimsy closed 4 years ago

eltimsy commented 4 years ago

Starting today I've been getting this exception when I try to repost an ad. The HTML dump doesn't specify the issue; it just says something went wrong. Does anyone have any idea what the cause might be?

ilirakshe commented 4 years ago

I think it happens because Kijiji now implements a Google captcha on their login page.

ArthurG commented 4 years ago

I'm open to any proposals and implementations to get around this Captcha issue.

Corgano commented 4 years ago

Is it possible to show the user the captcha and then have the program continue? Or make it use cookies, so the user can log in and it will re-upload using whatever account is signed in?

ilirakshe commented 4 years ago

If I understand correctly (I could be very wrong, please correct me), they use reCAPTCHA 2.0 (or maybe even version 3), which means it detects automation and bots in many cases. I don't know whether cookies will help much.

eltimsy commented 4 years ago

Yeah it looks like they are using invisible recaptcha 2.0

ArthurG commented 4 years ago

Are they using recaptcha for just the login? Or also for the posting page?

Also, this might be useful: https://developers.google.com/recaptcha/docs/versions

anonwhitemouse commented 4 years ago

Might help:

https://github.com/ecthros/uncaptcha2

Connor2hd commented 4 years ago

I have noticed the recaptcha badges on the login and ad reply screens but not on posting an ad.

Browser automation and scripting isn't really my thing but couldn't the code be modified to save the cookie from logging in? If that's possible it sounds easier than integrating something to get around recaptcha.
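
A minimal sketch of the cookie-saving idea, assuming the cookies are available as a list of dicts in the shape Selenium's `get_cookies()` returns (with an optional `expiry` timestamp). The helper names `save_cookies` and `load_cookies` and the file name are hypothetical, and nothing in this thread confirms that Kijiji accepts a restored session without triggering reCAPTCHA.

```python
import json
import time
from pathlib import Path


def save_cookies(cookies, path):
    """Write a list of cookie dicts (the shape Selenium's get_cookies() returns) to disk."""
    Path(path).write_text(json.dumps(cookies))


def load_cookies(path, now=None):
    """Read cookies back, skipping any whose 'expiry' timestamp has already passed."""
    now = time.time() if now is None else now
    cookie_file = Path(path)
    if not cookie_file.exists():
        return []  # no saved session yet
    cookies = json.loads(cookie_file.read_text())
    # Session cookies carry no 'expiry' key, so they are always kept.
    return [c for c in cookies if c.get("expiry", now + 1) > now]
```

With Selenium, each dict returned by `load_cookies` could be passed to `driver.add_cookie` after first navigating to a page on the same domain.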

ghost commented 4 years ago

I was just able to bypass the captcha using the following python script:

The trick is to use the time.sleep function with random milliseconds to fool the captcha into believing it's human interaction, and also to use Selenium, which routes through an actual browser (this part might be optional, not sure). The slowdown is not necessarily ideal, but the Kijiji repost app already has a 3 minute delay. Just a heads up: you could completely avoid the 3 minute repost delay by having the user create 2 versions of the listing that are rotated on each repost. As long as the first few words of the first sentence are different, it's detected as a new listing.

EDIT: It did pick up on my automated login attempts after a while. But I was sending successive tests, so it's no surprise it caught on. After multiple successful logins, I began receiving location selections as opposed to the typical user homepage, which I'm guessing was Kijiji's native attempt at thwarting high levels of automation. This was solved by introducing randomized time.sleep() calls between field entries. That, and don't try to log in every minute :) The big thing is to not give it any kind of routine behaviour it can learn. For example, only repost 2x per day (morning and afternoon, or something like that), with those repostings also having some sort of randomized time variance incorporated.


import pickle
import random
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Set the user agent through Firefox options (passing a profile positionally no longer works in recent Selenium releases)
options = webdriver.FirefoxOptions()
options.set_preference("general.useragent.override", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:75.0) Gecko/20100101 Firefox/75.0")
driver = webdriver.Firefox(options=options)

driver.get('https://www.kijiji.ca/t-login.html')

# Save Cookies to file
#pickle.dump( driver.get_cookies() , open("cookies.pkl","wb"))
# Load Cookies from file
#cookies = pickle.load(open("cookies.pkl", "rb"))
#for cookie in cookies:
#    driver.add_cookie(cookie)

# Save Cookies to Variable
#cookies = driver.get_cookies()

try:
    # Important: wait until the login form is present, or Selenium raises NoSuchElementException
    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, 'LoginEmailOrNickname')))
    # Fill in each field with a short random pause in between
    driver.find_element(By.ID, 'LoginEmailOrNickname').send_keys('YOUR_USERNAME')
    time.sleep(random.random())
    driver.find_element(By.ID, 'login-password').send_keys('YOUR_PASSWORD')
    time.sleep(random.random())
    driver.find_element(By.ID, 'login-rememberMe').send_keys('true')
    time.sleep(random.random())
    driver.find_element(By.ID, 'SignInButton').click()
    time.sleep(10)  # This is just to let the user actually see something! Can be removed.
finally:
    # Always close the browser, even if the wait or a lookup fails
    driver.quit()

IH3lios commented 4 years ago

Is there any hope of a possible resolution?

ArthurG commented 4 years ago

We probably can fix this. No ETA on this one. If someone wants to take a stab, please go ahead!

davidvexel commented 4 years ago

@ArthurG What would be the plan to fix this? Implement something for the captcha? use a cookie to keep the session open?

I could probably take a look if you have any ideas.

ArthurG commented 4 years ago

The best way would be to find another library to solve captchas.

If that doesn't work, I think we could

  • Use Selenium to open a browser, prompt the user to log in

michaelkuzmin commented 4 years ago

ideally I would like to receive the captcha over Telegram and reply to it to resolve it. I think this would be the perfect solution.

jingcao commented 4 years ago

Currently, when I try to open the login page via a requests session, I get JavaScript back. Is there any way to bypass this? Is this considered reCAPTCHA?

I think that, due to the new site changes, one can't use requests anymore, as it doesn't render the JavaScript that is used to grab additional data.

ie: b'<html><head><meta charset="utf-8"><script>function i700(){}i700.F20=function (){return typeof i700.O20.p60===\'function\'?i700.O20.p60.apply(i700.O20,arguments):i700.O20.p60;};i700.X70=function (){return typeof i700.v70.p60===\'function\'?i700.v70.p60.apply(i700.v70,arguments):i700.v70.p60;};i700.Z20=function (){return typeof i700.O20.P20===\'function\'?i700.O20.P20.apply(i700.O20,arguments):i700.O20.P20;};i700.Q60=function (){return typeof i700.Y60.P20===\'function\'?i700.Y60.P20.apply(i700.Y60,arguments):i700.Y60.P20;};i700.a70=function (){return typeof i700.v70.d4===\'function\'?i700.v70.d4.apply(i700.v70,arguments):i700.v70.d4;};i700.P60=function (){return typeof i700.Y60.p60===\'function\'?i700.Y60.p60.apply(i700.Y60,arguments):i700.Y60.p60;};i700.r60=function (){return typeof i700.Y60.U20===\'function\'?i700.Y60.U20.apply(i700.Y60,arguments):i700.Y60.U20;};i700.t20=function (){return typeof i700.O20.

vic-c commented 4 years ago

I agree with @michaelkuzmin that if we have to solve captcha, it's best to work in headless mode e.g. via a messenger. I'm running repost cron'ed on an ssh-only vm... so popping a browser won't work for such use case..

Side question: would a little GoFundMe campaign and a bounty help get @ArthurG or someone reputable to resolve it for good? I could pledge $30-40 myself, and judging by the number of comments another dozen people could do the same!?

michaelkuzmin commented 4 years ago

@vic-c @ArthurG I would donate for sure. I imagine there are a lot of small business owners here (retail, real estate) that take advantage of this tool, and reviving it sooner rather than later would have a very tangible business value. so far I have not needed it, but very soon I am going to have to start to post ads manually and this will be pretty disruptive for my other work.

prgrm commented 4 years ago

@vic-c @michaelkuzmin @ArthurG I agree. I would donate $20.

michaelkuzmin commented 4 years ago

@ArthurG how would you like to handle it? we could organize the campaign and we could hand the funds to you? or we could hire someone on bountysource or something like that if you don't mind and don't have time to deal with it yourself?

IH3lios commented 4 years ago

I would be willing to pitch in as well.

jackm commented 4 years ago

After a brief investigation of the Kijiji login process it appears to me that the API has changed. How certain are we that it is captcha that is preventing the login from working? It could just be that we need to update to match the new site API.

Jester8813 commented 4 years ago

Even with the Selenium version I wrote, the login screen comes up completely blank as soon as it tries to load. I don't know 100% if this is captcha or if they are figuring out that the browser is automated.

jackm commented 4 years ago

I believe I have a fix for this in #178

SachaTe commented 4 years ago

Thanks! With the recent merge it has started to work on my side!

IH3lios commented 4 years ago

Some of my posts are generating login errors. Is this normal?

Deletion successful
Waiting 3 minutes before posting again. Please do not exit this script.
Still waiting; 2 more minutes...
Still waiting; 1 minute left...
Still waiting; 30 seconds...
Still waiting; just 10 seconds...
Posting Ad now
Image upload success on try #1
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "kijiji_repost_headless/__main__.py", line 200, in <module>
    main()
  File "kijiji_repost_headless/__main__.py", line 51, in main
    args.function(args)
  File "kijiji_repost_headless/__main__.py", line 160, in repost_ad
    post_ad(args)
  File "kijiji_repost_headless/__main__.py", line 96, in post_ad
    if not check_ad(args):
  File "kijiji_repost_headless/__main__.py", line 172, in check_ad
    api.login(args.username, args.password)
  File "kijiji_repost_headless/kijiji_api.py", line 121, in login
    raise KijijiApiException("Could not log in.", resp.text)
kijiji_api.KijijiApiException: Could not log in.
See kijijiapi_dump_20200523T204946.html in current directory for latest dumpfile.

IH3lios commented 4 years ago

@rybodiddly I am passing the credentials via the command line and using my email.

python3 kijiji_repost_headless -u **@hotmail.com -p ***** repost ./item.yml

The exact error message that is being generated is the following.

{"errors":[{"message":"Something went wrong during authentication.","locations":[{"line":2,"column":3}],"path":["loginUser"],"extensions":{"code":"UNAUTHENTICATED","exception":{"statusCode":401,"errorCode":"LOGIN_RECAPTCHA_FAIL","args":{"xsrfToken":"1590285297468.a511478f452e1ddc68385879b04bd40a8168dae7f15851a7c87260a1e02bea38","emailOrNickname":"**@hotmail.com","targetUrl":null,"password":"**","rememberMe":true,"fraudToken":null,"hints":["NEW_AJAX_LOGIN"]}}}}],"data":{"loginUser":null}}
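
For anyone scripting around this, here is a rough sketch of pulling the `errorCode` out of a response body shaped like the one above, so a wrapper can tell a reCAPTCHA failure apart from other login errors. The function name `login_error_codes` is made up for illustration, and it assumes the error layout (`errors` → `extensions` → `exception` → `errorCode`) stays the same as in this dump.

```python
import json


def login_error_codes(response_text):
    """Return the errorCode values found in a GraphQL-style error body, or [] if none."""
    try:
        payload = json.loads(response_text)
    except ValueError:
        return []  # body was not JSON (for example, an HTML page)
    if not isinstance(payload, dict):
        return []
    codes = []
    for error in payload.get("errors") or []:
        if not isinstance(error, dict):
            continue
        exception = (error.get("extensions") or {}).get("exception") or {}
        if "errorCode" in exception:
            codes.append(exception["errorCode"])
    return codes
```

A caller could then check `"LOGIN_RECAPTCHA_FAIL" in login_error_codes(text)` before deciding whether to wait and try again later.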

Connor2hd commented 4 years ago

@IH3lios I just noticed today that I get the unauthenticated error but my ad is posted anyway. Does your ad still get posted despite the error? Look at the HTML dump file as well, it normally gives you the error from the Kijiji API I believe.

IH3lios commented 4 years ago

@Connor2hd Some posts do still get posted regardless of the error, but not all of them.

The HTML dump shows the error message I showed in my last comment.

{"errors":[{"message":"Something went wrong during authentication.","locations":[{"line":2,"column":3}],"path":["loginUser"],"extensions":{"code":"UNAUTHENTICATED","exception":{"statusCode":401,"errorCode":"LOGIN_RECAPTCHA_FAIL","args":{"xsrfToken":"1590285297468.a511478f452e1ddc68385879b04bd40a8168dae7f15851a7c87260a1e02bea38","emailOrNickname":"@hotmail.com","targetUrl":null,"password":"","rememberMe":true,"fraudToken":null,"hints":["NEW_AJAX_LOGIN"]}}}}],"data":{"loginUser":null}}

IH3lios commented 4 years ago

Interestingly enough it is working now after the 5th attempt...

IH3lios commented 4 years ago

@rybodiddly It seems like it's working, but the reCAPTCHA aspect of it is causing it to fail once in a while.

prgrm commented 4 years ago

Do we have to log in every time we post an ad? Can't we log in once and save the login credentials somewhere for subsequent ads?

vic-c commented 4 years ago

@jackm looks like your fix takes care of the API issue, but most of the time posting fails with "errorCode":"LOGIN_RECAPTCHA_FAIL"

It occasionally works after I log in via a browser, do something, log out, and then run the Kijiji repost. If I just run it on its own, even after a long wait, it fails with the error above.

Are there any additional "pip" installation instructions or anything else that might help the new version work?

michaelkuzmin commented 4 years ago

The main error I've found and been working on is that Kijiji is deleting posts after successful posting / submission. They show active for a second then get deleted. Switching between multiple versions of the ad seems to solve the issue. (i.e. not posting the identical ad all the time).

@rybodiddly this is not new, similar ads have always been posted and deleted after a few seconds. you should not try to flood the site with similar ads anyway, I think it's really crossing the line of what is ethical use of the resource. what I do is I have a routine that wipes all ads with nuke, and then posts all ads in my "active" folder one by one. when my ad works I just move the yml file from "active" to "archive" folder.
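
The "active" and "archive" folder routine described above might look roughly like this in Python. This is only a sketch of the file handling, assuming ads are stored as `.yml` files; the function names are hypothetical, and the actual posting step (running the repost tool on each file) is left out.

```python
import shutil
from pathlib import Path


def active_ads(active_dir):
    """List the ad files waiting to be posted, in a stable (sorted) order."""
    return sorted(Path(active_dir).glob("*.yml"))


def archive_ad(ad_file, archive_dir):
    """Move an ad file into the archive folder and return its new path."""
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    destination = archive / Path(ad_file).name
    shutil.move(str(ad_file), str(destination))
    return destination
```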

vic-c commented 4 years ago

Hi @michaelkuzmin, looks like rybodiddly's post got removed. I wonder if anyone found a way to use the current version of the tool to delete / post ads reliably (without flooding the marketplace with junk and without posting duplicates).

Before this change I had a simple bash wrapper script that would delete the old ad, wait a few minutes, then re-post a similar ad with changes. I'd run that once or twice a week. My end goal is to keep permanent visibility for 5-10 distinct ads, not to flood it with hundreds of duplicates. This approach no longer works, as most of the time the "captcha" login issue pops up before the ad can be deleted or posted.

Now with intermittent login issues I debate if looping each command "until it succeeds" is a smart thing to do. Doing so means account will have 70% captcha login failures. Even if it works for a few weeks, to me it screams to be audited / blocked in the long-term.
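
As a middle ground between a single attempt and looping "until it succeeds", here is a sketch of a capped retry with growing, randomized waits. The name `retry_with_backoff` and its defaults are illustrative assumptions, not part of the tool, and there is no evidence in this thread that spacing attempts out this way keeps an account from being flagged.

```python
import random
import time


def retry_with_backoff(action, max_attempts=3, base_delay=60.0,
                       sleep=time.sleep, rng=random.random):
    """Call action() until it succeeds or max_attempts is reached.

    The wait doubles after each failure and is scaled by a random factor
    between 0.5 and 1.5 so attempts do not follow a fixed rhythm.
    Returns action()'s result; re-raises the last exception if all attempts fail.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                raise  # give up instead of hammering the login endpoint
            delay = base_delay * (2 ** (attempt - 1)) * (0.5 + rng())
            sleep(delay)
```

Here `action` would be whatever callable performs the login or repost and raises on failure.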

michaelkuzmin commented 4 years ago

@vic-c in the pre-update version I was using the "nuke" function to delete all ads, then waiting a couple of minutes, then posting new ones one by one with no changes. This never failed for me. I agree that looping until it is successful sounds very risky.