merlinthemagic / MTS

Automation Tools for PHP
GNU Lesser General Public License v3.0

amazon gives page "please enable cookies to continue" #20

Closed ShayanArifButt closed 7 years ago

ShayanArifButt commented 7 years ago

I am trying to automate a process on amazon.com. When I try to log in to Amazon and take a screenshot, it shows the page "please enable cookies in your web browser". The image is below.

image

merlinthemagic commented 7 years ago

Show me how you set up the browser and make the request.

I.e.:

```php
$myUrl = "https://www.amazon.com/";
$windowObj = \MTS\Factories::getDevices()->getLocalHost()->getBrowser('phantomjs')->getNewWindow($myUrl);

$windowObj->mouseEventOnElement("[id=loginButton]", 'leftclick');
$windowObj->sendKeyPresses("myUsername");

$windowObj->mouseEventOnElement("[id=passwordInput]", 'leftclick');
$windowObj->sendKeyPresses("myPassword");

$screenshotData = $windowObj->screenshot();

//this is where i see the cookies notice....
echo '';
```

Make sure to include your user agent settings and whether you have changed the tool in any way. What OS are you on? Are you using the included PhantomJS binary or supplying your own?

ShayanArifButt commented 7 years ago

@merlinthemagic I cloned the git repo; my system is Windows 7 64-bit. Normal pages fetch fine, only this login gives the cookie page. I am using the included PhantomJS binary.

```php
ini_set('max_execution_time', 120);
require 'MTS/MTS/EnableMTS.php';

$myUrl = "https://www.amazon.com/gp/css/account/address/view.html?ie=UTF8&ref_=myab_view_new_address_form&viewID=newAddress&/";

$windowObj = \MTS\Factories::getDevices()->getLocalHost()->getBrowser('phantomjs')->getNewWindow($myUrl);

//left click on the email input box (it has id=ap_email):
$windowObj->mouseEventOnElement("[id=ap_email]", 'leftclick');

//type the email address:
$windowObj->sendKeyPresses("my_email");

//left click on the password input box (it has id=ap_password):
$windowObj->mouseEventOnElement("[id=ap_password]", 'leftclick');

//type the password:
$windowObj->sendKeyPresses("my_password");

//left click on the sign-in button (it has id=signInSubmit):
$windowObj->mouseEventOnElement("[id=signInSubmit]", 'leftclick');

//perform a screenshot:
$screenshotData = $windowObj->screenshot();

//here the cookie page is shown:
echo '<img src="data:image/png;base64,' . base64_encode($screenshotData) . '" />';
```

ShayanArifButt commented 7 years ago

@merlinthemagic

I have now set the user agent as well:

```php
$agentName = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0";
$windowObj->setUserAgent($agentName);
```

but Amazon still gives the same message.

merlinthemagic commented 7 years ago

Cookies are enabled by default. Here are two things you might try.

First, changing the user agent must be done BEFORE setting the URL, or it won't take effect.

Make sure you set it up like this:

```php
//get a window
$windowObj = \MTS\Factories::getDevices()->getLocalHost()->getBrowser('phantomjs')->getNewWindow();

//set the agent
$agentName = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0";
$windowObj->setUserAgent($agentName);

//now set the url
$myUrl = "https://www.amazon.com/gp/css/account/address/view.html?ie=UTF8&ref_=myab_view_new_address_form&viewID=newAddress&/";
$windowObj->setUrl($myUrl);

//form logic
```

Second, try dumping the return of:

$cookies = $windowObj->getCookies();

Do you see any cookies from Amazon?
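For readers following along, here is a minimal sketch of how the dumped cookies could be narrowed to Amazon's. It assumes `getCookies()` returns an array of cookie records that each carry a `domain` key; the thread does not show the actual return structure, so dump the raw value first and adjust the key names. The helper `cookiesForDomain` is hypothetical and not part of MTS.

```php
<?php
/**
 * Hypothetical helper (not part of MTS): keep only the cookies whose
 * domain equals, or is a subdomain of, the given suffix.
 * ASSUMPTION: each cookie is an array with a 'domain' key.
 */
function cookiesForDomain(array $cookies, string $suffix): array
{
    $suffix = ltrim(strtolower($suffix), '.');

    return array_values(array_filter($cookies, function ($cookie) use ($suffix) {
        if (!is_array($cookie) || !isset($cookie['domain'])) {
            return false;
        }
        // Cookie domains are often stored with a leading dot (".amazon.com").
        $domain = ltrim(strtolower((string) $cookie['domain']), '.');

        return $domain === $suffix
            || substr($domain, -strlen('.' . $suffix)) === '.' . $suffix;
    }));
}

// Stand-in data; in the real script this would be $windowObj->getCookies().
$cookies = [
    ['name' => 'session-id', 'domain' => '.amazon.com', 'value' => 'abc'],
    ['name' => 'other',      'domain' => 'example.org', 'value' => 'xyz'],
];

var_dump(cookiesForDomain($cookies, 'amazon.com'));
```

An empty result here would suggest the cookies never reached the browser, which would point at the request setup rather than at Amazon's page.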

ShayanArifButt commented 7 years ago

@merlinthemagic

Thanks, I followed your instructions and it worked: the cookie page is no longer shown. But now I need a little more help. It looks like Amazon detected that I am not a normal user, because it now shows this page (asking the user to enter the code it sent to the user's email):

image

Is there any way I can display the current page to the user, so that when the user enters the code manually and submits it, the script continues from where it left off? In simple terms, is there a way to save the current state of the page and continue once the user has submitted the form?

merlinthemagic commented 7 years ago

Yes. However, it is a feature that is rarely used and I have no tests for it (so it may have been broken along the way).

You are trying to maintain the state of the browser from session to session. I provided an answer on Stack Exchange for a similar question regarding CAPTCHAs: link to question

You will need to build custom logic to handle destruction of sessions where the user never returns.

May I also suggest that you test whether the validation email is also triggered from another server IP or Amazon account. Maybe this is an edge case that only presented itself because Amazon realized you tried to use PhantomJS to log in.

I also cannot help but implore you to only use this tool for legitimate purposes. The page you link is adding a new address to an account. That makes me rather concerned.

ShayanArifButt commented 7 years ago

@merlinthemagic Thank you for the reply. I just wanted to say that your browser automation library for PHP is awesome: it worked out of the box, while many others I tried did not. I do not know why your git repo did not show up at the top of the Google results when I was searching for PHP browser automation/emulation; it should be at the top because it is very good and easy to use. As for your concerns, that is my own email account, and I am trying to automate a process, so I wanted to test submitting forms and so on. I am not attempting anything malicious.

ShayanArifButt commented 7 years ago

@merlinthemagic I tried with another email as you suggested, but the verification page is still showing. However, when I use a normal browser like Opera or Chrome, that page does not show and Amazon logs me in without verification. Any idea how Amazon knows I am accessing their page through a headless browser/script? I have also tried to log in to Amazon using a custom PhantomJS script file, and it logged me in fine without asking for verification. Any idea why it shows me the verification page when I use this PHP library?

merlinthemagic commented 7 years ago

Is your PHP dev server hosted or at your house?

It's likely Amazon does a reverse lookup of the source IP, and if it originates from a major hosting provider, they add additional validation.

It's not magic, there is a piece of information being transmitted that allows them to filter you.

MM
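To illustrate the kind of check being described (this is a sketch of the general idea, not what Amazon actually does): a server can resolve the client IP to a hostname with PHP's `gethostbyaddr()` and compare it against suffixes used by hosting providers. The suffix list and both function names below are hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical, deliberately tiny list of reverse-DNS suffixes used by
// hosting providers. A real filter would need a far larger data set.
const HOSTING_SUFFIXES = ['amazonaws.com', 'googleusercontent.com'];

/** True when the hostname equals, or is a subdomain of, a listed suffix. */
function looksLikeHostingProvider(string $hostname, array $suffixes = HOSTING_SUFFIXES): bool
{
    $hostname = rtrim(strtolower($hostname), '.');
    foreach ($suffixes as $suffix) {
        if ($hostname === $suffix
            || substr($hostname, -strlen($suffix) - 1) === '.' . $suffix) {
            return true;
        }
    }
    return false;
}

/** Reverse-resolve an IP and classify it as 'hosting', 'other' or 'unknown'. */
function classifySourceIp(string $ip): string
{
    // gethostbyaddr() returns the unmodified IP when no PTR record exists,
    // and false on malformed input.
    $host = gethostbyaddr($ip);
    if ($host === false || $host === $ip) {
        return 'unknown';
    }
    return looksLikeHostingProvider($host) ? 'hosting' : 'other';
}

var_dump(looksLikeHostingProvider('ec2-203-0-113-5.compute-1.amazonaws.com')); // bool(true)
var_dump(looksLikeHostingProvider('c-203-0-113-5.hsd1.ca.comcast.net'));       // bool(false)
```

`classifySourceIp()` needs a working resolver, so the example output only exercises the pure string check.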

ShayanArifButt commented 7 years ago

@merlinthemagic

I am testing this on my home laptop, on XAMPP localhost. But if they were filtering on my IP, the verification page should also appear in a normal browser like Opera or Chrome, and there is no verification page when I log in with a normal browser.

The page only shows when I try to log in using the PHP script. Even if the previous attempt with the script showed the verification page, the next attempt with a normal browser using the same email shows no verification page.

Another question (should I post it separately?): how do we set custom request headers, and how do we add a proxy for a request? For example, with cURL in PHP I can store custom headers in an array:

```php
$header = array(
    'Remote Address:216.58.205.102:443',
    'Referrer Policy:no-referrer-when-downgrade',
    'User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 OPR/46.0.2597.57'
);
```

and then use:

```php
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
```

For a proxy I can use:

```php
$proxy = '66.70.191.215:1080';
curl_setopt($ch, CURLOPT_PROXY, $proxy);
```

So how would I do this in this browser automation library?

ShayanArifButt commented 7 years ago

@merlinthemagic

I solved the verification issue today by using the "keep alive" parameter that you described in your Stack Overflow answer. I am not talking about saving the state of the browser to submit the verification code manually; simply by using the keep-alive parameter, Amazon did not show the verification page. Here is the code that I used:

```php
//open the login page:
$myUrl = "https://www.amazon.com/gp/css/account/address/view.html?ie=UTF8&ref_=myab_view_new_address_form&viewID=newAddress";
$browserObj = \MTS\Factories::getDevices()->getLocalHost()->getBrowser('phantomjs');

//allow the page to live after php shuts down.
$browserObj->setKeepalive(true);
$windowObj = $browserObj->getNewWindow();

$agentName = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 OPR/46.0.2597.57";
$windowObj->setUserAgent($agentName);

$windowObj->setUrl($myUrl);
```

merlinthemagic commented 7 years ago

Rather than the complicated "keep-alive" logic, maybe try to figure out how they know you are spoofing a real browser. They are likely correlating the user agent with the type of layout engine and its version. PhantomJS is WebKit-based, while outside of iOS, Mozilla uses Gecko.

I.e. a check like:

```js
$.browser.chrome = $.browser.webkit && !!window.chrome;
$.browser.safari = $.browser.webkit && !window.chrome;

if ($.browser.chrome) alert("You are using Chrome!");
if ($.browser.safari) alert("You are using Safari!");
```

Just a suggestion.
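As a rough illustration of that correlation idea, the sketch below reads the engine family a user agent string claims and compares it with the engine actually in use. Both functions are hypothetical (not part of MTS), the token matching is deliberately simplistic, and a real site would learn the actual engine from client-side probes rather than from a parameter. It shows why the Firefox user agent used earlier in this thread is inconsistent with PhantomJS, while the Opera/Chrome one is not.

```php
<?php
/**
 * Hypothetical helper: return the engine family a user agent string claims.
 * Chrome and Opera still advertise "AppleWebKit/" in their user agents,
 * so they are grouped under 'webkit' here.
 */
function claimedEngine(string $userAgent): string
{
    // Order matters: WebKit user agents contain "(KHTML, like Gecko)",
    // so the AppleWebKit token has to be tested before Gecko.
    if (stripos($userAgent, 'AppleWebKit/') !== false) {
        return 'webkit';
    }
    if (preg_match('~Gecko/\d+~', $userAgent) === 1
        && stripos($userAgent, 'Firefox/') !== false) {
        return 'gecko';
    }
    return 'unknown';
}

/** True when the claimed engine family matches the engine really in use. */
function userAgentMatchesEngine(string $userAgent, string $actualEngine): bool
{
    return claimedEngine($userAgent) === strtolower($actualEngine);
}

// The two user agents that appear in this thread.
$firefoxUa = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0";
$operaUa   = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 OPR/46.0.2597.57";

// PhantomJS is WebKit-based:
var_dump(userAgentMatchesEngine($firefoxUa, 'webkit')); // bool(false)
var_dump(userAgentMatchesEngine($operaUa, 'webkit'));   // bool(true)
```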