segment-boneyard / nightmare

A high-level browser automation library.
https://open.segment.com

Not possible to set the domain of a cookie, by using nightmare.cookies.set() #1568

Closed TheTechChild closed 5 years ago

TheTechChild commented 5 years ago

I am writing a web scraper that emails my boss reports from a website he checks every day. The site contains information he needs in order to make intelligent decisions about where to take our department.

Unfortunately, the site uses NTLM authentication, so I am using the npm package httpntlm to authenticate.

The final response from httpntlm includes a session ID in a cookie, which I need to set in nightmare. So I create a nightmare instance and go to Google's website to start a session. However, when I try to set the cookie, nightmare automatically sets its domain to the domain of the page it is currently on.

This is my code so far:

```js
async function runThis() {
  let url = 'https://web.xipaynet.com';

  let errorReceived = false;
  let nightmare = new Nightmare({show: true});

  /* Initialize a nightmare session */
  console.log('initializing a nightmare session');
  console.log();
  await nightmare
    .goto('https://google.com')
    .cookies.clearAll();

  /* Based on the response that we get from the vanilla request, we initiate an NTLM handshake */
  console.log('Authenticating through NTLM');
  let errorAuthenticating = false;
  let response = await new Promise((resolve, reject) => {
    httpntlm.get(
      NTLMOptions,
      (err, res) => {
        if (err) reject(err);
        else resolve(res);
      }
    );
  }).catch(e => {
    errorAuthenticating = true;
    return e;
  });
  if (errorAuthenticating) return;
  console.log(' Authentication successful');
  console.log('');

  writeToFile(response, responseFilePath);

  /* For Each of the Session relevant cookies from our ntlm handshake response, set a cookie */
  console.log('grabbing the relevant session id cookie');
  let regEx = /=/,
    cookiesToPass = [];
  for (let i = 0; i < response.cookies.length; i++) {
    let currentCookie = response.cookies[i],
      match = currentCookie.match(regEx),
      name = currentCookie.slice(0, match.index),
      value = currentCookie.slice(match.index + 1)
    ;
    /* This is where I would like to set the cookie's domain */
    await nightmare.cookies.set(name, value);
  }
  console.log();

  await nightmare.cookies.get().then(cookies => {
    console.log('cookies:');
    console.log(cookies);
  });

  await nightmare.wait(10000);
  await nightmare.goto(`${url}${response.headers.location}`);
}
```

Here is an example of the response I receive:

```json
{
  "headers": {
    "cache-control": "private",
    "content-type": "text/html; charset=utf-8",
    "location": "/Splash.aspx",
    "server": "Microsoft-IIS/8.5",
    "set-cookie": [
      "SessionTimeoutWarning=1567722450696.28; path=/",
      "LastActivityDateTime=9/5/2019%0d%0a5:14:40 PM; path=/",
      "SessionTimeoutWarning=1567722450696.28; path=/",
      "LastActivityDateTime=9/5/2019%0d%0a5:14:40 PM; path=/",
      "ASP.NET_SessionId=SomeSESSIONIDWIATHSDFK2380HUWEG; path=/; HttpOnly",
      "PMWEBGUI=LastXIID=39133; expires=Wed, 04-Dec-2019 23:14:40 GMT; path=/",
      "PMPRSTTCKT=!Zictlnw+aqb+rhp0zm3oec8PLsnWci4OR3ORCgtYyHFrnMmTTq+DKj9Ep48NQcFBlRR44MkAb2HjSSg=; expires=Thu, 05-Sep-2019 22:29:41 GMT; path=/; Httponly; Secure"
    ],
    "x-aspnet-version": "4.0.30319",
    "persistent-auth": "true",
    "x-powered-by": "ASP.NET",
    "date": "Thu, 05 Sep 2019 22:14:40 GMT",
    "connection": "close",
    "content-length": "129"
  },
  "statusCode": 302,
  "body": "body html here",
  "cookies": [
    "SessionTimeoutWarning=1567722450696.28",
    "LastActivityDateTime=9/5/2019%0d%0a5:14:40 PM",
    "SessionTimeoutWarning=1567722450696.28",
    "LastActivityDateTime=9/5/2019%0d%0a5:14:40 PM",
    "ASP.NET_SessionId=SomeSESSIONIDWIATHSDFK2380HUWEG",
    "PMWEBGUI=LastXIID=39133",
    "PMPRSTTCKT=!Zictlnw+aqb+rhp0zm3oec8PLsnWci4OR3ORCgtYyHFrnMmTTq+DKj9Ep48NQcFBlRR44MkAb2HjSSg="
  ]
}
```

I would like either to be able to set these options when nightmare starts up, or better documentation on how to set cookies more completely, so that I can set the domain and any other properties I need.

If nightmare has a way to handle NTLM authentication, some documentation on how to do that would be helpful.
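For anyone landing here with the same question, the following is a minimal sketch of turning one raw `set-cookie` header value into an object with an explicit `url`, `path`, and flags. The helper name `toCookieDetails` and its `targetUrl` parameter are hypothetical, not part of nightmare. The `expires` attribute is deliberately ignored, so the result describes a session cookie.

```javascript
// Hypothetical helper: convert one raw Set-Cookie header value into a
// details object. `targetUrl` is an assumption: the site the cookie
// should belong to, instead of the page nightmare is currently on.
function toCookieDetails(setCookie, targetUrl) {
  const [pair, ...attributes] = setCookie.split(';').map(part => part.trim());

  // Split on the first '=' only, because values may contain '=' themselves.
  const separator = pair.indexOf('=');
  if (separator === -1) {
    throw new Error(`Not a name=value pair: ${pair}`);
  }

  const details = {
    url: targetUrl,
    name: pair.slice(0, separator),
    value: pair.slice(separator + 1),
  };

  for (const attribute of attributes) {
    const eq = attribute.indexOf('=');
    const key = (eq === -1 ? attribute : attribute.slice(0, eq)).toLowerCase();
    const val = eq === -1 ? '' : attribute.slice(eq + 1);

    if (key === 'path') details.path = val;
    else if (key === 'domain') details.domain = val;
    else if (key === 'secure') details.secure = true;
    else if (key === 'httponly') details.httpOnly = true;
    // 'expires' and unknown attributes are skipped in this sketch.
  }
  return details;
}

// Hypothetical usage inside the loop from the issue (not run here):
//   for (const header of response.headers['set-cookie']) {
//     await nightmare.cookies.set(toCookieDetails(header, 'https://web.xipaynet.com'));
//   }
```

As far as I can tell from the nightmare README, `.cookies.set(cookie)` also accepts an object and only falls back to the current URL when `cookie.url` is not set, so passing objects like these may be enough to put the cookies on the target site.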

TheTechChild commented 5 years ago

So I found a workaround for my current problem. I still think it would be valuable to be able to set a cookie's properties through the cookies.set method.

That being said, I found out that Chromium automatically handles NTLM authentication based on the username and password provided in the URL string.

Also, according to the documentation here, the URL string has a very specific format with places for the username and password. Chrome tries the values in those fields first when authentication is required, and only asks for user input if they fail.

So I used the url module in Node.js to build the URL string in the correct format, like this:


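```js
const url = require('url');
const { URL } = url;

const username = 'someusername';
const password = 'somepassword';
const href = 'https://google.com';

// Embed the credentials in the URL: https://username:password@host/
const urlObj = new URL(href);
urlObj.password = password;
urlObj.username = username;

// Runs inside an async function, with `nightmare` already created
let response = await nightmare.goto(url.format(urlObj));
```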
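
One detail worth checking with this approach is what happens when the credentials contain characters such as `@`, `:` or `\` (NTLM usernames are often written as `DOMAIN\user`). The sketch below wraps the same steps in a small function. The name `withCredentials` and the sample values are hypothetical, and it assumes a Node.js version where `URL` is available as a global.

```javascript
// Build a URL of the form scheme://username:password@host/path.
// The function name and the sample values below are hypothetical.
function withCredentials(href, username, password) {
  const urlObj = new URL(href);
  // The WHATWG URL setters percent-encode characters that are not
  // allowed in the userinfo part, such as '@', ':' and '\'.
  urlObj.username = username;
  urlObj.password = password;
  return urlObj.href;
}

const authUrl = withCredentials('https://example.com/reports', 'CORP\\alice', 'p@ss:word');
console.log(authUrl);
```

Because the setters do the encoding, the raw username and password can be passed in as-is. Decoding `new URL(authUrl).password` with `decodeURIComponent` should give back the original password.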