FriendsOfPHP / Goutte

Goutte, a simple PHP Web Scraper
MIT License

Fatal error: Uncaught InvalidArgumentException: The current node list is empty. #369

kayacekovic closed 4 years ago

kayacekovic commented 5 years ago

I want to log in to Instagram with PHP and get my profile name, but when I run the code I see this error. Can you help me?

<?php

require 'vendor/autoload.php';

use Goutte\Client;

$client = new Client();

$crawler = $client->request('GET', 'https://www.instagram.com/accounts/login');

$status_code = $client->getResponse()->getStatus();

if($status_code==200){
    $form = $crawler->selectButton('Log in')->form();
    $form['username'] = 'user123';
    $form['password'] = 'pass123';
    $crawler = $client->submit($form); 

    $crawler->filter('a.gmFkV')->each(function ($node ,$i) { 

        print $node->text();
        echo "<br />";
    });
}

else{
    echo "Error";
}

?>
credomane commented 4 years ago

I know this is a year old and probably irrelevant to you now, but I ran into the "The current node list is empty" error and managed to find out why, so I thought I'd try to give you an answer that might help.

After looking into it, the error turned out to come from a sanity check in symfony/dom-crawler. It checks whether the response type is XML or HTML and skips parsing if the type isn't one of those two. In my case the server I was talking to was returning HTML data as expected, but with the text/plain content type for whatever reason, which resulted in an empty node list. I had to do some "juggling" to get it working. In your case, after the line

$crawler = $client->submit($form); 

I would add the line:

$crawler->addContent($client->getInternalResponse()->getContent(), "text/html"); //or application/xml

So your little code section would look like:

$crawler = $client->submit($form);
$crawler->addContent($client->getInternalResponse()->getContent(), "text/html"); //or application/xml

$crawler->filter('a.gmFkV')->each(function ($node ,$i) { 
    print $node->text();
    echo "<br />";
});
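
To see the content-type effect described above in isolation, here is a minimal sketch that feeds the same markup to symfony/dom-crawler twice, once declared as text/plain and once as text/html. It assumes Goutte was installed through Composer as in the original post (so symfony/dom-crawler and symfony/css-selector are available). The markup string is made up for illustration and is not Instagram's real page.

```php
<?php

require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Made-up response body, for illustration only.
$body = '<html><body><a class="gmFkV" href="/user123/">user123</a></body></html>';

// Declared as text/plain: the crawler skips parsing and stays empty.
$plain = new Crawler();
$plain->addContent($body, 'text/plain');

// Same body with the type forced to text/html: the markup gets parsed.
$html = new Crawler();
$html->addContent($body, 'text/html');

echo 'text/plain nodes: ', $plain->count(), "\n";
echo 'text/html matches: ', $html->filter('a.gmFkV')->count(), "\n";
```

If the first count is zero while the second is not, the content type is the likely culprit, and forcing the type as shown in the comment above should help.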
ariclinis commented 3 years ago

@kayacekovic Hello, I guess the problem is that Instagram doesn't return the form as HTML, so we can't get the form or anything else.

$client = new Client();
$crawler = $client->request('GET', 'https://www.instagram.com/accounts/login/');
$crawler->filter('body')->each(function ($node) {
    echo $node->text()."\n";
});

This returns:

window._sharedData = {"config":{"csrf_token":"NpCb7lK53gbqFc1dd9SDEQsZSAa1T0ID","viewer":null,"viewerId":null},"country_code":"PT","language_code":"en","locale":"en_US","entry_data":{"LoginAndSignupPage":[{"captcha":{"enabled":false,"key":""

It also includes a captcha section. I tried with Facebook and didn't have a problem. If you found a solution, I'd like to know.
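
When the page contains no matching button, calling form() on the empty result is what throws "The current node list is empty". Below is a minimal sketch of a guard that checks the match count first. It assumes the same Composer setup as the original post. findFormByButton() is a hypothetical helper name, and both HTML strings and the example.com URI are made up for illustration.

```php
<?php

require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;
use Symfony\Component\DomCrawler\Form;

// Hypothetical helper: returns the form that owns the given button,
// or null when the page has no such button.
function findFormByButton(Crawler $crawler, string $label): ?Form
{
    $button = $crawler->selectButton($label);

    // form() throws InvalidArgumentException on an empty node list,
    // so check the number of matches before calling it.
    if ($button->count() === 0) {
        return null;
    }

    return $button->form();
}

// Made-up pages, for illustration only.
$scriptOnly = '<html><body><script>window._sharedData = {};</script></body></html>';
$withForm   = '<html><body><form action="/login" method="post">'
            . '<input type="text" name="username">'
            . '<button type="submit">Log in</button></form></body></html>';

$uri = 'https://example.com/login'; // placeholder URI

$missing = findFormByButton(new Crawler($scriptOnly, $uri), 'Log in');
$found   = findFormByButton(new Crawler($withForm, $uri), 'Log in');

var_dump($missing === null);
var_dump($found instanceof Form);
```

With a guard like this the script can report that the login form is missing instead of dying with a fatal error.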