FriendsOfPHP / Goutte

Goutte, a simple PHP Web Scraper
MIT License

Cannot extract data from iframe #270

Open lincolnaleixo opened 8 years ago

lincolnaleixo commented 8 years ago

Hey,

I want to extract information from a site that has content inside an iframe. I tried copying the element's XPath, but the query didn't return anything. Is there any way I can get this content?

Regards

thebennos commented 8 years ago

Get the URL of the iframe (its src attribute) and load it directly.
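
A minimal sketch of that suggestion, using only PHP's built-in DOM extension. The helper name iframeSources() and the sample markup are made up for illustration and are not part of Goutte:

```php
<?php
// Hypothetical helper: return the non-empty src attribute of every iframe.
function iframeSources(string $html): array
{
    // Collect libxml parse warnings internally instead of printing them;
    // real-world HTML is rarely valid.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $sources = [];
    foreach ($doc->getElementsByTagName('iframe') as $iframe) {
        $src = trim($iframe->getAttribute('src'));
        if ($src !== '') {
            $sources[] = $src;
        }
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $sources;
}

// Sample page: one iframe with a src, one without.
$html = '<html><body>'
      . '<iframe src="https://example.com/frame.html"></iframe>'
      . '<iframe></iframe>'
      . '</body></html>';

$sources = iframeSources($html);
print_r($sources);
```

Each URL found this way can then be fetched with a second request (with Goutte, something like `$client->request('GET', $src)`) and queried like any other page. A relative src would have to be resolved against the page URL first, which this sketch does not do.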

lincolnaleixo commented 8 years ago

There's no URL on the iframe.

serdarozturk commented 8 years ago

If the iframe has a srcdoc attribute, you can get its content with this code.

$doc = new DOMDocument();
$doc->loadHTML($yoursiteHTML);
// srcdoc holds the iframe's HTML inline, as an attribute value
foreach ($doc->getElementsByTagName('iframe') as $iframe) {
    echo $iframe->getAttribute('srcdoc');
}
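
The srcdoc value is itself an HTML document, so an XPath copied from the browser has to be run against that inner document and not against the outer page. A rough sketch of that second step follows; queryIframeSrcdoc() and the sample markup are hypothetical names used only for illustration:

```php
<?php
// Hypothetical helper: run an XPath expression inside every iframe's srcdoc
// and return the trimmed text of each matching node.
function queryIframeSrcdoc(string $html, string $xpathExpr): array
{
    $previous = libxml_use_internal_errors(true);

    $outer = new DOMDocument();
    $outer->loadHTML($html);

    $results = [];
    foreach ($outer->getElementsByTagName('iframe') as $iframe) {
        // getAttribute() returns the value with HTML entities already decoded.
        $srcdoc = $iframe->getAttribute('srcdoc');
        if (trim($srcdoc) === '') {
            continue; // no inline document on this iframe
        }

        // Parse the embedded markup as its own document.
        $inner = new DOMDocument();
        $inner->loadHTML($srcdoc);

        $nodes = (new DOMXPath($inner))->query($xpathExpr);
        if ($nodes === false) {
            continue; // invalid XPath expression
        }
        foreach ($nodes as $node) {
            $results[] = trim($node->textContent);
        }
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $results;
}

// Sample page whose iframe carries its content in srcdoc.
$html = '<html><body>'
      . '<iframe srcdoc="&lt;p class=&quot;price&quot;&gt;42 USD&lt;/p&gt;"></iframe>'
      . '</body></html>';

$prices = queryIframeSrcdoc($html, '//p[@class="price"]');
print_r($prices);
```

The XPath here is relative to the embedded document, so a path copied from the browser's inspector should start inside the iframe, not at the outer page's root.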

If you want the text placed between the opening and closing iframe tags, you can get it with this code.

$doc = new DOMDocument();
$doc->loadHTML($yoursiteHTML);
// nodeValue is the text found between <iframe> and </iframe> in the page source
foreach ($doc->getElementsByTagName('iframe') as $iframe) {
    echo $iframe->nodeValue;
}

But if the iframe content is loaded from an external source, neither approach works, because that content is not part of the page's HTML.