litofunes opened 7 years ago
From: http://php.net/manual/en/function.file-get-contents.php
"The offset where the reading starts on the original stream. Negative offsets count from the end of the stream.
Seeking (offset) is not supported with remote files. Attempting to seek on non-local files may work with small offsets, but this is unpredictable because it works on the buffered stream."
`HtmlDomParser::file_get_html` uses a default offset of -1; passing in 0 should fix your problem.
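A minimal sketch of passing the offset explicitly, assuming the underlying simple_html_dom `file_get_html()` signature, where `$offset` is the fourth parameter after `$use_include_path` and `$context`:

```php
<?php
require_once './libs/php/simple_html_dom.php';

// Pass an explicit offset of 0 instead of the default -1,
// since seeking is not supported on remote streams.
$html = file_get_html('https://www.google.com/', false, null, 0);

foreach ($html->find('a') as $element) {
    echo $element->href . "\n";
}
```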
But let's say I want to grab a page using this:
$file_name = file_get_contents("https://google.com");
$dom = HtmlDomParser::file_get_html($file_name);
I will get this, which is not really HTML. How can I fetch a page as HTML? How can I fix this? :)
file_get_contents(<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="nl"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta
Hello there! May I suggest a better way to achieve that: using cURL. I've had compatibility issues and such too, but most of them got solved when I switched to cURL for pulling pages. Here is some code:
include('./libs/php/simple_html_dom.php'); // To use str_get_html

function request($url) {
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // Without this, curl_exec() prints the page and returns true
    // You can add some other options too (e.g. timeout, method, etc.)
    $str = curl_exec($curl); // Retrieving the page as a string
    $html = str_get_html($str); // Translating the string to an object
    curl_close($curl); // Make sure to end your session
    return $html;
}

// Save the resulting DOM object to a variable we will later use
$dom = request("https://www.google.com");
foreach ($dom->find("a") as $element)
    echo $element->href;
hi @XTard,
First of all, thanks for the reply 👍 :). I am using the HTML DOM parser (https://simplehtmldom.sourceforge.io) in the first place, but I was wondering why the Laravel wrapper doesn't work as expected. I already have a local script using simple_html_dom, but it would be nice if it worked in Laravel too.
@dseegers I'm still not sure that I'm getting it right, but let me try one more time. You are using this piece of code, right?
$file_name = file_get_contents("https://google.com");
$dom = HtmlDomParser::file_get_html($file_name);
(1) The problem here is that `file_get_contents` returns the file as a string, and you need to convert that string into an HTML object with `str_get_html` (like my cURL example does), but what you are doing is calling `file_get_html` on the string. (2) Either pull the page with `$dom = HtmlDomParser::file_get_html("https://www.google.com/");` or use `str_get_html` instead.
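A short sketch of both fixes, assuming the Laravel wrapper exposes `str_get_html` as a static method alongside `file_get_html` (as the common simple_html_dom wrappers do):

```php
<?php
// Option 1: let the parser fetch the URL itself
$dom = HtmlDomParser::file_get_html('https://www.google.com/');

// Option 2: fetch the page yourself, then parse the resulting string
$str = file_get_contents('https://www.google.com/');
$dom = HtmlDomParser::str_get_html($str);

foreach ($dom->find('a') as $element) {
    echo $element->href . "\n";
}
```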
This function is similar to file(), except that file_get_contents() returns the file in a string, starting at the specified offset up to maxlen bytes. On failure, file_get_contents() will return FALSE.
file_get_contents() is the preferred way to read the contents of a file into a string. It will use memory mapping techniques if supported by your OS to enhance performance.
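For example, the offset and maxlen parameters behave predictably on a local, seekable file:

```php
<?php
// Write a small local file so offset/maxlen behave predictably
file_put_contents('example.txt', 'Hello, world!');

// Read 5 bytes starting at byte offset 7 of a local (seekable) file
$part = file_get_contents('example.txt', false, null, 7, 5);
echo $part; // prints "world"
```

On a remote stream, the same `$offset` argument triggers a seek the wrapper cannot perform, which is where the "stream does not support seeking" error comes from.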
// Create a DOM object from a string
$html = str_get_html('<html><body>Hello!</body></html>');
// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');
// Create a DOM object from a HTML file
$html = file_get_html('test.htm');
(1/1) ErrorException file_get_contents(): stream does not support seeking
$html = HtmlDomParser::file_get_html('http://www.google.com/');
foreach($html->find('a') as $element) echo $element->href . '<br>';