Open vedmant opened 2 years ago
@vedmant Great suggestion. If you'd like to take this on, feel free to open a PR. We could add one more function, something like OpenGraph::fetchFromHtml($html), that accepts HTML and returns the OG data. This should be easy.
public function fetchFromHtml($html, $allMeta = null, $lang = null, $options = LIBXML_NOWARNING | LIBXML_NOERROR, $userAgent = 'Curl')
{
    // Note: $lang and $userAgent are not used when parsing an HTML string.

    // Parsing starts here.
    $doc = new DOMDocument();
    $libxml_previous_state = libxml_use_internal_errors(true);
    $doc->loadHTML('<?xml encoding="utf-8" ?>'.$html, $options);

    // Log parse errors (possible with empty or malformed HTML) only when
    // neither suppression flag is set in $options.
    if ($options > 0 && ($options & (LIBXML_NOWARNING | LIBXML_NOERROR)) == 0) {
        Log::warning(libxml_get_errors());
    }
    libxml_clear_errors();

    // Restore the previous libxml error-handling state.
    libxml_use_internal_errors($libxml_previous_state);

    $tags = $doc->getElementsByTagName('meta');
    $metadata = [];
    foreach ($tags as $tag) {
        // Reset per tag so a non-matching tag cannot reuse the previous tag's key and value.
        $key = null;
        $value = null;

        $metaproperty = ($tag->hasAttribute('property')) ? $tag->getAttribute('property') : $tag->getAttribute('name');

        // Default mode: keep only Open Graph tags, stripping the "og:" prefix.
        if (!$allMeta && $metaproperty && strpos($tag->getAttribute('property'), 'og:') === 0) {
            $key = strtr(substr($metaproperty, 3), '-', '_');
            $value = $this->get_meta_value($tag);
        }

        // $allMeta mode: keep every named meta tag; only "og:" keys are normalised.
        if ($allMeta && $metaproperty) {
            $key = (strpos($metaproperty, 'og:') === 0) ? strtr(substr($metaproperty, 3), '-', '_') : $metaproperty;
            $value = $this->get_meta_value($tag);
        }

        if (!empty($key)) {
            $metadata[$key] = $value;
        }
    }

    // Verify the image URL once, after all tags have been collected,
    // instead of re-checking it on every loop iteration.
    if (isset($metadata['image'])) {
        $isValidImageUrl = $this->verify_image_url($metadata['image']);
        if (!$isValidImageUrl) {
            $metadata['image'] = '';
        }
    }

    return $metadata;
}
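For anyone who wants to try the idea outside the package, here is a minimal standalone sketch of the same parsing step. It makes several assumptions: the function name extractOpenGraph is hypothetical, it reads only the content attribute rather than calling the package's get_meta_value helper, and it skips image verification and logging entirely.

```php
<?php
// Hypothetical standalone helper; not part of the package's API.
function extractOpenGraph(string $html): array
{
    if (trim($html) === '') {
        return []; // nothing to parse; also avoids loadHTML() on an empty string
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // collect parse errors quietly
    $doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting

    $metadata = [];
    foreach ($doc->getElementsByTagName('meta') as $tag) {
        $property = $tag->getAttribute('property'); // '' when the attribute is absent
        if (strpos($property, 'og:') !== 0) {
            continue; // not an Open Graph tag
        }
        // Strip the "og:" prefix and turn dashes into underscores.
        $key = strtr(substr($property, 3), '-', '_');
        $metadata[$key] = $tag->getAttribute('content');
    }

    return $metadata;
}

// Usage: the HTML is already in hand, so no HTTP request is made here.
$html = '<html><head><title>Demo</title>'
    . '<meta property="og:title" content="Hello">'
    . '<meta property="og:site_name" content="Example">'
    . '<meta name="description" content="ignored">'
    . '</head><body></body></html>';

$og = extractOpenGraph($html);
```

With this sample input, $og should contain only the title and site_name keys, since the description tag has no og: property.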
It can reuse the same helper functions that the fetch method already uses.
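One way the reuse could look, as a rough sketch only: the URL-based method downloads the page and then delegates to the HTML-based one, so the parsing logic lives in a single place. The class name OpenGraphSketch and the injected downloader callable are assumptions made for illustration, and the XPath query is a swapped-in alternative to the getElementsByTagName loop above, not the package's actual code.

```php
<?php
// Hypothetical sketch: class and method layout are illustrative, not the package's code.
final class OpenGraphSketch
{
    /** @var callable(string): string Returns the HTML for a URL. */
    private $downloader;

    public function __construct(callable $downloader)
    {
        $this->downloader = $downloader;
    }

    // URL entry point: download, then hand the markup to the shared parser.
    public function fetch(string $url): array
    {
        return $this->fetchFromHtml(($this->downloader)($url));
    }

    // HTML entry point: callers that already hold the markup start here.
    public function fetchFromHtml(string $html): array
    {
        if (trim($html) === '') {
            return [];
        }

        $doc = new DOMDocument();
        $previous = libxml_use_internal_errors(true);
        $doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $metadata = [];
        $xpath = new DOMXPath($doc);
        // Select only <meta> elements whose property starts with "og:".
        foreach ($xpath->query('//meta[starts-with(@property, "og:")]') as $tag) {
            $metadata[substr($tag->getAttribute('property'), 3)] = $tag->getAttribute('content');
        }

        return $metadata;
    }
}

// A stub downloader keeps the example offline.
$og = new OpenGraphSketch(function (string $url): string {
    return '<html><head><meta property="og:url" content="' . $url . '"></head></html>';
});

$fromUrl = $og->fetch('https://example.com/');
$fromHtml = $og->fetchFromHtml('<meta property="og:title" content="Hi">');
```

Injecting the downloader is only a convenience for testing without network access; the real fetch method could keep its existing HTTP code and simply pass the response body along.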
In my case, I fetch the page HTML for further use (searching for other data) but also need the Open Graph data. I could use this package if it allowed passing an HTML string instead of a URL.