Closed by jannescb 3 years ago
The name `parser` does not match the intention of the interface. A parser returns the modified value of a parameter, whereas this interface loads the HTML for a URL. The name should be something like `HtmlLoader`. Also, the class name of the implementation should give an idea of how the implementation works. A good name for the implemented loader would be `FileContentHtmlLoader`. So the interface could look like this:
interface HtmlLoader
{
    /**
     * Load the html content from the given url.
     *
     * @param string $url
     * @return string
     */
    public function load($url);
}
And the implementation:
class FileContentHtmlLoader implements HtmlLoader
{
    // ...
}
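As a rough illustration of what the elided body might contain, here is a minimal sketch. It assumes the loader simply wraps PHP's `file_get_contents()`, which the name `FileContentHtmlLoader` suggests but the thread does not confirm. The error handling with `RuntimeException` is likewise an assumption, not part of the proposal.

```php
<?php

interface HtmlLoader
{
    /**
     * Load the html content from the given url.
     *
     * @param string $url
     * @return string
     */
    public function load($url);
}

// Hypothetical body: the thread only shows "// ..." for this class.
class FileContentHtmlLoader implements HtmlLoader
{
    public function load($url)
    {
        // file_get_contents() reads local paths, and remote URLs when
        // allow_url_fopen is enabled. Warnings are suppressed so that a
        // failure surfaces as an exception instead.
        $html = @file_get_contents($url);

        if ($html === false) {
            throw new RuntimeException("Could not load HTML from [{$url}].");
        }

        return $html;
    }
}
```

Because the caller depends only on `HtmlLoader`, this class can be swapped for any other implementation without changing the calling code.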
The implementation from the PR description:
class BrowsershotHtmlLoader implements HtmlLoader
{
    // ...
}
This PR enables using a custom method for parsing the HTML of a URL. This might be useful for client-side-rendered pages.
In `config/indexer.php`, the `url_parser` may be changed. This should be a fairly simple class with a `getHtml` method. You could, for example, use the Spatie package Browsershot for parsing the URL.
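A minimal sketch of such a class follows. It assumes `getHtml` receives the URL as a string and returns the rendered HTML, and that `url_parser` takes a class name. The class name `BrowsershotUrlParser` is hypothetical, and neither assumption is confirmed by this thread. `Browsershot::url()`, `waitUntilNetworkIdle()` and `bodyHtml()` are part of the spatie/browsershot package, which must be installed separately.

```php
<?php

use Spatie\Browsershot\Browsershot;

// Hypothetical class name; only the getHtml method is named in the PR.
class BrowsershotUrlParser
{
    /**
     * Render the page in a headless browser and return the resulting HTML.
     *
     * @param string $url
     * @return string
     */
    public function getHtml($url)
    {
        // Waiting for the network to go idle gives client-side scripts
        // a chance to render before the HTML is captured.
        return Browsershot::url($url)
            ->waitUntilNetworkIdle()
            ->bodyHtml();
    }
}

// In config/indexer.php (assumed shape of the setting):
// 'url_parser' => BrowsershotUrlParser::class,
```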
With this feature we could solve this issue.