spekulatius / PHPScraper

A universal web-util for PHP.
https://phpscraper.de
GNU General Public License v3.0
515 stars, 73 forks

[Proposal] Add scraping API support #122

Closed nathabonfim59 closed 1 year ago

nathabonfim59 commented 1 year ago

Motivation

I saw some references to an API service in the documentation. I don't know if you plan to make the implementation open-source as well, but I really liked the idea of abstracting the extraction process.

Combined with the proxy feature, it provides an incredibly powerful tool for gathering data.

Proposal

The idea is to implement a setApi method in the phpscraper class that supports various APIs. I went with the "namespace builder" approach, to load the API code as needed.

```php
public $api = null;
...
public function setApi($api)
{
    // Build the fully qualified class name, e.g. spekulatius\apis\example_api.
    $apiClass = __NAMESPACE__ . '\apis\\' . $api;

    $this->api = new $apiClass($this->core);

    return $this;
}
```

Then we just need to create a file inside the new apis folder with the corresponding implementation, declared in the matching namespace.

src/apis/example_api.php


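```php
namespace spekulatius\apis;

// Assumption: the core class lives in the parent spekulatius namespace.
// Without this import, `core` would resolve to spekulatius\apis\core.
use spekulatius\core;

class example_api
{
    protected $core = null;

    public function __construct(core &$core)
    {
        $this->core = $core;
    }

    ...
}
```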
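To make the loading mechanism concrete, here is a minimal, self-contained sketch of the "namespace builder" idea. Nothing in it is PHPScraper's actual code: the `demo` namespace, the `Core` and `Scraper` stand-in classes, the `name()` method, and the `class_exists` guard are all assumptions added for illustration. The guard is not part of the proposal; it is one possible way to fail early when an unknown API name is passed.

```php
<?php

namespace demo {
    // Hypothetical stand-in for the scraper's core object.
    class Core
    {
        public $url = null;
    }

    // Hypothetical stand-in for the phpscraper class.
    class Scraper
    {
        public $api = null;
        protected $core;

        public function __construct()
        {
            $this->core = new Core();
        }

        public function setApi(string $api): self
        {
            // Build the fully qualified class name, e.g. demo\apis\example_api.
            $apiClass = __NAMESPACE__ . '\\apis\\' . $api;

            // Assumed guard: reject malformed names and classes that do not exist.
            if (!preg_match('/^\w+\z/', $api) || !class_exists($apiClass)) {
                throw new \InvalidArgumentException("Unknown API: {$api}");
            }

            $this->api = new $apiClass($this->core);

            return $this;
        }
    }
}

namespace demo\apis {
    use demo\Core;

    // Hypothetical API implementation living in the apis sub-namespace.
    class example_api
    {
        protected $core;

        public function __construct(Core $core)
        {
            $this->core = $core;
        }

        public function name(): string
        {
            return 'example_api';
        }
    }
}

namespace {
    $web = (new \demo\Scraper())->setApi('example_api');
    echo $web->api->name(), PHP_EOL; // prints "example_api"
}
```

Because the class name is assembled from a string, checking it before instantiation keeps a typo or untrusted input from turning into a fatal "class not found" error.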


Implementation example

I've implemented an API to scrape products from Mercado Libre ([the biggest](https://www.webretailer.com/b/online-marketplaces-latin-america/) online marketplace in Latin America).

```php
$web = new phpscraper;
$web->setApi('mercado_libre');

$web->go('https://mercadolivre.com.br/product_url');
$productData = $web->api->getProduct();
```

More details about the Mercado Libre API here.


What do you think?

spekulatius commented 1 year ago

Hey @nathabonfim59,

I'm already working on the feature using a slightly different approach :+1: I'll keep you updated on how it goes.

Cheers, Peter