j0k3r / graby

Graby helps you extract article content from web pages
MIT License

Cannot install with composer #320

Closed: robertandrews closed 1 year ago

robertandrews commented 1 year ago

README.md says to use composer require j0k3r/graby php-http/guzzle7-adapter, but this results in...

robert@Roberts-iMac content-grabber % composer require j0k3r/graby php-http/guzzle7-adapter
Info from https://repo.packagist.org: #StandWithUkraine
Cannot use j0k3r/graby's latest version 2.4.4 as it requires ext-tidy * which is missing from your platform.
./composer.json has been created
Running composer update j0k3r/graby php-http/guzzle7-adapter
Loading composer repositories with package information
Updating dependencies
Your requirements could not be resolved to an installable set of packages.

  Problem 1
    - Conclusion: don't install php-http/guzzle7-adapter 1.0.0 (conflict analysis result)
    - Conclusion: don't install j0k3r/graby 1.12.0 (conflict analysis result)
    - Conclusion: don't install j0k3r/graby 1.15.2 (conflict analysis result)
    - Conclusion: don't install j0k3r/graby 1.20.1 (conflict analysis result)
    - j0k3r/graby 1.0.0 requires fin1te/safecurl dev-master -> found fin1te/safecurl[dev-master] but it does not match your minimum-stability.
    - j0k3r/graby[2.0.0, ..., 2.4.4] require ext-tidy * -> it is missing from your system. Install or enable PHP's tidy extension.
    - Root composer.json requires php-http/guzzle7-adapter * -> satisfiable by php-http/guzzle7-adapter[0.1.0, 0.1.1, 1.0.0].
    - Conclusion: don't install one of guzzlehttp/guzzle[5.3.1], php-http/guzzle7-adapter[0.1.0] | install guzzlehttp/guzzle[7.5.0] (conflict analysis result)
    - Conclusion: don't install guzzlehttp/guzzle 7.5.0 (conflict analysis result)
    - Conclusion: don't install one of guzzlehttp/guzzle[5.3.4], php-http/guzzle7-adapter[0.1.0] (conflict analysis result)
    - j0k3r/graby[1.0.1, ..., 1.10.1] require guzzlehttp/guzzle ^5.2.0 -> satisfiable by guzzlehttp/guzzle[5.2.0, ..., 5.3.4].
    - Conclusion: don't install guzzlehttp/guzzle 5.2.0 (conflict analysis result)
    - Root composer.json requires j0k3r/graby * -> satisfiable by j0k3r/graby[1.0.0, ..., 1.20.1, 2.0.0, ..., 2.4.4].

To enable extensions, verify that they are enabled in your .ini files:
    - /Applications/MAMP/bin/php/php7.4.33/conf/php.ini
You can also run `php --ini` in a terminal to see which files are used by PHP in CLI mode.
Alternatively, you can run Composer with `--ignore-platform-req=ext-tidy` to temporarily ignore these required extensions.
You can also try re-running composer require with an explicit version constraint, e.g. "composer require j0k3r/graby:*" to figure out if any version is installable, or "composer require j0k3r/graby:^2.1" if you know which you need.

Installation failed, deleting ./composer.json.

NB. I am running locally on MAMP Pro, with PHP 7.4.33.

which php appears to confirm that MAMP's PHP is the one being used: /Applications/MAMP/bin/php/php7.4.33/bin/php

php --ini shows:

Configuration File (php.ini) Path: /Applications/MAMP/bin/php/php7.4.33/conf
Loaded Configuration File:         /Applications/MAMP/bin/php/php7.4.33/conf/php.ini

tidy is enabled - libTidy 5.6.0

cURL is enabled.

jtojnar commented 1 year ago

How did you verify that tidy is enabled?

Can you try running

php -i | grep -i tidy

Please also try running composer diagnose to check if it uses the same PHP.
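The second check can be scripted. This is only a sketch: it assumes composer diagnose prints a line starting with "PHP binary path:" (as the Composer 2.5.5 output later in this thread does), and the function name composer_php_path is made up for illustration.

```shell
# Hypothetical helper: read `composer diagnose` output on stdin and print
# the path of the PHP binary that Composer reports it is running under.
composer_php_path() {
    sed -n 's/^PHP binary path:[[:space:]]*//p'
}
```

Typical use would be to run composer diagnose | composer_php_path and compare the result with the output of command -v php; if the two paths differ, Composer and your shell are using different PHP installations.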

robertandrews commented 1 year ago

Tidy is checked in MAMP's app interface, and the corresponding php7.4.33.ini shows...

[Tidy]
; The path to a default tidy configuration file to use when using tidy
; http://php.net/tidy.default-config
;tidy.default_config = /usr/local/lib/php/default.tcfg

; Should tidy clean and repair output automatically?
; WARNING: Do not use this option if you are generating non-html content
; such as dynamic images
; http://php.net/tidy.clean-output
tidy.clean_output = Off

... and also ... MAMP_Tidy_MAMPextension=tidy.so

phpinfo shows:

[Screenshot 2023-04-11 at 19 18 05]

php -i | grep -i tidy results in...

Configure Command =>  './configure'  '--with-apxs2=/Applications/MAMP/Library/bin/apxs' '--with-zlib' '--with-zlib-dir=/Applications/MAMP/Library' '--prefix=/Applications/MAMP/bin/php/php7.4.33' '--exec-prefix=/Applications/MAMP/bin/php/php7.4.33' '--sysconfdir=/Applications/MAMP/bin/php/php7.4.33/conf' '--with-config-file-path=/Applications/MAMP/bin/php/php7.4.33/conf' '--enable-ftp' '--with-bz2=/Applications/MAMP/Library' '--with-mysqli=mysqlnd' '--enable-mbstring=all' '--with-curl=/Applications/MAMP/Library' '--enable-sockets' '--enable-bcmath' '--enable-soap' '--enable-calendar' '--with-pgsql=shared,/Applications/MAMP/Library/pg' '--enable-exif' '--with-gettext=shared,/Applications/MAMP/Library' '--with-xsl=/Applications/MAMP/Library' '--with-pdo-mysql=mysqlnd' '--with-pdo-pgsql=shared,/Applications/MAMP/Library/pg' '--with-openssl=/Applications/MAMP/Library' '--with-iconv=/Applications/MAMP/Library' '--enable-opcache' '--enable-intl' '--with-tidy=shared,/Applications/MAMP/Library' '--with-readline' '--with-mhash' '--with-iconv-dir=/Applications/MAMP/Library' '--with-sodium=/Applications/MAMP/Library' '--with-password-argon2=/Applications/MAMP/Library' '--with-zip' '--with-xmlrpc' '--with-kerberos' '--with-pdo-sqlite' '--with-sqlite3' '--with-ldap=/Applications/MAMP/Library' '--with-ldap-sasl' '--with-imap=shared,/Applications/MAMP/Library/lib/imap-2007f/lib' '--with-imap-ssl=/Applications/MAMP/Library' '--disable-phpdbg' '--enable-cgi' '--enable-gd' '--with-webp' '--with-jpeg' '--with-freetype' '--enable-pcntl' 'KERBEROS_CFLAGS=-I/usr/include' 'KERBEROS_LIBS=-lkrb5' 'SQLITE_CFLAGS= ' 'SQLITE_LIBS=-lsqlite3' 'JPEG_CFLAGS= ' 'JPEG_LIBS=-ljpeg' 'SASL_CFLAGS=-I/usr/include/sasl' 'SASL_LIBS=-lsasl2'

composer diagnose results in...


Checking platform settings: OK
Checking git settings: WARNING
Your git version (2.15.0) is too old and possibly will cause issues. Please upgrade to git 2.24 or above
Checking http connectivity to packagist: OK
Checking https connectivity to packagist: OK
Checking github.com rate limit: OK
Checking disk free space: OK
Checking pubkeys: 
Tags Public Key Fingerprint: [removed]
Dev Public Key Fingerprint: [removed]
OK
Checking composer version: OK
Composer version: 2.5.5
PHP version: 7.4.33
PHP binary path: /Applications/MAMP/bin/php/php7.4.33/bin/php
OpenSSL version: OpenSSL 1.0.2u  20 Dec 2019
cURL version: 7.68.0 libz 1.2.11 ssl OpenSSL/1.0.2u
zip: extension present, unzip present, 7-Zip not available

jtojnar commented 1 year ago

> php -i | grep -i tidy results in...

Right, that sounds like a different config is used for CLI than for the web server.

When properly configured, it should print something like the following:

$ php -i | grep -i tidy
tidy
Tidy support => enabled
libTidy Version => 5.8.0
tidy.clean_output => 0 => 0
tidy.default_config => no value => no value

You will need to change the CLI config so that it loads tidy. Alternatively, you can pass the --ignore-platform-req=ext-tidy flag to Composer to pretend the extension is available, but then you would run into issues if you ever wanted to run Graby from the CLI.
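A quicker yes/no check than grepping php -i is to look at the module list from php -m. The sketch below assumes that list has one extension name per line; has_ext is a hypothetical helper name, not part of PHP or Composer.

```shell
# Hypothetical helper: read a module list (one name per line, as printed by
# `php -m`) on stdin and succeed only if the named extension is listed.
# -x matches whole lines, -i ignores case, -F treats the name literally.
has_ext() {
    grep -qixF -- "$1"
}
```

For example, php -m | has_ext tidy && echo "tidy is loaded in the CLI" would print the message only when the CLI configuration actually loads the extension.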

robertandrews commented 1 year ago

Thanks for the swift help.

With the override, it installed as below (lots of dependencies).

But I'm also struggling to get it working. With this...

require __DIR__ . '/vendor/autoload.php';
use Graby\Graby;

... and ...

$article = 'http://www.bbc.com/news/entertainment-arts-32547474';
$graby = new Graby();

$result = $graby->fetchContent($article);
var_dump($result->getStatus()); // 200
var_dump($result->getHtml()); // "[Fetched and readable content…]"
var_dump($result->getTitle()); // "Ben E King: R&B legend dies at 76"
var_dump($result->getLanguage()); // "en-GB"

... I get...

[11-Apr-2023 18:37:41 UTC] PHP Fatal error:  Uncaught Error: Call to a member function getStatus() on array in /Users/robert/Dropbox/Websites/context.local/wp-content/plugins/content-grabber/content-grabber.php:104
Stack trace:
#0 /Users/robert/Dropbox/Websites/context.local/wp-includes/class-wp-hook.php(308): grab_article('')
#1 /Users/robert/Dropbox/Websites/context.local/wp-includes/class-wp-hook.php(332): WP_Hook->apply_filters('', Array)
#2 /Users/robert/Dropbox/Websites/context.local/wp-includes/plugin.php(517): WP_Hook->do_action(Array)
#3 /Users/robert/Dropbox/Websites/context.local/wp-admin/admin-post.php(85): do_action('admin_post_grab...')
#4 {main}
  thrown in /Users/robert/Dropbox/Websites/context.local/wp-content/plugins/content-grabber/content-grabber.php on line 104

104 is the first getStatus line.

If I use...

$result = $graby->fetchContent($article);
if (is_object($result)) {
    var_dump($result->getStatus()); // 200
    var_dump($result->getHtml()); // "[Fetched and readable content…]"
    var_dump($result->getTitle()); // "Ben E King: R&B legend dies at 76"
    var_dump($result->getLanguage()); // "en-GB"
} else {
    var_dump($result);
}

... the else branch is the one taken; it dumps the full $result, which means $result is not an object.

robert@Roberts-iMac content-grabber % composer require j0k3r/graby php-http/guzzle7-adapter --ignore-platform-req=ext-tidy
Info from https://repo.packagist.org: #StandWithUkraine
./composer.json has been created
Running composer update j0k3r/graby php-http/guzzle7-adapter
Loading composer repositories with package information
Updating dependencies
Lock file operations: 33 installs, 0 updates, 0 removals
  - Locking clue/stream-filter (v1.6.0)
  - Locking fossar/htmlawed (1.3.1)
  - Locking guzzlehttp/guzzle (7.5.0)
  - Locking guzzlehttp/promises (1.5.2)
  - Locking guzzlehttp/psr7 (2.4.4)
  - Locking http-interop/http-factory-guzzle (1.2.0)
  - Locking j0k3r/graby (2.4.4)
  - Locking j0k3r/graby-site-config (1.0.165)
  - Locking j0k3r/httplug-ssrf-plugin (v2.0.2)
  - Locking j0k3r/php-readability (1.2.10)
  - Locking masterminds/html5 (2.7.6)
  - Locking monolog/monolog (2.9.1)
  - Locking php-http/client-common (2.6.0)
  - Locking php-http/discovery (1.15.3)
  - Locking php-http/guzzle7-adapter (1.0.0)
  - Locking php-http/httplug (2.3.0)
  - Locking php-http/message (1.13.0)
  - Locking php-http/message-factory (v1.0.2)
  - Locking php-http/promise (1.1.0)
  - Locking psr/http-client (1.0.2)
  - Locking psr/http-factory (1.0.2)
  - Locking psr/http-message (1.1)
  - Locking psr/log (1.1.4)
  - Locking ralouphie/getallheaders (3.0.3)
  - Locking simplepie/simplepie (1.8.0)
  - Locking smalot/pdfparser (v1.1.0)
  - Locking symfony/deprecation-contracts (v2.5.2)
  - Locking symfony/finder (v5.4.21)
  - Locking symfony/options-resolver (v5.4.21)
  - Locking symfony/polyfill-mbstring (v1.27.0)
  - Locking symfony/polyfill-php73 (v1.27.0)
  - Locking symfony/polyfill-php80 (v1.27.0)
  - Locking true/punycode (v2.1.1)
Writing lock file
Installing dependencies from lock file (including require-dev)
Package operations: 33 installs, 0 updates, 0 removals
  - Downloading php-http/discovery (1.15.3)
  - Downloading guzzlehttp/promises (1.5.2)
  - Downloading symfony/polyfill-mbstring (v1.27.0)
  - Downloading true/punycode (v2.1.1)
  - Downloading symfony/polyfill-php80 (v1.27.0)
  - Downloading symfony/polyfill-php73 (v1.27.0)
  - Downloading symfony/deprecation-contracts (v2.5.2)
  - Downloading symfony/options-resolver (v5.4.21)
  - Downloading smalot/pdfparser (v1.1.0)
  - Downloading simplepie/simplepie (1.8.0)
  - Downloading psr/http-message (1.1)
  - Downloading php-http/message-factory (v1.0.2)
  - Downloading clue/stream-filter (v1.6.0)
  - Downloading php-http/message (1.13.0)
  - Downloading psr/http-client (1.0.2)
  - Downloading php-http/promise (1.1.0)
  - Downloading php-http/httplug (2.3.0)
  - Downloading psr/http-factory (1.0.2)
  - Downloading php-http/client-common (2.6.0)
  - Downloading psr/log (1.1.4)
  - Downloading monolog/monolog (2.9.1)
  - Downloading masterminds/html5 (2.7.6)
  - Downloading j0k3r/php-readability (1.2.10)
  - Downloading j0k3r/httplug-ssrf-plugin (v2.0.2)
  - Downloading symfony/finder (v5.4.21)
  - Downloading j0k3r/graby-site-config (1.0.165)
  - Downloading ralouphie/getallheaders (3.0.3)
  - Downloading guzzlehttp/psr7 (2.4.4)
  - Downloading http-interop/http-factory-guzzle (1.2.0)
  - Downloading fossar/htmlawed (1.3.1)
  - Downloading j0k3r/graby (2.4.4)
  - Downloading guzzlehttp/guzzle (7.5.0)
  - Downloading php-http/guzzle7-adapter (1.0.0)
php-http/discovery contains a Composer plugin which is currently not in your allow-plugins config. See https://getcomposer.org/allow-plugins
Do you trust "php-http/discovery" to execute code and wish to enable it now? (writes "allow-plugins" to composer.json) [y,n,d,?] y
  - Installing php-http/discovery (1.15.3): Extracting archive
  - Installing guzzlehttp/promises (1.5.2): Extracting archive
  - Installing symfony/polyfill-mbstring (v1.27.0): Extracting archive
  - Installing true/punycode (v2.1.1): Extracting archive
  - Installing symfony/polyfill-php80 (v1.27.0): Extracting archive
  - Installing symfony/polyfill-php73 (v1.27.0): Extracting archive
  - Installing symfony/deprecation-contracts (v2.5.2): Extracting archive
  - Installing symfony/options-resolver (v5.4.21): Extracting archive
  - Installing smalot/pdfparser (v1.1.0): Extracting archive
  - Installing simplepie/simplepie (1.8.0): Extracting archive
  - Installing psr/http-message (1.1): Extracting archive
  - Installing php-http/message-factory (v1.0.2): Extracting archive
  - Installing clue/stream-filter (v1.6.0): Extracting archive
  - Installing php-http/message (1.13.0): Extracting archive
  - Installing psr/http-client (1.0.2): Extracting archive
  - Installing php-http/promise (1.1.0): Extracting archive
  - Installing php-http/httplug (2.3.0): Extracting archive
  - Installing psr/http-factory (1.0.2): Extracting archive
  - Installing php-http/client-common (2.6.0): Extracting archive
  - Installing psr/log (1.1.4): Extracting archive
  - Installing monolog/monolog (2.9.1): Extracting archive
  - Installing masterminds/html5 (2.7.6): Extracting archive
  - Installing j0k3r/php-readability (1.2.10): Extracting archive
  - Installing j0k3r/httplug-ssrf-plugin (v2.0.2): Extracting archive
  - Installing symfony/finder (v5.4.21): Extracting archive
  - Installing j0k3r/graby-site-config (1.0.165): Extracting archive
  - Installing ralouphie/getallheaders (3.0.3): Extracting archive
  - Installing guzzlehttp/psr7 (2.4.4): Extracting archive
  - Installing http-interop/http-factory-guzzle (1.2.0): Extracting archive
  - Installing fossar/htmlawed (1.3.1): Extracting archive
  - Installing j0k3r/graby (2.4.4): Extracting archive
  - Installing guzzlehttp/guzzle (7.5.0): Extracting archive
  - Installing php-http/guzzle7-adapter (1.0.0): Extracting archive
18 package suggestions were added by new dependencies, use `composer suggest` to see details.
Package true/punycode is abandoned, you should avoid using it. No replacement was suggested.
Generating autoload files
13 packages you are using are looking for funding.
Use the `composer fund` command to find out more!
No security vulnerability advisories found
Using version ^2.4 for j0k3r/graby
Using version ^1.0 for php-http/guzzle7-adapter
robert@Roberts-iMac content-grabber %

jtojnar commented 1 year ago

The README describes the master branch, which has a different API. As the error message shows, fetchContent() in 2.4.4 returns an array rather than an object. You need to look at https://github.com/j0k3r/graby/tree/2.4.4#how-to-use-it

robertandrews commented 1 year ago

Ah, so the version on master is newer and provides the get* methods? I think I saw you tell someone that the way to get that one is to clone the repo; is that right? By overwriting vendor/j0k3r/graby?

jtojnar commented 1 year ago

That would be one way, though rather than overwriting it, you would change composer.json to point to a local checkout in a different directory, so that your changes to the graby repo do not get accidentally overwritten when you run composer update:

composer config repositories.local '{"type": "path", "url": "../graby/"}' --file composer.json
composer config minimum-stability 'dev' --file composer.json
composer require 'j0k3r/graby @dev' --with-all-dependencies

Or you could just use composer require 'j0k3r/graby dev-master' php-http/guzzle7-adapter to install it from the upstream repo.

Though note that this is a development version and the API will likely change: #275

robertandrews commented 1 year ago

Solved.