spekulatius / PHPScraper

A universal web-util for PHP.
https://phpscraper.de
GNU General Public License v3.0

How to get all elements with a class? #73

Closed: Kkiomen closed this issue 1 year ago

Kkiomen commented 2 years ago

How can I get all elements with a given class, for example "card_header_f3"?

gcijuentes commented 2 years ago

I have the same question :(

spekulatius commented 2 years ago

Hello @Kkiomen and @gcijuentes,

Sorry for the late reply.

Have you tried the filterXPath method? It should let you filter by any class name using an XPath expression like $myClassElements = $web->filterXPath("//[@class='my-class']");

Cheers, Peter

robertgarrigos commented 1 year ago

While trying it, I'm getting this error:

Call to undefined method spekulatius\core::filterXPath()

I just installed PHPScraper (0.6.2) with Composer, and the first example of getting a website's title worked fine.

What am I missing?

spekulatius commented 1 year ago

Hey @robertgarrigos

Oh sorry, I mixed up the naming with the underlying package. It's filter instead of filterXPath. filterXPath is used in the DOM crawler package: https://github.com/symfony/dom-crawler/blob/8cb4c6e6c8d30c26f70529ed5e50d79a09576c0c/Crawler.php#L686

Please try again with filter. CC @Kkiomen and @gcijuentes

Cheers, Peter

robertgarrigos commented 1 year ago

Still not working:

Warning: DOMXPath::query(): Invalid expression in /app/vendor/symfony/dom-crawler/Crawler.php on line 1013

Fatal error: Uncaught InvalidArgumentException: Expecting a DOMNodeList or DOMNode instance, an array, a string, or null, but got "bool". in /app/vendor/symfony/dom-crawler/Crawler.php:145
Stack trace:
#0 /app/vendor/symfony/dom-crawler/Crawler.php(1013): Symfony\Component\DomCrawler\Crawler->add(false)
#1 /app/vendor/symfony/dom-crawler/Crawler.php(771): Symfony\Component\DomCrawler\Crawler->filterRelativeXPath('descendant-or-s...')
#2 /app/vendor/spekulatius/phpscraper/src/phpscraper.php(165): Symfony\Component\DomCrawler\Crawler->filterXPath('descendant-or-s...')
#3 /app/vendor/spekulatius/phpscraper/src/phpscraper.php(60): spekulatius\core->filter('//[@class='pros...')
#4 /app/phpscraper.php(11): spekulatius\phpscraper->__call('filter', Array)
#5 {main}
  thrown in /app/vendor/symfony/dom-crawler/Crawler.php on line 145

spekulatius commented 1 year ago

Can you share the URL and query?

robertgarrigos commented 1 year ago

require __DIR__ . '/vendor/autoload.php';

$web = new \spekulatius\phpscraper;

$web->go('https://www.lieder.net/lieder/get_settings.html?ComposerId=2520');

// print_r($web->title);

$myClassElements = $web->filter("//[@class='prose']");

print_r($myClassElements);

spekulatius commented 1 year ago

Hey @robertgarrigos

I can replicate the problem. The error comes from the DOM crawler, not PHPScraper itself: the XPath has no node test after the //, so it is not a valid expression. Adding * to match any element fixes it:

$myClassElements = $web->filter("//*[@class='prose']");

With ->text() you should get the text of the matched node, including its sub-nodes:

$myClassElements = $web->filter("//*[@class='prose']")->text();

I've also tried other built-in PHPScraper selectors, and they worked. For example, $web->lists returns the lists as expected.
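
To see the difference outside of PHPScraper, here is a minimal sketch that uses only PHP's built-in DOM extension (DOMDocument and DOMXPath), not PHPScraper's own API. The HTML snippet and variable names are made up for illustration. It shows that the expression without a node test makes query() return false (the "bool" in the error message), that //*[@class='prose'] compares the whole attribute value, and one common XPath 1.0 idiom for matching a single class name among several.

```php
<?php
// Illustrative sketch: plain DOMXPath, hypothetical sample HTML.
$html = <<<HTML
<!DOCTYPE html>
<html><body>
  <div class="prose">First</div>
  <p class="prose wide">Second</p>
  <span class="other">Third</span>
</body></html>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// "//" must be followed by a node test. Without one the expression is
// invalid and query() returns false; "@" only silences the warning here.
$invalid = @$xpath->query("//[@class='prose']");

// "*" matches any element; @class='prose' compares the entire attribute
// value, so class="prose wide" is NOT matched.
$exact = $xpath->query("//*[@class='prose']");

// Match "prose" as one of several space-separated class names.
$token = $xpath->query(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' prose ')]"
);

var_dump($invalid);          // bool(false)
echo $exact->length, "\n";   // 1
echo $token->length, "\n";   // 2

foreach ($token as $node) {
    echo $node->nodeName, ': ', trim($node->textContent), "\n";
}
```

If the elements you are after carry more than one class, the exact-match form will silently return nothing, so the token-matching form is usually the safer choice.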

I hope this helps, Peter

spekulatius commented 1 year ago

Hey everyone,

I've added a page to document the way custom selectors can be used: https://phpscraper.de/examples/custom-selectors.html

There are also some new tests for this: https://github.com/spekulatius/PHPScraper/blob/master/tests/CustomSelectorTest.php

Please let me know if you think anything is missing.

Cheers, Peter