rushter / selectolax

Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors).
MIT License
1.09k stars 66 forks

examples for other functions? #120

Closed vash-knives closed 1 week ago

vash-knives commented 2 months ago

Hi there,

Can we get examples of when and how to use .next and .prev? Say a div contains two a tags that do not have any classes, with text in between them. What's the best way to grab each piece?

e.g.

<td class="no_cell you_lite">
    <a href="https://scrapethis.com/index.php?thread=98800"><img src="https://scrapethis.com/the_post.gif" alt="My post" title="My post"></a>
    Sun, 16 June 2024, 20:51:11
    <br>
    by <a href="https://scrapethis.com">Brand</a>
</td>

Thanks

rushter commented 2 months ago

You can get away with CSS selectors here:

from selectolax.parser import HTMLParser

parser = HTMLParser(html)  # html holds the page source as a string
for item in parser.css('td.no_cell a'):
    print(item.text(), item.attributes)

vash-knives commented 2 months ago

@rushter, nice, that worked. What's the best way to grab the text on the same level (Sun, 16 June 2024, 20:51:11)? I only see it when I do ...text(deep=True)

rushter commented 2 months ago

Just call item.next.text() on the first a tag you got from the CSS selector. .next is that tag's next sibling node, which here is the text node holding the date.