BuilderIO / gpt-crawler

Crawl a site to generate knowledge files to create your own custom GPT from a URL
https://www.builder.io/blog/custom-gpt
ISC License

feat: use Xpath as selector #45

Closed. LeonKohli closed this 7 months ago

LeonKohli commented 7 months ago

Updated getPageHtml function: getPageHtml now distinguishes XPath selectors from CSS selectors and evaluates XPath expressions correctly, fixing cases where the crawler previously failed to process XPath selectors.
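For illustration, here is a minimal sketch of how a selector might be classified before extraction. It is not the code from this PR: `classifySelector` and `SelectorKind` are hypothetical names, and the leading-slash rule is an assumed heuristic (it would miss XPath expressions that start with `(` or `.`). In the browser, the XPath branch would typically be evaluated with `document.evaluate` and the CSS branch with `document.querySelector`.

```typescript
// Hypothetical helper, not taken from the PR: decide how a selector string
// should be interpreted. Assumption: anything starting with "/" is XPath.
type SelectorKind = "xpath" | "css";

function classifySelector(selector: string): SelectorKind {
  const trimmed = selector.trim();
  // Absolute ("/html/body") and descendant ("//div") XPath both start with "/".
  // A valid CSS selector never starts with "/", so the check is unambiguous
  // for these two forms.
  return trimmed.startsWith("/") ? "xpath" : "css";
}
```

With this rule, a selector such as `//div[@id='main']` would take the XPath path, while `#main .content` would be handled as CSS.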

XPath handling in requestHandler: The request handler now distinguishes XPath selectors from CSS selectors, so the crawler waits for and handles elements located via XPath.
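The branching described here could look roughly like the sketch below. The names `SelectorWaiters` and `waitForConfiguredSelector` are made up for this example, and the two wait functions are injected so the snippet runs without a browser. In the real handler they would wrap the page's own waiting calls.

```typescript
// Hypothetical shape for the two waiting strategies; in a real crawler these
// would be backed by the browser page. Injected here to keep the sketch
// self-contained.
interface SelectorWaiters {
  waitForXPath: (xpath: string, timeoutMs: number) => Promise<void>;
  waitForCss: (css: string, timeoutMs: number) => Promise<void>;
}

function waitForConfiguredSelector(
  selector: string | undefined,
  waiters: SelectorWaiters,
  timeoutMs: number,
): Promise<void> {
  // No selector configured: nothing to wait for.
  if (!selector) return Promise.resolve();
  // Assumed heuristic: a leading "/" marks the selector as XPath.
  return selector.startsWith("/")
    ? waiters.waitForXPath(selector, timeoutMs)
    : waiters.waitForCss(selector, timeoutMs);
}
```

Keeping the decision in one place means the rest of the handler does not need to know which selector syntax the user configured.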

waitForXPath function: A new function that waits for an element matching an XPath expression, making the crawler more robust on pages whose content is loaded dynamically via JavaScript.
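The waiting idea can be sketched as a generic polling loop. This is an assumption-laden illustration rather than the PR's implementation: `waitUntil` is a hypothetical helper, and the predicate stands in for "does the XPath currently match an element?". With Playwright, a similar effect is available through `page.waitForFunction`, which re-evaluates a function inside the page until it returns a truthy value or the timeout expires.

```typescript
// Hypothetical polling helper: resolves once predicate() returns true,
// rejects if that does not happen within timeoutMs.
async function waitUntil(
  predicate: () => boolean,
  timeoutMs: number,
  intervalMs = 50,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    // Check first so an already-present element returns immediately.
    if (predicate()) return;
    if (Date.now() >= deadline) {
      throw new Error(`Timed out after ${timeoutMs} ms`);
    }
    // Sleep before the next check to avoid a busy loop.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Content injected by JavaScript after the initial load would be picked up on a later poll, which is the situation the description above is concerned with.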

steve8708 commented 7 months ago

Looks great, thanks @LeonKohli!

github-actions[bot] commented 7 months ago

:tada: This PR is included in version 1.0.0 :tada:

The release is available on:

Your semantic-release bot :package::rocket: