BuilderIO / gpt-crawler

Crawl a site to generate knowledge files to create your own custom GPT from a URL
https://www.builder.io/blog/custom-gpt
ISC License
18.17k stars, 1.89k forks

Wildcard support #7

Open evanlemmons opened 7 months ago

evanlemmons commented 7 months ago

I've noticed you can't currently use a regex when defining URLs. Is there some other way to match different subdomains or use wildcard characters?

For example, I want to crawl multiple subdomains that follow a similar structure: https://pco[NAME].zendesk.com/. I was thinking I could change the match field to accept a string array, but I still couldn't use a regex to wildcard the [NAME] piece of the subdomain. Is there some other way to achieve this?

steve8708 commented 7 months ago

I like the idea of supporting a glob or regex. Open to PRs!
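
For anyone picking this up, here is a rough sketch of how the two ideas in this thread (a string array for the match field, plus a wildcard for the [NAME] piece) could fit together. It is not taken from the gpt-crawler codebase: globToRegExp, matchesAny, and examplePatterns are hypothetical names, and the glob rules (a single * stays within one path segment, ** crosses slashes) are an assumption rather than the behavior of any particular glob library.

```typescript
// Hypothetical helper, not part of gpt-crawler: turn a simple glob into a RegExp.
// Assumed rules: "**" matches any characters, "*" matches any characters except "/".
function globToRegExp(glob: string): RegExp {
  let source = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch === "*") {
      if (glob.charAt(i + 1) === "*") {
        source += ".*"; // "**" may cross "/" boundaries
        i++; // skip the second "*"
      } else {
        source += "[^/]*"; // "*" stays within one segment (host or path part)
      }
    } else {
      // Escape characters that have a special meaning in a regular expression.
      source += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  // Anchor both ends so the whole URL has to match the pattern.
  return new RegExp(`^${source}$`);
}

// Hypothetical helper: accept one pattern or an array of patterns,
// mirroring the "string array" idea for the match field.
function matchesAny(url: string, patterns: string | string[]): boolean {
  const list = Array.isArray(patterns) ? patterns : [patterns];
  return list.some((pattern) => globToRegExp(pattern).test(url));
}

// Example: wildcard the [NAME] piece of the Zendesk subdomain.
const examplePatterns = ["https://pco*.zendesk.com/**"];
console.log(matchesAny("https://pcofoo.zendesk.com/hc/en-us", examplePatterns));
console.log(matchesAny("https://other.zendesk.com/hc/en-us", examplePatterns));
```

Note that under these assumed rules * can still match dots, so pco*.zendesk.com would also accept a nested subdomain such as pcoa.b.zendesk.com. A real PR would probably lean on an established glob library instead of hand-rolling the conversion.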