BuilderIO / gpt-crawler

Crawl a site to generate knowledge files to create your own custom GPT from a URL
https://www.builder.io/blog/custom-gpt
ISC License
18.59k stars 1.97k forks

[FR] Exclude a list of urls #78

Open Snowzer91 opened 10 months ago

Snowzer91 commented 10 months ago

Can you add a feature to ignore crawling for a given list of urls?

Sample:

  match: [
    "https://www.builder.io/",
  ],
  exclude: [
    "https://www.builder.io/blog/",
  ],

miticojo commented 8 months ago

It should work this way:

  match: [
    "https://www.builder.io/**",
  ],
  exclude: ["**/blog/**"],
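To illustrate how glob patterns like the ones above would separate included from excluded URLs, here is a minimal sketch. `globToRegExp` and `shouldCrawl` are hypothetical helpers written for this example; they are not part of gpt-crawler, and the crawler's real glob matching may treat edge cases differently. The sketch assumes `**` matches any characters including `/`, and `*` matches any characters except `/`.

```typescript
// Hypothetical helper (not gpt-crawler's code): turn a simple glob into a RegExp.
// Assumption: `**` matches anything, including "/"; `*` matches anything except "/".
function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters, leaving `*` for the glob handling below.
  const escaped = glob.replace(/[.+^${}()|[\]\\?]/g, "\\$&");
  const pattern = escaped
    .replace(/\*\*/g, "\u0000") // placeholder so `**` is not split into two `*`
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp(`^${pattern}$`);
}

// A URL is crawled if it matches at least one `match` glob and no `exclude` glob.
function shouldCrawl(url: string, match: string[], exclude: string[]): boolean {
  const matchesAny = (globs: string[]): boolean =>
    globs.some((g) => globToRegExp(g).test(url));
  return matchesAny(match) && !matchesAny(exclude);
}

const match = ["https://www.builder.io/**"];
const exclude = ["**/blog/**"];

console.log(shouldCrawl("https://www.builder.io/docs/intro", match, exclude));
console.log(shouldCrawl("https://www.builder.io/blog/custom-gpt", match, exclude));
```

Under these assumptions the first URL is kept and the second is skipped, because only the second contains a `/blog/` path segment. Note that the patterns need the `**` wildcards: a literal entry such as `"https://www.builder.io/blog/"` would match only that exact URL in this sketch.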