BuilderIO / gpt-crawler

Crawl a site to generate knowledge files to create your own custom GPT from a URL
https://www.builder.io/blog/custom-gpt
ISC License
18.59k stars, 1.97k forks

Adding pagination option #76

Closed: kanehooper closed this 10 months ago

kanehooper commented 10 months ago

This PR adds pagination to the crawler. It lets a user configure how many pages are crawled per pagination, and then saves each set of pages to a modified version of the output file.

Each output file name will have a number appended, which is the value of paginationCounter.
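
For illustration, the naming scheme could look roughly like the sketch below. It is not taken from the PR's diff: the helper name `paginatedFileName`, the `-` separator, and the choice to place the counter before the file extension are all assumptions. Only `paginationCounter` comes from the description above.

```typescript
// Hypothetical helper, not the PR's actual code.
// Assumptions: the counter is joined with "-" and goes before the extension.
function paginatedFileName(
  outputFileName: string,
  paginationCounter: number,
): string {
  // Find the start of the final path segment (handles "/" and "\" separators).
  const lastSlash = Math.max(
    outputFileName.lastIndexOf("/"),
    outputFileName.lastIndexOf("\\"),
  );
  const dot = outputFileName.lastIndexOf(".");

  // Treat the dot as an extension separator only if it sits inside the
  // final path segment and is not that segment's first character.
  if (dot > lastSlash + 1) {
    const base = outputFileName.slice(0, dot);
    const ext = outputFileName.slice(dot);
    return `${base}-${paginationCounter}${ext}`;
  }

  // No extension: append the counter at the end.
  return `${outputFileName}-${paginationCounter}`;
}
```

Under these assumptions, `paginatedFileName("output.json", 1)` gives `output-1.json`, and a name without an extension simply gets `-1` appended.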

This PR includes the following changes:

  1. Moved all calls to the write function into the crawl function so that it can handle pagination.
  2. Updated the filename generation within the write function to handle pagination.
  3. Added pagesPerPagination as an optional configuration option.
  4. Updated the CLI commands with the pagesPerPagination option.
  5. Updated the Readme file to reflect the new config option.
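
As a rough sketch of how an optional `pagesPerPagination` setting might split crawled pages into groups, consider the following. The `PaginationConfig` interface, the `chunkPages` function, and the fallback to a single group when the option is unset are assumptions made for illustration. The PR itself writes from inside the crawl function, whereas this sketch groups pages after the fact to keep the example self-contained.

```typescript
// Hypothetical config shape; only pagesPerPagination is named in the PR.
interface PaginationConfig {
  outputFileName: string;
  pagesPerPagination?: number; // optional, as described in change 3
}

// Hypothetical helper: split pages into groups of pagesPerPagination.
// Assumption: when the option is unset or not a positive integer,
// everything goes into a single group (i.e. no pagination).
function chunkPages<T>(pages: T[], config: PaginationConfig): T[][] {
  const size = config.pagesPerPagination;
  if (size === undefined || !Number.isInteger(size) || size <= 0) {
    return [pages];
  }

  const batches: T[][] = [];
  for (let i = 0; i < pages.length; i += size) {
    // slice clamps at the array end, so the last group may be shorter.
    batches.push(pages.slice(i, i + size));
  }
  return batches;
}

// Usage: five pages with pagesPerPagination = 2 yield groups of 2, 2, and 1.
const groups = chunkPages(["p1", "p2", "p3", "p4", "p5"], {
  outputFileName: "output.json",
  pagesPerPagination: 2,
});
console.log(groups.length); // 3
```

Each group would then be written to its own file, with the group's position serving as the appended counter.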