Hi, awesome tool.
I'm trying to crawl a large number of pages (100,000), and the maximum number of pages I set is being exceeded: the crawl is currently at roughly 130,000 pages, because more than 100k pages match the URL glob (/**) I put in the conditions.
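For context, my setup looks roughly like this (URLs are placeholders, and I'm assuming the standard config.ts shape; apologies if I've misremembered a field name):

```ts
import { Config } from "./src/config";

export const defaultConfig: Config = {
  url: "https://example.com/docs",    // placeholder start URL
  match: "https://example.com/**",    // glob that matches well over 100k pages
  maxPagesToCrawl: 100000,            // the limit that is being exceeded
  outputFileName: "output.json",
};
```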
Is there a feature to stop the crawling process without losing the pages already crawled?
Also, what alternatives exist for training any model on such a large dataset? Assume that the finished output will certainly exceed the file limits of OpenAI's Assistants API and the custom GPT builder.
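One workaround I've been considering, in case it helps frame the question: splitting the output into multiple files, each under the per-file upload limit. A minimal sketch, assuming output.json is a JSON array of page records and that the limit I care about is per-file size (the 50 MB figure is a placeholder):

```ts
// Split a large JSON array of crawled pages into size-capped chunk files.
import { readFileSync, writeFileSync } from "fs";

const MAX_BYTES = 50 * 1024 * 1024; // assumed per-file upload limit; adjust as needed
const pages: unknown[] = JSON.parse(readFileSync("output.json", "utf8"));

let chunk: unknown[] = [];
let size = 0;
let part = 1;

for (const page of pages) {
  const bytes = Buffer.byteLength(JSON.stringify(page), "utf8");
  // Flush the current chunk before it would exceed the cap.
  if (size + bytes > MAX_BYTES && chunk.length > 0) {
    writeFileSync(`output-${part}.json`, JSON.stringify(chunk));
    part++;
    chunk = [];
    size = 0;
  }
  chunk.push(page);
  size += bytes;
}
if (chunk.length > 0) {
  writeFileSync(`output-${part}.json`, JSON.stringify(chunk));
}
```

If there's a better-supported path (fine-tuning, a vector store, etc.), I'd appreciate a pointer.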
Thanks.