mendableai / firecrawl

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
https://firecrawl.dev
GNU Affero General Public License v3.0
19.26k stars 1.5k forks

feat(crawl): Similar URL deduplication #878

Closed by mogery 2 weeks ago

mogery commented 3 weeks ago

This PR adds some important features:

nickscamara commented 2 weeks ago

Worth adding tests for the dedup stuff.