spatie / crawler

An easy to use, powerful crawler implemented in PHP. Can execute Javascript.
https://freek.dev/308-building-a-crawler-in-php
MIT License
2.51k stars, 357 forks

feat: custom link parser #458

Closed Velka-DEV closed 8 months ago

Velka-DEV commented 8 months ago

Feature Overview

Purpose

This update addresses the limitation that only <a href="..."> links are discoverable. Until now, the crawler had no support for parsing other link sources such as sitemaps and iframes.
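
To make the gap concrete, here is a minimal sketch of what discovering URLs from a sitemap involves, as opposed to reading <a href> attributes. It is not the code from this PR and does not use the crawler's API: `extractSitemapLocations` is a hypothetical helper built only on PHP's SimpleXML extension, and the example.com URLs are placeholders.

```php
<?php

// Hypothetical helper for illustration only; not part of spatie/crawler.
// Returns every <loc> value found in a sitemap (or sitemap index) document.
function extractSitemapLocations(string $xml): array
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    libxml_use_internal_errors(true);

    $document = simplexml_load_string($xml);
    if ($document === false) {
        return []; // Malformed XML: nothing to discover.
    }

    // Sitemaps declare a default namespace, so XPath needs an explicit prefix.
    $document->registerXPathNamespace('sm', 'http://www.sitemaps.org/schemas/sitemap/0.9');

    $urls = [];
    foreach ($document->xpath('//sm:loc') ?: [] as $loc) {
        $urls[] = trim((string) $loc);
    }

    return $urls;
}

// Placeholder sitemap; a real one would be fetched over HTTP.
$sitemap = <<<'XML'
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>
XML;

print_r(extractSitemapLocations($sitemap));
```

The same XPath query also matches the <loc> entries of a sitemap index, but fetching and parsing the child sitemaps those entries point to would be a separate step that this sketch leaves out.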

Modifications

Testing

freekmurze commented 8 months ago

Very nice! Could you also document how to crawl a sitemap using the new functionality? A clear example will be helpful for most users 👍

Velka-DEV commented 8 months ago

I've added a quick guide on using SitemapUrlParser for sitemap crawling in the docs. I also added some tests for sitemap indexes.

Also, Happy New Year 🎉!

freekmurze commented 8 months ago

🥳 Thanks, happy new year to you too!

freekmurze commented 8 months ago

Thanks!