crwlrsoft / crawler

Library for Rapid (Web) Crawler and Scraper Development
https://www.crwlr.software/packages/crawler
MIT License
312 stars 11 forks

Fix reading input sitemap in HTTP crawl step #111

Closed otsch closed 1 year ago

otsch commented 1 year ago

The Http::crawl() step now also works with sitemaps as the input URL when the <urlset> tag contains attributes that would otherwise cause the Symfony DomCrawler to not find any elements.
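The description does not say which attributes are involved. One common case, assumed here, is a default `xmlns` declaration on `<urlset>`, which puts every child element into a namespace so that plain tag-name lookups match nothing. The sketch below reproduces that general effect with Python's standard library only; it is an analogy, not this library's PHP code or the Symfony DomCrawler, and `SITEMAP`, `local_name` and `sitemap_urls` are hypothetical names made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap; the xmlns attribute on <urlset> is the assumed culprit.
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/about</loc></url>
</urlset>"""


def local_name(tag):
    """Strip a leading '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def sitemap_urls(xml_text):
    """Collect all <loc> values, ignoring whatever namespace they are in."""
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.iter()
        if local_name(el.tag) == "loc" and el.text
    ]


root = ET.fromstring(SITEMAP)

# A plain tag-name lookup finds nothing, because the elements are namespaced.
naive_matches = root.findall("url")

# Matching on the local name works regardless of the namespace declaration.
urls = sitemap_urls(SITEMAP)

print(len(naive_matches))
print(urls)
```

Matching on local names is only one possible workaround; stripping the attributes from `<urlset>` before parsing, or registering the namespace for lookups, would be alternatives. Which approach this fix actually takes is not stated in the description.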