crwlrsoft / crawler

Library for Rapid (Web) Crawler and Scraper Development
https://www.crwlr.software/packages/crawler
MIT License
312 stars, 11 forks

Ignore special non-HTTP links #112

Closed (otsch closed this 1 year ago)

otsch commented 1 year ago

The Http::crawl() step and the Html::getLink() and Html::getLinks() steps now ignore links whose href attribute starts with mailto:, tel:, or javascript:.
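
For illustration, the prefix check described above could look roughly like the sketch below. It is written in Python only to show the logic in isolation; it is not the library's PHP code. The names `IGNORED_PREFIXES`, `is_special_non_http_link` and `filter_links` are hypothetical. Trimming whitespace and ignoring letter case are assumptions, since the description only says the href "starts with" one of these prefixes.

```python
# Hypothetical sketch of the filtering rule; not the library's actual code.
IGNORED_PREFIXES = ("mailto:", "tel:", "javascript:")


def is_special_non_http_link(href: str) -> bool:
    """Return True if the href starts with one of the ignored prefixes."""
    # Assumption: surrounding whitespace and letter case do not matter.
    return href.strip().lower().startswith(IGNORED_PREFIXES)


def filter_links(hrefs: list[str]) -> list[str]:
    """Keep only the hrefs that a crawler should try to follow."""
    return [href for href in hrefs if not is_special_non_http_link(href)]


if __name__ == "__main__":
    found = [
        "https://example.com/a",
        "mailto:someone@example.com",
        "tel:+43123456",
        "javascript:void(0)",
        "/contact",
    ]
    print(filter_links(found))
```

Relative links such as `/contact` pass through unchanged, because only the three listed prefixes are matched.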