c4software / python-sitemap

Mini website crawler to make sitemap from a website.
GNU General Public License v3.0

More robust link regex #24

Garrett-R closed 7 years ago

Garrett-R commented 7 years ago

This is a very minor change, but it allows line breaks inside tags (which is valid HTML).

For example, this will now be matched properly:

<a href="www.example.com"
> hello </a>
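
To illustrate the kind of change described above, here is a minimal sketch in Python. Both patterns are hypothetical stand-ins chosen for this example, not necessarily the project's actual regex. The idea is that `.` does not match a newline by default, while a negated character class such as `[^>]` does, so the second pattern tolerates a line break before the closing `>`.

```python
import re

# Hypothetical patterns for illustration only; the project's real regex may differ.
# "." does not match "\n" by default, so this pattern fails when the tag spans lines.
STRICT_LINK = re.compile(r'<a [^>]*href=[\'"](.*?)[\'"].*?>')

# "[^>]" matches any character except ">", newlines included.
ROBUST_LINK = re.compile(r'<a [^>]*href=[\'"](.*?)[\'"][^>]*?>')

# The example from the comment: the opening tag is closed on the next line.
html = '<a href="www.example.com"\n> hello </a>'

strict_matches = STRICT_LINK.findall(html)
robust_matches = ROBUST_LINK.findall(html)

print(strict_matches)
print(robust_matches)
```

An alternative would be to compile the first pattern with `re.DOTALL`, but then `.*?` could run across several tags in some inputs, which is why a negated character class is often the safer choice.
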
c4software commented 7 years ago

Nice catch. Thanks for the modification.