WebCuratorTool / webcurator

The root of the webcurator tool project, containing all modules needed to run a fully functional webcurator tool.
Apache License 2.0

Block URLs in profile (overrides) no longer support full regex language #79

Closed (hannakoppelaar closed 1 year ago)

hannakoppelaar commented 1 year ago

If you're using a block URL regex like this:

^[^/]+://[^/]*(google).*

WCT (as of version 3.1) will complain that "each star ( * ) in the url pattern must start with a dot", whereas Heritrix is totally fine with it, since it supports the full Perl regex language.

This validation is unnecessarily limiting and will cause problems for people using more advanced block patterns when they upgrade to 3.1.
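
To make the mismatch concrete, here is a minimal sketch in Java. It assumes the regex is evaluated with `java.util.regex` as a full-string match. The `everyStarFollowsDot` method is a hypothetical stand-in written to mirror the error message; it is not WCT's actual validation code, and the class name and sample URLs are made up for illustration.

```java
import java.util.regex.Pattern;

public class BlockPatternDemo {
    // The block pattern quoted above.
    static final String BLOCK_PATTERN = "^[^/]+://[^/]*(google).*";

    // Hypothetical stand-in for the 3.1 check (an assumption, not WCT code):
    // accept the pattern only if every '*' directly follows a '.'.
    static boolean everyStarFollowsDot(String pattern) {
        for (int i = 0; i < pattern.length(); i++) {
            if (pattern.charAt(i) == '*' && (i == 0 || pattern.charAt(i - 1) != '.')) {
                return false;
            }
        }
        return true;
    }

    // Full-string regex match, assumed here to be how a block rule is applied.
    static boolean isBlocked(String pattern, String url) {
        return Pattern.compile(pattern).matcher(url).matches();
    }

    public static void main(String[] args) {
        // The "[^/]*" quantifier is not preceded by a dot, so the naive check rejects it.
        System.out.println(everyStarFollowsDot(BLOCK_PATTERN)); // false

        // The regex itself compiles and behaves sensibly: it blocks hosts containing "google"...
        System.out.println(isBlocked(BLOCK_PATTERN, "https://www.google.com/search?q=wct")); // true

        // ...but not URLs that only mention "google" in the path.
        System.out.println(isBlocked(BLOCK_PATTERN, "https://example.org/google")); // false
    }
}
```

A looser validation, such as simply checking that `Pattern.compile` does not throw, would accept this pattern while still catching malformed input.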