Open LeMoussel opened 7 years ago
At the moment, the rules
Disallow: /Example/*
and
Disallow: /Example/
are transformed into the regexps
/^\/Example\/.*/
and
/^\/Example\//
which match exactly the same paths, because neither pattern is anchored at the end of the string.
So the rules
Disallow: /Example/*
and
Disallow: /Example/
are equivalent.
I am not a specialist in regexps, but is the behavior of /^\/Example\/.*/ any different from that of /^\/Example\//?
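One way to check this is to run both patterns against a few sample paths. The following is a minimal Python sketch, not code from the parser itself: the sample paths and the helper name `is_blocked` are made up for illustration, and it assumes the patterns are applied without any end-of-string anchor, as the regexps above suggest.

```python
import re

# The two patterns quoted above, written as Python regexes.
with_star = re.compile(r"^/Example/.*")
without_star = re.compile(r"^/Example/")

# Hypothetical sample paths, chosen only for illustration.
paths = ["/Example/", "/Example/page.html", "/Example", "/Examples/", "/other/Example/"]

def is_blocked(pattern, path):
    # search() succeeds as soon as the prefix matches; nothing anchors the end.
    return pattern.search(path) is not None

# One (path, result with star, result without star) tuple per sample path.
results = [(p, is_blocked(with_star, p), is_blocked(without_star, p)) for p in paths]
for path, a, b in results:
    print(f"{path!r}: with star={a}, without star={b}")
```

Under these assumptions both patterns give the same answer for every path, since a trailing `.*` can also match the empty string.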
In my view, for the sake of clarity, it would be preferable to remove the asterisk (*) at the end of a path, so that both rules produce the same regexp.
It would also be a good idea to log a message about the redundant asterisk at the end of the path (for user information only).
^/Example
^/Example.*
Both of these regexps work identically. We can do it, but it will have no effect. We can log a message for user information. See also https://github.com/bopoda/robots-txt-parser/issues/9.
Search engines allow an asterisk (*) to match any sequence of characters, and a dollar sign ($) to match the end of the URL. So, to block spiders from downloading any JPEG image files, one might use:
Indeed, blocking spidering of certain file types is the most popular use for wildcards. Most people who are using wildcards for anything else are doing so entirely unnecessarily. For example, a lot of sites have the following rule:
The use of the non-standard wildcard above is useless, as this rule is equivalent to:
This is because rules are by default partial paths, and will match any path beginning with that string.
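The wildcard semantics quoted above can be sketched as a small translation function. This is an illustrative Python sketch under stated assumptions, not the parser's actual implementation: the names `rule_to_regex` and `matches`, and the sample rules and paths in the usage note, are hypothetical.

```python
import re

def rule_to_regex(rule_path):
    """Translate a robots.txt path rule into a compiled regex (hypothetical helper).

    '*' matches any sequence of characters; a trailing '$' anchors the end
    of the URL. Without '$', the rule is a plain prefix match.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape each literal piece and rejoin the pieces with '.*' per wildcard.
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def matches(rule_path, url_path):
    # True when the rule applies to the given URL path.
    return rule_to_regex(rule_path).search(url_path) is not None
```

With this sketch, `matches("/Example/", "/Example/page.html")` and `matches("/Example/*", "/Example/page.html")` both hold, while a sample rule such as `/*.jpg$` applies to `/images/photo.jpg` but not to `/images/photo.jpg?size=large`.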
The feature would be to remove the asterisk (*) at the end of a path and log a message to report it.
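The proposed feature could look roughly like the following Python sketch. It is only an illustration of the idea, not the library's code: the function name `strip_trailing_asterisks` and the logger name are hypothetical, and leaving a bare `*` untouched is an assumption made here so that the rule never collapses into an empty path.

```python
import logging

logger = logging.getLogger("robots_txt_sketch")  # hypothetical logger name

def strip_trailing_asterisks(rule_path):
    """Drop redundant trailing '*' from a rule path and log a notice."""
    stripped = rule_path.rstrip("*")
    if stripped == rule_path:
        # Nothing to remove (this includes rules ending in '$').
        return rule_path
    if stripped == "":
        # Assumption: keep a bare '*' as-is rather than produce an empty path.
        return rule_path
    logger.info(
        "Redundant trailing asterisk in rule %r; treating it as %r",
        rule_path,
        stripped,
    )
    return stripped
```

For example, `strip_trailing_asterisks("/Example/*")` returns `/Example/`, so both forms of the rule end up producing the same regexp.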