Closed niels closed 8 years ago
FYI, I just verified that the "extension" is indeed looked for in the entire URL, including the host.
For example, if one wanted to block .com files, one would often also block .com domains (unless the path following that domain included a dot).
Fixed in #4.
I modified it slightly to better support non-URL references, since the Collector Core library is generic (not specific to the web).
Thanks for your contribution!
The current implementation of acceptReference will consider the string following the last dot anywhere in the path (or perhaps even the full URL?) to be the file extension.
E.g. given a URL such as https://herimedia.com/norconex-test/this.is.not.a.file/test, the file extension is detected to be file/test.
I believe the correct implementation would only try to find a file extension within the last path segment, e.g. only within test.
If no one else claims this ticket, I will try to submit a patch late next week. As such, this serves as a reminder to myself :)
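To make the proposal concrete, here is a minimal sketch of extension detection restricted to the last path segment. It is not the library's code and not the patch that was eventually merged: the class name ExtensionSketch and the method extensionOf are hypothetical, and falling back to the raw string when the reference does not parse as a URI is my own assumption about how non-URL references might be handled.

```java
import java.net.URI;

// Hypothetical illustration only; not part of Norconex Collector Core.
final class ExtensionSketch {

    // Returns the text after the last dot of the last path segment,
    // or an empty string when that segment contains no dot.
    static String extensionOf(String reference) {
        String path;
        try {
            // For URLs, keep only the path: host and query string are dropped.
            String parsed = URI.create(reference).getPath();
            // getPath() is null for opaque URIs; assumed fallback: raw string.
            path = (parsed != null) ? parsed : reference;
        } catch (IllegalArgumentException e) {
            // Not a valid URI (e.g. a generic, non-web reference).
            path = reference;
        }
        // Everything after the last slash is the last path segment.
        String segment = path.substring(path.lastIndexOf('/') + 1);
        int dot = segment.lastIndexOf('.');
        return (dot < 0) ? "" : segment.substring(dot + 1);
    }

    public static void main(String[] args) {
        String[] samples = {
            "https://herimedia.com/norconex-test/this.is.not.a.file/test",
            "https://example.com/",
            "https://example.com/files/setup.com"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> [" + extensionOf(sample) + "]");
        }
    }
}
```

With this approach the URL from the report yields no extension, a bare .com host is not mistaken for a .com file, and a path ending in setup.com still yields com.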