Closed alexander-bauer closed 11 years ago
Hello. Thanks for your interest in this package. And thank you for providing a code snippet to reproduce the problem.
Now if you add println("Response:"); resp.Write(os.Stdout)
after http.Get and err check, you will see that response is actually 200 OK
, this is not redirect. http.Get followed redirect silently and downloaded the final page.
Maybe FindGroup returning nil is a problem indeed. What would you expect?
Having browsed the code slightly, it looks as if FromBytes()
returns a nil under certain circumstances when it fails to parse. I believe that, given these circumstances, (no error message, but an invalid file,) one of two things should happen: either FromResponse()
should return nil, err
, where err is caught by the if err != nil {}
line, or it should return allowAll, err
.
If the former approach is taken, then allowAll
and disallowAll
should be made public; that would simplify the error handling.
//...
//Let's say this returns (nil, err)
robots, err := robotstxt.FromResponse(resp)
if err != nil {
return robotstxt.AllowAll
}
return robots
Complexity here is that robots.txt is defined pretty loosely, i mean many arbitrary texts can be considered a valid robots.txt file with no rules. I will look into this particular file to determine why it does not return an error.
If you are in big hurry, FromString("") is a shortcut for AllowAll, but yeah, i guess they could be made public.
I'm sorry, i don't have much spare time right now, it may wait until weekend.
@SashaCrofter please see this problem is fixed now. I understand this is not how you wanted it to be fixed, but this is required to be consistent with Google specification. robotstxt.go is compatible with Google's approach since issue #6 "invalid content = no errors, everything is allowed".
I've been working on a project which makes use of this library. I thank you greatly for its usefulness. However, I've been encountering a problem. example.com/robots.txt is, as it turns out, a redirect. Rather than throwing an error, FromResponse() seems to produce an object from which FindGroup().Test() causes a runtime panic.
To illustrate: