temoto / robotstxt

The robots.txt exclusion protocol implementation for Go language
MIT License

Fixes #6 #7

Closed mna closed 11 years ago

mna commented 11 years ago

This is a fairly big pull request. As documented in issue #6, it fixes how the library interprets groups of rules, following Google's published description of how it interprets robots.txt files.

I went ahead and made the parser and scanner private, since the parser's implementation changed a lot anyway (the scanner is unchanged except for its visibility). I also removed the error return value from the Test[Agent] functions, since it was of little use, as mentioned in my earlier comment.

It adds support and features for:

I didn't squash my commits, so please feel free to get back to me if you'd like me to tidy things up a little before merging. The same goes for any issue you may have with my code: I have no problem putting more work into this so that it fits your style and vision. All tests are green.

Thanks!

temoto commented 11 years ago

I messed this up by merging pull request 8. I wanted to try something out manually and did not expect GitHub to implicitly close that pull request. I'm closing this one because it can't be merged automatically anyway. I'm looking closely at your repo and am ready to cherry-pick.